emscripten-fastcomp - LLVM with the emscripten fastcomp javascript backend

Age	Commit message (Collapse)	Author
2013-10-30	Move global FlagSfi variables to common module	Petar Jovanovic
	When built as nexe, llc is configured and built for one arch only. Variables FlagSfiData, FlagSfiLoad, FlagSfiStore, FlagSfiStack, and FlagSfiBranch have to availabe for MIPS as well, so this change moves them from ARM-only code to common code. BUG= building pnacl-llc.nexe for MIPS fails TEST= build sandboxed tools for MIPS R=mseaborn@chromium.org Review URL: https://codereview.chromium.org/46193002
2013-09-06	Emit MachineMoves for ARM floating point callee-saved registers	Derek Schuff
	Currently when a function uses floating-point callee-saved registers, it does not emit unwind info for adjusting the CFA and showing the locations of the saved registers on the stack. This results in the unwinder getting a bad value for the return address when it attempts to unwind past the function's frame, which breaks gdb backtracing and exception handling unwinding. Add to the existing MachineMoves describing the CFA and register locations to handle the float registers BUG= https://code.google.com/p/nativeclient/issues/detail?id=3670 R=jvoung@chromium.org Review URL: https://codereview.chromium.org/23691041
2013-08-30	Revert some ARM byval localmods since byval+varargs are not in stable pexes.	Jan Voung
	Localmods came from: https://codereview.chromium.org/10825082/, and earlier. (1) The original change was so that byval parameters always go on the stack. That part was added because the original ARM code was buggy, and did not actually make a copy of the value, modifying the caller's struct (ouch!). (2) Then came a localmod to make all arguments following a byval go on the stack and to make the var-args code aware of that. This is so that arguments stay in the correct order for var-args to pick up. For (1) there has been some work upstream to make it work better. In any case, clang with --target=armv7a-...-gnueabi only used byval in some limited cases -- when the size of the struct is > 64 bytes where the backend will know that part of it could be in regs, and the rest can be memcpy'ed to the stack. For le32, clang will still generate byval without satisfying the same ARM condition (only for structs bigger than 64 bytes), so it could be very bad if we didn't have the ABI simpification passes rewrite the byval and try to let the ARM backend do things with byval... TEST=the GCC torture tests: va-arg-4.c, and 20030914-2.c and the example in issue 2746 still pass. BUG=none, cleanup R=dschuff@chromium.org Review URL: https://codereview.chromium.org/23691009
2013-08-19	Remove FlagNaClUseM23ArmAbi since M23 was a long time ago.	Jan Voung
	It was used to support old r9/TLS model: https://codereview.chromium.org/11345042/ BUG=none (cleanup) R=jfb@chromium.org Review URL: https://codereview.chromium.org/23135011
2013-07-22	Cherrypick upstream ARM FastISel ext patches	JF Bastien
	Specifically: r186489 - Fix ARMFastISel::ARMEmitIntExt shift emission r183794 - ARM FastISel fix sext/zext fold r183601 - Fix unused variable warning from my previous patch r183551 - ARM FastISel integer sext/zext improvements These should fix some failures that I had run into back then, as well as make ARM FastISel faster because it doesn't go to SelectionDAG. BUG= https://code.google.com/p/nativeclient/issues/detail?id=3501 R=jvoung@chromium.org TEST= make check-all Review URL: https://codereview.chromium.org/19992002
2013-07-19	Clean up a debug printout left in by mistake during the 3.3 merge	Eli Bendersky
	BUG=None R=dschuff@chromium.org Review URL: https://codereview.chromium.org/19472003
2013-07-17	Fixed constpool pattern matching problem that made the crtbeginS build	Eli Bendersky
	on ARM fail
2013-07-17	Remove unnecessary debug printout	Eli Bendersky

2013-07-16	Fixing make check errors...	Eli Bendersky

2013-07-16	Make it compile	Eli Bendersky

2013-07-15	Trying to get the thing to copmile...	Eli Bendersky

2013-07-15	Merge commit '7dfcb84fc16b3bf6b2379713b53090757f0a45f9'	Eli Bendersky
	Conflicts: docs/LangRef.rst include/llvm/CodeGen/CallingConvLower.h include/llvm/IRReader/IRReader.h include/llvm/Target/TargetMachine.h lib/CodeGen/CallingConvLower.cpp lib/IRReader/IRReader.cpp lib/IRReader/LLVMBuild.txt lib/IRReader/Makefile lib/LLVMBuild.txt lib/Makefile lib/Support/MemoryBuffer.cpp lib/Support/Unix/PathV2.inc lib/Target/ARM/ARMBaseInstrInfo.cpp lib/Target/ARM/ARMISelLowering.cpp lib/Target/ARM/ARMInstrInfo.td lib/Target/ARM/ARMSubtarget.cpp lib/Target/ARM/ARMTargetMachine.cpp lib/Target/Mips/CMakeLists.txt lib/Target/Mips/MipsDelaySlotFiller.cpp lib/Target/Mips/MipsISelLowering.cpp lib/Target/Mips/MipsInstrInfo.td lib/Target/Mips/MipsSubtarget.cpp lib/Target/Mips/MipsSubtarget.h lib/Target/X86/X86FastISel.cpp lib/Target/X86/X86ISelDAGToDAG.cpp lib/Target/X86/X86ISelLowering.cpp lib/Target/X86/X86InstrControl.td lib/Target/X86/X86InstrFormats.td lib/Transforms/IPO/ExtractGV.cpp lib/Transforms/InstCombine/InstCombineCompares.cpp lib/Transforms/Utils/SimplifyLibCalls.cpp test/CodeGen/X86/fast-isel-divrem.ll test/MC/ARM/data-in-code.ll tools/Makefile tools/llvm-extract/llvm-extract.cpp tools/llvm-link/CMakeLists.txt tools/opt/CMakeLists.txt tools/opt/LLVMBuild.txt tools/opt/Makefile tools/opt/opt.cpp
2013-07-11	Fix ARM paired GPR COPY lowering	JF Bastien
	ARM paired GPR COPY was being lowered to two MOVr without CC. This patch puts the CC back. I sent this patch upstream (with a test) but haven't received a review yet. This seems like a simple oversight in the code, and is holding my atomics patch so I'd like to get it into our repo. R=dschuff@chromium.org TEST= ./scons run_llvm_bitmanip_intrinsics_test platform=arm BUG= no CC on MOVr Review URL: https://codereview.chromium.org/18047006
2013-07-11	Sandbox ARM load/store exclusive dual	JF Bastien
	PNaCl's LLVM sandboxing wasn't correct for ARM load/store exclusive dual. I encountered this while running our testsuite with my atomic changes: the tests which use volatile 64-bit values started failing validation. R=dschuff@chromium.org BUG= validation failure TEST= ./pnacl/test.sh test-arm Review URL: https://codereview.chromium.org/18978015
2013-07-10	Fix ARM LDMIA MI.	JF Bastien
	The instruction that was generated for paired register stack slot load was wrong. This was fixed by Tim Northover in LLVM 3.3 commit 179977, but his patch does much more and doesn't apply as-is to our tree. I'll therefore punt applying the full patch to when we rebase to 3.3. I encountered the issue while working on atomics (64-bit atomics require paired registers on ARM), and saw the 3.3 fix when I tried upstreaming my fix. BUG= non TEST= ./pnacl/test.sh test-arm R=eliben@chromium.org Review URL: https://codereview.chromium.org/18699004
2013-06-25	Revert "Apply upstream r183551, r183601, r183624 and r183794"	Jan Voung
	Revert this until we fix i1 sext. Currently, it uses LSL and ASR, which are pseudo-instructions and get dropped on the floor when generating .o files. We'll fix that, but for now revert to green the bots. BUG=https://code.google.com/p/nativeclient/issues/detail?id=3501 R=jfb@chromium.org Review URL: https://codereview.chromium.org/17715002
2013-06-17	PNaCl: Turn on ABI verifier by default in sandboxed translator	Mark Seaborn
	Change pnacl-llc.cpp to enable the verifier. This causes two problems which we fix: * The ABI check for declared-but-not-defined functions fails in streaming mode. Fixing this would involve changing the bitcode reader. For now, disable this check when in streaming mode. Add a flag to PNaClABIVerifyModule. * ARM's GlobalMerge pass modifies functions' global variable references to use ConstantExprs that the ABI checker rejects. Address this by disabling GlobalMerge for now. GlobalMerge does not provide much benefit at the moment anyway, because, with the FlattenGlobals pass applied, GlobalMerge doesn't merge variables with alignment >1. BUG=https://code.google.com/p/nativeclient/issues/detail?id=3465 TEST=PNaCl toolchain trybots Review URL: https://codereview.chromium.org/17190002
2013-06-11	Apply upstream r183551, r183601, r183624 and r183794	JF Bastien
	Rename countTrailingZeros to the older CountTrailingZeros_32, mark as localmod. These patches fix correctness issues with ARM FastISel, and should make it faster while generating better code. BUG= none TEST= self R=jvoung@chromium.org Review URL: https://codereview.chromium.org/16712002
2013-05-31	Apply LLVM upstream: r182877 - Enable FastISel on ARM for Linux and NaCl	JF Bastien
	This also pulls in a TargetMachine.h change from r176986 and changes NaCl's intrinsics-bitmanip.ll test to account for register spills at O0. FastISel was only enabled for iOS ARM and Thumb2, this patch enables it for ARM (not Thumb2) on Linux and NaCl. Thumb2 support needs a bit more work, mainly around register class restrictions. The patch punts to SelectionDAG when doing TLS relocation on non-Darwin targets. I will fix this and other FastISel-to-SelectionDAG failures in a separate patch. The patch also forces FastISel to retain frame pointers: iOS always keeps them for backtracking (so emitted code won't change because of this), but Linux was getting much worse code that was incorrect when using big frames (such as test-suite's lencod). I'll also fix this in a later patch, it will probably require a peephole so that FastISel doesn't rematerialize frame pointers back-to-back. The test changes are straightforward, similar to: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130513/174279.html They also add a vararg test that got dropped in that change. I ran all of test-suite on A15 hardware with --optimize-option=-O0 and all the tests pass. R=dschuff@chromium.org, jvoung@chromium.org BUG= https://code.google.com/p/nativeclient/issues/detail?id=3120 Review URL: https://codereview.chromium.org/15671004
2013-05-29	Apply LLVM upstream: r182175 - Support unaligned load/store on more ARM targets	JF Bastien
	This patch matches GCC behavior: the code used to only allow unaligned load/store on ARM for v6+ Darwin, it will now allow unaligned load/store for v6+ Darwin as well as for v7+ on Linux and NaCl. The distinction is made because v6 doesn't guarantee support (but LLVM assumes that Apple controls hardware+kernel and therefore have conformant v6 CPUs), whereas v7 does provide this guarantee (and Linux/NaCl behave sanely). The patch keeps the -arm-strict-align command line option, and adds -arm-no-strict-align. They behave similarly to GCC's -mstrict-align and -mnostrict-align. I originally encountered this discrepancy in FastIsel tests which expect unaligned load/store generation. Overall this should slightly improve performance in most cases because of reduced I$ pressure. R=dschuff@chromium.org Review URL: https://codereview.chromium.org/15677005
2013-05-15	Merging r181842:	Bill Wendling
	------------------------------------------------------------------------ r181842 \| arnolds \| 2013-05-14 15:33:24 -0700 (Tue, 14 May 2013) \| 14 lines ARM ISel: Don't create illegal types during LowerMUL The transformation happening here is that we want to turn a "mul(ext(X), ext(X))" into a "vmull(X, X)", stripping off the extension. We have to make sure that X still has a valid vector type - possibly recreate an extension to a smaller type. In case of a extload of a memory type smaller than 64 bit we used create a ext(load()). The problem with doing this - instead of recreating an extload - is that an illegal type is exposed. This patch fixes this by creating extloads instead of ext(load()) sequences. Fixes PR15970. radar://13871383 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_33@181946 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-10	LLVM: Add ELF Note section to NaCl object files identifying them as such to gold	Derek Schuff
	This is needed to switch the native linker to one based on upstream binutils 2.23 R=mseaborn@chromium.org BUG= https://code.google.com/p/nativeclient/issues/detail?id=2971 also related to bug https://code.google.com/p/nativeclient/issues/detail?id=3424 Review URL: https://codereview.chromium.org/15067009
2013-05-08	Add dependency on NaClTransforms to lib/Target/ARM.	Jan Voung
	It depends on NaClTransforms for the denominator zero checks transform. BUG=https://code.google.com/p/nativeclient/issues/detail?id=2833 (fix build) R=dschuff@chromium.org Review URL: https://codereview.chromium.org/15067004
2013-05-08	Insert denominator zero checks for NaCl	David Sehr
	This IR pass for ARM inserts a comparison and a branch to trap if the denominator of a DIV or REM instruction is zero. This makes ARM fault identically to x86 in this case. BUG= https://code.google.com/p/nativeclient/issues/detail?id=2833 R=eliben@chromium.org Review URL: https://codereview.chromium.org/14607004
2013-05-05	ARM AnalyzeBranch should conservatively return true when it sees a predicated	Evan Cheng
	indirect branch at the end of the BB. Otherwise if-converter, branch folding pass may incorrectly update its successor info if it consider BB as fallthrough to the next BB. rdar://13782395 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181161 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-05	For ARM backend, fixed "byval" attribute support.	Stepan Dyatkovskiy
	Now even the small structures could be passed within byval (small enough to be stored in GPRs). In regression tests next function prototypes are checked: PR15293: %artz = type { i32 } define void @foo(%artz* byval %s) define void @foo2(%artz* byval %s, i32 %p, %artz* byval %s2) foo: "s" stored in R0 foo2: "s" stored in R0, "s2" stored in R2. Next AAPCS rules are checked: 5.5 Parameters Passing, C.4 and C.5, "ParamSize" is parameter size in 32bit words: -- NSAA != 0, NCRN < R4 and NCRN+ParamSize > R4. Parameter should be sent to the stack; NCRN := R4. -- NSAA != 0, and NCRN < R4, NCRN+ParamSize < R4. Parameter stored in GPRs; NCRN += ParamSize. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181148 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-05	Add ArrayRef constructor from None, and do the cleanups that this ↵	Dmitri Gribenko
	constructor enables Patch by Robert Wilhelm. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181138 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-03	Revert r181009.	Amara Emerson
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181079 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-03	Add support for reading ARM ELF build attributes.	Amara Emerson
	Build attribute sections can now be read if they exist via ELFObjectFile, and the llvm-readobj tool has been extended with an option to dump this information if requested. Regression tests are also included which exercise these features. Also update the docs with a fixed ARM ABI link and a new link to the Addenda which provides the build attributes specification. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181009 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-30	Text files should not be marked executable.	Rafael Espindola
	Patch by Oliver Pinter. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180797 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-30	s tightens up the encoding description for ARM post-indexed ldr ↵	Mihai Popa
	instructions. All instructions in this class have bit 4 cleared. It turns out that there is a test case for this, but it was marked XFAIL. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180778 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-30	Refactoring patch.	Stepan Dyatkovskiy
	1. VarArgStyleRegisters: functionality that emits "store" instructions for byval regs moved out into separated method "StoreByValRegs". Before this patch VarArgStyleRegisters had confused use-cases. It was used for both variadic functions and for regular functions with byval parameters. In last case it created new stack-frame and registered it as VarArg frame, that is wrong. This patch replaces VarArgsStyleRegisters usage for byval parameters with StoreByValRegs method. 2. In ARMMachineFunctionInfo, "get/setVarArgsRegSaveSize" was renamed to "get/setArgRegsSaveSize". By the same reason. Sometimes it was used for variadic functions, and sometimes for byval parameters in regular functions. Actually, this property means the size of registers, that keeps arguments, and thats why it was renamed. 3. In ARMISelLowering.cpp, ARMTargetLowering class, in methods computeRegArea and StoreByValRegs, VARegXXXXXX was renamed to ArgRegsXXXXXX still by the same reasons. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180774 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-26	ARM: Fix encoding of hint instruction for Thumb.	Quentin Colombet
	"hint" space for Thumb actually overlaps the encoding space of the CPS instruction. In actuality, hints can be defined as CPS instructions where imod and M bits are all nil. Handle decoding of permitted nop-compatible hints (i.e. nop, yield, wfi, wfe, sev) in DecodeT2CPSInstruction. This commit adds a proper diagnostic message for Imm0_4 and updates all tests. Patch by Mihail Popa <Mihail.Popa@arm.com>. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180617 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-26	ARM/NEON: Pattern match vector integer abs to vabs.	Benjamin Kramer
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180604 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-25	ARM cost model: Integer div and rem is lowered to a function call	Arnold Schwaighofer
	Reflect this in the cost model. I observed this in MiBench/consumer-lame. radar://13354716 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180576 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-23	Add more tests for r179925 to verify correct handling of signext/zeroext; ↵	Stephen Lin
	strengthen condition check to require actual MVT::i32 virtual register types, just in case (no actual functionality change) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180138 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-23	Lowercase "is" boolean variable prefix for consistency within function, no ↵	Stephen Lin
	functionality change. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180136 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-22	No really, don't store anything to this since it's unconditionally	Eric Christopher
	set below. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180015 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-22	Remove variable store that is never read.	Eric Christopher
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180014 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-22	Fix for 5.5 Parameter Passing --> Stage C:	Stepan Dyatkovskiy
	-- C.4 and C.5 statements, when NSAA is not equal to SP. -- C.1.cp statement for VA functions. Note: There are no VFP CPRCs in a variadic procedure. Before this patch "NSAA != 0" means "don't use GPRs anymore ". But there are some exceptions in AAPCS. 1. For non VA function: allocate all VFP regs for CPRC. When all VFPs are allocated CPRCs would be sent to stack, while non CPRCs may be still allocated in GRPs. 2. Check that for VA functions all params uses GPRs and then stack. No exceptions, no CPRCs here. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180011 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-21	Legalize vector truncates by parts rather than just splitting.	Jim Grosbach
	Rather than just splitting the input type and hoping for the best, apply a bit more cleverness. Just splitting the types until the source is legal often leads to an illegal result time, which is then widened and a scalarization step is introduced which leads to truly horrible code generation. With the loop vectorizer, these sorts of operations are much more common, and so it's worth extra effort to do them well. Add a legalization hook for the operands of a TRUNCATE node, which will be encountered after the result type has been legalized, but if the operand type is still illegal. If simple splitting of both types ends up with the result type of each half still being legal, just do that (v16i16 -> v16i8 on ARM, for example). If, however, that would result in an illegal result type (v8i32 -> v8i8 on ARM, for example), we can get more clever with power-two vectors. Specifically, split the input type, but also widen the result element size, then concatenate the halves and truncate again. For example on ARM, To perform a "%res = v8i8 trunc v8i32 %in" we transform to: %inlo = v4i32 extract_subvector %in, 0 %inhi = v4i32 extract_subvector %in, 4 %lo16 = v4i16 trunc v4i32 %inlo %hi16 = v4i16 trunc v4i32 %inhi %in16 = v8i16 concat_vectors v4i16 %lo16, v4i16 %hi16 %res = v8i8 trunc v8i16 %in16 This allows instruction selection to generate three VMOVN instructions instead of a sequences of moves, stores and loads. Update the ARMTargetTransformInfo to take this improved legalization into account. Consider the simplified IR: define <16 x i8> @test1(<16 x i32>* %ap) { %a = load <16 x i32>* %ap %tmp = trunc <16 x i32> %a to <16 x i8> ret <16 x i8> %tmp } define <8 x i8> @test2(<8 x i32>* %ap) { %a = load <8 x i32>* %ap %tmp = trunc <8 x i32> %a to <8 x i8> ret <8 x i8> %tmp } Previously, we would generate the truly hideous: .syntax unified .section __TEXT,__text,regular,pure_instructions .globl _test1 .align 2 _test1: @ @test1 @ BB#0: push {r7} mov r7, sp sub sp, sp, #20 bic sp, sp, #7 add r1, r0, #48 add r2, r0, #32 vld1.64 {d24, d25}, [r0:128] vld1.64 {d16, d17}, [r1:128] vld1.64 {d18, d19}, [r2:128] add r1, r0, #16 vmovn.i32 d22, q8 vld1.64 {d16, d17}, [r1:128] vmovn.i32 d20, q9 vmovn.i32 d18, q12 vmov.u16 r0, d22[3] strb r0, [sp, #15] vmov.u16 r0, d22[2] strb r0, [sp, #14] vmov.u16 r0, d22[1] strb r0, [sp, #13] vmov.u16 r0, d22[0] vmovn.i32 d16, q8 strb r0, [sp, #12] vmov.u16 r0, d20[3] strb r0, [sp, #11] vmov.u16 r0, d20[2] strb r0, [sp, #10] vmov.u16 r0, d20[1] strb r0, [sp, #9] vmov.u16 r0, d20[0] strb r0, [sp, #8] vmov.u16 r0, d18[3] strb r0, [sp, #3] vmov.u16 r0, d18[2] strb r0, [sp, #2] vmov.u16 r0, d18[1] strb r0, [sp, #1] vmov.u16 r0, d18[0] strb r0, [sp] vmov.u16 r0, d16[3] strb r0, [sp, #7] vmov.u16 r0, d16[2] strb r0, [sp, #6] vmov.u16 r0, d16[1] strb r0, [sp, #5] vmov.u16 r0, d16[0] strb r0, [sp, #4] vldmia sp, {d16, d17} vmov r0, r1, d16 vmov r2, r3, d17 mov sp, r7 pop {r7} bx lr .globl _test2 .align 2 _test2: @ @test2 @ BB#0: push {r7} mov r7, sp sub sp, sp, #12 bic sp, sp, #7 vld1.64 {d16, d17}, [r0:128] add r0, r0, #16 vld1.64 {d20, d21}, [r0:128] vmovn.i32 d18, q8 vmov.u16 r0, d18[3] vmovn.i32 d16, q10 strb r0, [sp, #3] vmov.u16 r0, d18[2] strb r0, [sp, #2] vmov.u16 r0, d18[1] strb r0, [sp, #1] vmov.u16 r0, d18[0] strb r0, [sp] vmov.u16 r0, d16[3] strb r0, [sp, #7] vmov.u16 r0, d16[2] strb r0, [sp, #6] vmov.u16 r0, d16[1] strb r0, [sp, #5] vmov.u16 r0, d16[0] strb r0, [sp, #4] ldm sp, {r0, r1} mov sp, r7 pop {r7} bx lr Now, however, we generate the much more straightforward: .syntax unified .section __TEXT,__text,regular,pure_instructions .globl _test1 .align 2 _test1: @ @test1 @ BB#0: add r1, r0, #48 add r2, r0, #32 vld1.64 {d20, d21}, [r0:128] vld1.64 {d16, d17}, [r1:128] add r1, r0, #16 vld1.64 {d18, d19}, [r2:128] vld1.64 {d22, d23}, [r1:128] vmovn.i32 d17, q8 vmovn.i32 d16, q9 vmovn.i32 d18, q10 vmovn.i32 d19, q11 vmovn.i16 d17, q8 vmovn.i16 d16, q9 vmov r0, r1, d16 vmov r2, r3, d17 bx lr .globl _test2 .align 2 _test2: @ @test2 @ BB#0: vld1.64 {d16, d17}, [r0:128] add r0, r0, #16 vld1.64 {d18, d19}, [r0:128] vmovn.i32 d16, q8 vmovn.i32 d17, q9 vmovn.i16 d16, q8 vmov r0, r1, d16 bx lr git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179989 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-21	ARM: Use ldrd/strd to spill 64-bit pairs when available.	Tim Northover
	This allows common sp-offsets to be part of the instruction and is probably faster on modern CPUs too. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179977 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-20	ARM: don't add FrameIndex offset for LDMIA (has no immediate)	Tim Northover
	Previously, when spilling 64-bit paired registers, an LDMIA with both a FrameIndex and an offset was produced. This kind of instruction shouldn't exist, and the extra operand was being confused with the predicate, causing aborts later on. This removes the invalid 0-offset from the instruction being produced. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179956 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-20	Remove unused ShouldFoldAtomicFences flag.	Tim Northover
	I think it's almost impossible to fold atomic fences profitably under LLVM/C++11 semantics. As a result, this is now unused and just cluttering up the target interface. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179940 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-20	Remove unused MEMBARRIER DAG node; it's been replaced by ATOMIC_FENCE.	Tim Northover
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179939 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-20	Add CodeGen support for functions that always return arguments via a new ↵	Stephen Lin
	parameter attribute 'returned', which is taken advantage of in target-independent tail call opportunity detection and in ARM call lowering (when placed on an integral first parameter). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179925 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-20	Test commit	Stephen Lin
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179913 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-19	Move TryToFoldFastISelLoad to FastISel, where it belongs. In general, I'm	Eli Bendersky
	trying to move as much FastISel logic as possible out of the main path in SelectionDAGISel - intermixing them just adds confusion. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179902 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-19	ArrayRefize getMachineNode(). No functionality change.	Michael Liao
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179901 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-19	ARM: Permit "sp" in ARM variant of STREXD instructions	Tim Northover
	Patch from Mihail Popa git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179854 91177308-0d34-0410-b5e6-96231b3b80d8