aboutsummaryrefslogtreecommitdiff
path: root/lib/CodeGen/SelectionDAG/LegalizeTypes.h
AgeCommit message (Collapse)Author
2013-04-21Legalize vector truncates by parts rather than just splitting.Jim Grosbach
Rather than just splitting the input type and hoping for the best, apply a bit more cleverness. Just splitting the types until the source is legal often leads to an illegal result time, which is then widened and a scalarization step is introduced which leads to truly horrible code generation. With the loop vectorizer, these sorts of operations are much more common, and so it's worth extra effort to do them well. Add a legalization hook for the operands of a TRUNCATE node, which will be encountered after the result type has been legalized, but if the operand type is still illegal. If simple splitting of both types ends up with the result type of each half still being legal, just do that (v16i16 -> v16i8 on ARM, for example). If, however, that would result in an illegal result type (v8i32 -> v8i8 on ARM, for example), we can get more clever with power-two vectors. Specifically, split the input type, but also widen the result element size, then concatenate the halves and truncate again. For example on ARM, To perform a "%res = v8i8 trunc v8i32 %in" we transform to: %inlo = v4i32 extract_subvector %in, 0 %inhi = v4i32 extract_subvector %in, 4 %lo16 = v4i16 trunc v4i32 %inlo %hi16 = v4i16 trunc v4i32 %inhi %in16 = v8i16 concat_vectors v4i16 %lo16, v4i16 %hi16 %res = v8i8 trunc v8i16 %in16 This allows instruction selection to generate three VMOVN instructions instead of a sequences of moves, stores and loads. Update the ARMTargetTransformInfo to take this improved legalization into account. Consider the simplified IR: define <16 x i8> @test1(<16 x i32>* %ap) { %a = load <16 x i32>* %ap %tmp = trunc <16 x i32> %a to <16 x i8> ret <16 x i8> %tmp } define <8 x i8> @test2(<8 x i32>* %ap) { %a = load <8 x i32>* %ap %tmp = trunc <8 x i32> %a to <8 x i8> ret <8 x i8> %tmp } Previously, we would generate the truly hideous: .syntax unified .section __TEXT,__text,regular,pure_instructions .globl _test1 .align 2 _test1: @ @test1 @ BB#0: push {r7} mov r7, sp sub sp, sp, #20 bic sp, sp, #7 add r1, r0, #48 add r2, r0, #32 vld1.64 {d24, d25}, [r0:128] vld1.64 {d16, d17}, [r1:128] vld1.64 {d18, d19}, [r2:128] add r1, r0, #16 vmovn.i32 d22, q8 vld1.64 {d16, d17}, [r1:128] vmovn.i32 d20, q9 vmovn.i32 d18, q12 vmov.u16 r0, d22[3] strb r0, [sp, #15] vmov.u16 r0, d22[2] strb r0, [sp, #14] vmov.u16 r0, d22[1] strb r0, [sp, #13] vmov.u16 r0, d22[0] vmovn.i32 d16, q8 strb r0, [sp, #12] vmov.u16 r0, d20[3] strb r0, [sp, #11] vmov.u16 r0, d20[2] strb r0, [sp, #10] vmov.u16 r0, d20[1] strb r0, [sp, #9] vmov.u16 r0, d20[0] strb r0, [sp, #8] vmov.u16 r0, d18[3] strb r0, [sp, #3] vmov.u16 r0, d18[2] strb r0, [sp, #2] vmov.u16 r0, d18[1] strb r0, [sp, #1] vmov.u16 r0, d18[0] strb r0, [sp] vmov.u16 r0, d16[3] strb r0, [sp, #7] vmov.u16 r0, d16[2] strb r0, [sp, #6] vmov.u16 r0, d16[1] strb r0, [sp, #5] vmov.u16 r0, d16[0] strb r0, [sp, #4] vldmia sp, {d16, d17} vmov r0, r1, d16 vmov r2, r3, d17 mov sp, r7 pop {r7} bx lr .globl _test2 .align 2 _test2: @ @test2 @ BB#0: push {r7} mov r7, sp sub sp, sp, #12 bic sp, sp, #7 vld1.64 {d16, d17}, [r0:128] add r0, r0, #16 vld1.64 {d20, d21}, [r0:128] vmovn.i32 d18, q8 vmov.u16 r0, d18[3] vmovn.i32 d16, q10 strb r0, [sp, #3] vmov.u16 r0, d18[2] strb r0, [sp, #2] vmov.u16 r0, d18[1] strb r0, [sp, #1] vmov.u16 r0, d18[0] strb r0, [sp] vmov.u16 r0, d16[3] strb r0, [sp, #7] vmov.u16 r0, d16[2] strb r0, [sp, #6] vmov.u16 r0, d16[1] strb r0, [sp, #5] vmov.u16 r0, d16[0] strb r0, [sp, #4] ldm sp, {r0, r1} mov sp, r7 pop {r7} bx lr Now, however, we generate the much more straightforward: .syntax unified .section __TEXT,__text,regular,pure_instructions .globl _test1 .align 2 _test1: @ @test1 @ BB#0: add r1, r0, #48 add r2, r0, #32 vld1.64 {d20, d21}, [r0:128] vld1.64 {d16, d17}, [r1:128] add r1, r0, #16 vld1.64 {d18, d19}, [r2:128] vld1.64 {d22, d23}, [r1:128] vmovn.i32 d17, q8 vmovn.i32 d16, q9 vmovn.i32 d18, q10 vmovn.i32 d19, q11 vmovn.i16 d17, q8 vmovn.i16 d16, q9 vmov r0, r1, d16 vmov r2, r3, d17 bx lr .globl _test2 .align 2 _test2: @ @test2 @ BB#0: vld1.64 {d16, d17}, [r0:128] add r0, r0, #16 vld1.64 {d18, d19}, [r0:128] vmovn.i32 d16, q8 vmovn.i32 d17, q9 vmovn.i16 d16, q8 vmov r0, r1, d16 bx lr git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179989 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-20Remove unused MEMBARRIER DAG node; it's been replaced by ATOMIC_FENCE.Tim Northover
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179939 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-03Fix PR15632: No support for ppcf128 floating-point remainder on PowerPC.Bill Schmidt
For this we need to use a libcall. Previously LLVM didn't implement libcall support for frem, so I've added it in the usual straightforward manner. A test case from the bug report is included. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178639 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-20Move SDNode order propagation to SDNodeOrdering, which also fixes a missedJustin Holewinski
case of order propagation during isel. Thanks Owen for the suggestion! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177525 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-20Propagate DAG node ordering during type legalization and instruction selectionJustin Holewinski
A node's ordering is only propagated during legalization if (a) the new node does not have an ordering (is not a CSE'd node), or (b) the new node has an ordering that is higher than the node being legalized. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177465 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-07SDAG: Handle scalarizing an extend of a <1 x iN> vector.Jim Grosbach
Just scalarize the element and rebuild a vector of the result type from that. rdar://13281568 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@176614 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-25This patch aims to reduce compile time in LegalizeTypes by using SmallDenseMap,Preston Gurd
with an initial number of elements, instead of DenseMap, which has zero initial elements, in order to avoid the copying of elements when the size changes and to avoid allocating space every time LegalizeTypes is run. This patch will not affect the memory footprint, because DenseMap will increase the element size to 64 when the first element is added. Patch by Wan Xiaofei. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173448 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-09Refactor to expose RTLIB calls to targets.Tim Northover
fp128 is almost but not quite completely illegal as a type on AArch64. As a result it needs to have a register class (for argument passing mainly), but all operations need to be lowered to runtime calls. Currently there's no way for targets to do this (without duplicating code), as the relevant functions are hidden in SelectionDAG. This patch changes that. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171971 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-04Sort includes for all of the .h files under the 'lib' tree. These wereChandler Carruth
missed in the first pass because the script didn't yet handle include guards. Note that the script is now able to handle all of these headers without manual edits. =] git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169224 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-29Teach the legalizer how to handle operands for VSELECT nodesJustin Holewinski
If we need to split the operand of a VSELECT, it must be the mask operand. We split the entire VSELECT operand with EXTRACT_SUBVECTOR. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168883 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-10Add alternative support for FP_ROUND from v2f32 to v2f64Michael Liao
- Due to the current matching vector elements constraints in ISD::FP_EXTEND, rounding from v2f32 to v2f64 is scalarized. Add a customized v2f32 widening to convert it into a target-specific X86ISD::VFPEXT to work around this constraints. This patch also reverts a previous attempt to fix this issue by recovering the scalarized ISD::FP_EXTEND pattern and thus significantly reduces the overhead of supporting non-power-2 vector FP extend. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165625 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-30Add support for FMA to WidenVectorResult.Craig Topper
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162893 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-15Fix a bug in the scalarization of BUILD_VECTOR. BUILD_VECTOR elements may be ↵Nadav Rotem
wider than the output element type. Make sure to trunc them if needed. Together with Michael Kuperstein <michael.m.kuperstein@intel.com> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160235 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-24DAG legalisation can now handle illegal fma vector types by scalarisationPete Cooper
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159092 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-16Fix a thinko in DisintegrateMERGE_VALUES. Patch by Xiaoyi Guo.Duncan Sands
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156909 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-20Register DAGUpdateListeners with SelectionDAG.Jakob Stoklund Olesen
Instead of passing listener pointers to RAUW, let SelectionDAG itself keep a linked list of interested listeners. This makes it possible to have multiple listeners active at once, like RAUWUpdateListener was already doing. It also makes it possible to register listeners up the call stack without controlling all RAUW calls below. DAGUpdateListener uses an RAII pattern to add itself to the SelectionDAG list of active listeners. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155248 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-03Add VSELECT to LegalizeVectorTypes::ScalariseVectorResult. Previously it ↵Pete Cooper
would crash if it encountered a 1 element VSELECT. Solution is slightly more complicated than just creating a SELET as we have to mask or sign extend the vector condition if it had different boolean contents from the scalar condition. Fixes <rdar://problem/11178095> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@153976 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-211. Fix the widening of SETCC in WidenVecOp_SETCC. Use the correct return CC ↵Nadav Rotem
type. 2. Fix a typo in CONCAT_VECTORS which exposed the bug in #1. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@142648 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-19Add support for the vector-widening of vselect and vector-setccNadav Rotem
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@142488 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-23Tweak the handling of MERGE_VALUES nodes: remove the need forDuncan Sands
DecomposeMERGE_VALUES to "know" that results are legalized in a particular order, by passing it the number of the result being legalized (the type legalization core provides this, it just needs to be passed on). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140373 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-23Vector-Select: Address one of the problems in pr10902. Add handling for theNadav Rotem
integer-promotion of CONCAT_VECTORS. Test: test/CodeGen/X86/widen_shuffle-1.ll This patch fixes the above tests (when running in with -promote-elements). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140372 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-15Some legalization fixes for atomic load and store.Eli Friedman
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139851 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-14Add integer promotion support for vselectNadav Rotem
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139692 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-06Add codegen support for vector select (in the IR this means a selectDuncan Sands
with a vector condition); such selects become VSELECT codegen nodes. This patch also removes VSETCC codegen nodes, unifying them with SETCC nodes (codegen was actually often using SETCC for vector SETCC already). This ensures that various DAG combiner optimizations kick in for vector comparisons. Passes dragonegg bootstrap with no testsuite regressions (nightly testsuite as well as "make check-all"). Patch mostly by Nadav Rotem. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139159 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-31Misc cleanup; addresses Duncan's comments on r138877.Eli Friedman
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@138887 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-31Fill in type legalization for MERGE_VALUES in all the various cases. Patch ↵Eli Friedman
by Micah Villmow. (No testcase because the issue only showed up in an out-of-tree backend.) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@138877 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-31Generic expansion for atomic load/store into cmpxchg/atomicrmw xchg; ↵Eli Friedman
implements 64-bit atomic load/store for ARM. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@138872 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-08Add an intrinsic and codegen support for fused multiply-accumulate. The intentCameron Zwarich
is to use this for architectures that have a native FMA instruction. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@134742 91177308-0d34-0410-b5e6-96231b3b80d8
2011-06-17Lower multiply with overflow checking to __mulo<mode>Eric Christopher
calls if we haven't been able to lower them any other way. Fixes rdar://9090077 and rdar://9210061 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@133288 91177308-0d34-0410-b5e6-96231b3b80d8
2011-06-15Enable the simplification of truncating-store after fixing the usage ofNadav Rotem
GetDemandBits (which must operate on the vector element type). Fix the a usage of getZeroExtendInReg which must also be done on scalar types. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@133052 91177308-0d34-0410-b5e6-96231b3b80d8
2011-06-06Add methods to support the integer-promotion of vector types. Methods toNadav Rotem
legalize SDNodes such as BUILD_VECTOR, EXTRACT_VECTOR_ELT, etc. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@132689 91177308-0d34-0410-b5e6-96231b3b80d8
2011-06-01Refactor LegalizeTypes: Erase LegalizeAction and make the type legalizer useNadav Rotem
the TargetLowering enum. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@132418 91177308-0d34-0410-b5e6-96231b3b80d8
2011-05-28Refactor the type legalizer. Switch TargetLowering to a new enum - ↵Nadav Rotem
LegalizeTypeAction. This patch does not change the behavior of the type legalizer. The codegen produces the same code. This infrastructural change is needed in order to enable complex decisions for vector types (needed by the vector-select patch). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@132263 91177308-0d34-0410-b5e6-96231b3b80d8
2011-05-27Refactor getActionType and getTypeToTransformTo ; place all of the 'decision'Nadav Rotem
code in one place. Re-apply 131534 and fix the multi-step promotion of integers. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@132217 91177308-0d34-0410-b5e6-96231b3b80d8
2011-05-18Revert commit 131534 since it seems to have broken several buildbots.Duncan Sands
Original log entry: Refactor getActionType and getTypeToTransformTo ; place all of the 'decision' code in one place. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@131536 91177308-0d34-0410-b5e6-96231b3b80d8
2011-05-18Refactor getActionType and getTypeToTransformTo ; place all of the 'decision'Nadav Rotem
code in one place. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@131534 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-23PR9535: add support for splitting and scalarizing vector ISD::FP_ROUND.Eli Friedman
Also cleaning up some duplicated code while I'm here. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128176 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-03Revert r123908; the code in question is completely untested and wrong.Eli Friedman
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@126964 91177308-0d34-0410-b5e6-96231b3b80d8
2011-02-14fix PR9210 by implementing some type legalization logic for Chris Lattner
vector fp conversions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125482 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-20My editor's indent went crazy. Fix.Eric Christopher
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123909 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-20Expand invalid return values for umulo and smulo. Handle these similarlyEric Christopher
to add/sub by doing the normal operation and then checking for overflow afterwards. This generally relies on the DAG handling the later invalid operations as well. Fixes the 64-bit part of rdar://8622122 and rdar://8774702. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123908 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-06Add some fairly duplicated code to let type legalization split illegalEric Christopher
typed atomics. This will lower exclusively to libcalls at the moment. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122979 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-23Renaming ISD::BIT_CONVERT to ISD::BITCAST to better reflect the LLVM IR concept.Wesley Peck
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119990 91177308-0d34-0410-b5e6-96231b3b80d8
2010-08-26implement SplitVecOp_CONCAT_VECTORS, fixing the included testcase with SSE1.Chris Lattner
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112171 91177308-0d34-0410-b5e6-96231b3b80d8
2010-08-25remove some llvmcontext arguments that are now dead post-refactoring.Chris Lattner
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112104 91177308-0d34-0410-b5e6-96231b3b80d8
2010-08-24tidy up, reduce indentationChris Lattner
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111982 91177308-0d34-0410-b5e6-96231b3b80d8
2010-06-03Implement expansion in type legalization for add/sub with overflow. TheEli Friedman
expansion is the same as that used by LegalizeDAG. The resulting code sucks in terms of performance/codesize on x86-32 for a 64-bit operation; I haven't looked into whether different expansions might be better in general. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@105378 91177308-0d34-0410-b5e6-96231b3b80d8
2010-06-01Fill in missing support for ISD::FEXP, ISD::FPOWI, and friends.Dan Gohman
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@105283 91177308-0d34-0410-b5e6-96231b3b80d8
2010-05-11I got tired of VISIBILITY_HIDDEN colliding with the gcc enum. Rename itDuncan Sands
to LLVM_LIBRARY_VISIBILITY and introduce LLVM_GLOBAL_VISIBILITY, which is the opposite, for future use by dragonegg. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@103495 91177308-0d34-0410-b5e6-96231b3b80d8
2010-04-17Use const qualifiers with TargetLowering. This eliminates severalDan Gohman
const_casts, and it reinforces the design of the Target classes being immutable. SelectionDAGISel::IsLegalToFold is now a static member function, because PIC16 uses it in an unconventional way. There is more room for API cleanup here. And PIC16's AsmPrinter no longer uses TargetLowering. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@101635 91177308-0d34-0410-b5e6-96231b3b80d8