<feed xmlns='http://www.w3.org/2005/Atom'>
<title>llvm/lib/Target/PowerPC, branch testing</title>
<subtitle>http://llvm.org</subtitle>
<id>https://git.amat.us/llvm/atom/lib/Target/PowerPC?h=testing</id>
<link rel='self' href='https://git.amat.us/llvm/atom/lib/Target/PowerPC?h=testing'/>
<link rel='alternate' type='text/html' href='https://git.amat.us/llvm/'/>
<updated>2013-03-21T23:45:03Z</updated>
<entry>
<title>Remove the G8RC_NOX0_and_GPRC_NOR0 PPC register class</title>
<updated>2013-03-21T23:45:03Z</updated>
<author>
<name>Hal Finkel</name>
<email>hfinkel@anl.gov</email>
</author>
<published>2013-03-21T23:45:03Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/llvm/commit/?id=7697370adff8983e2a3de493362f0d8c9f9b0e17'/>
<id>urn:sha1:7697370adff8983e2a3de493362f0d8c9f9b0e17</id>
<content type='text'>
As Jakob pointed out in his review of r177423, having a shared ZERO
register between the 32- and 64-bit register classes causes this
odd G8RC_NOX0_and_GPRC_NOR0 class to be created. As recommended,
this adds a ZERO8 register which differentiates the 32- and 64-bit
zeros.
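
As a toy illustration (plain Python sets, not the LLVM/TableGen API; the register names are hypothetical), this sketches why one register shared across otherwise-disjoint classes forces an extra synthesized class into existence, and why a distinct ZERO8 makes that class go away:

```python
# Toy model: TableGen synthesizes a register class for every nonempty
# intersection of register classes, named with an "A_and_B" scheme.

def synthesized(classes):
    # Return the nonempty pairwise intersections, keyed by name.
    names = sorted(classes)
    out = {}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            common = classes[a].intersection(classes[b])
            if common:
                out[a + "_and_" + b] = common
    return out

# Sharing a single ZERO register between the 32- and 64-bit classes:
shared = {"GPRC_NOR0": {"R3", "R4", "ZERO"},
          "G8RC_NOX0": {"X3", "X4", "ZERO"}}
assert synthesized(shared) == {"G8RC_NOX0_and_GPRC_NOR0": {"ZERO"}}

# Splitting off a distinct ZERO8 makes the intersection empty again:
split = {"GPRC_NOR0": {"R3", "R4", "ZERO"},
         "G8RC_NOX0": {"X3", "X4", "ZERO8"}}
assert synthesized(split) == {}
```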

No functionality change intended.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177683 91177308-0d34-0410-b5e6-96231b3b80d8
</content>
</entry>
<entry>
<title>Fix a register-class comparison bug in PPCCTRLoops</title>
<updated>2013-03-21T23:23:34Z</updated>
<author>
<name>Hal Finkel</name>
<email>hfinkel@anl.gov</email>
</author>
<published>2013-03-21T23:23:34Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/llvm/commit/?id=3ea1b064a0b9c3d161b0f77a9e957970f98907ab'/>
<id>urn:sha1:3ea1b064a0b9c3d161b0f77a9e957970f98907ab</id>
<content type='text'>
Thanks to Jakob for isolating the underlying problem from the
test case in r177423. The original commit had introduced
asymmetric copy operations, but these turned out to be a work-around
for the real problem (the use of == instead of hasSubClassEq in PPCCTRLoops).
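
As a toy model (plain Python sets, not the llvm::TargetRegisterClass API; the class contents are hypothetical), this sketches why an exact == comparison is the wrong check when a subclass would also be acceptable:

```python
# hasSubClassEq asks "is this the same class or a subclass of it?",
# which is the right question when any register of a subclass works.

def has_sub_class_eq(rc, other):
    # True when other is rc itself or a subclass (modeled as a subset).
    return other.issubset(rc)

GPRC      = {"R0", "R3", "R4"}   # hypothetical 32-bit GPR class
GPRC_NOR0 = {"R3", "R4"}         # restricted subclass without R0

# An exact == comparison wrongly rejects the legal subclass:
assert not (GPRC_NOR0 == GPRC)
# hasSubClassEq accepts the class itself and its subclasses:
assert has_sub_class_eq(GPRC, GPRC)
assert has_sub_class_eq(GPRC, GPRC_NOR0)
```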

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177679 91177308-0d34-0410-b5e6-96231b3b80d8
</content>
</entry>
<entry>
<title>Implement builtin_{setjmp/longjmp} on PPC</title>
<updated>2013-03-21T21:37:52Z</updated>
<author>
<name>Hal Finkel</name>
<email>hfinkel@anl.gov</email>
</author>
<published>2013-03-21T21:37:52Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/llvm/commit/?id=7ee74a663a3b4d4ee6b55d23362f347ed1d390c2'/>
<id>urn:sha1:7ee74a663a3b4d4ee6b55d23362f347ed1d390c2</id>
<content type='text'>
This implements SJLJ lowering on PPC, making the Clang functions
__builtin_{setjmp/longjmp} functional on PPC platforms. The implementation
strategy is similar to that on X86, with the exception that a branch-and-link
variant is used to get the right jump address. Credit goes to Bill Schmidt for
suggesting the use of the unconditional bcl form (instead of the regular bl
instruction) to limit return-address-cache pollution.

Benchmarking the speed at -O3 of:

static jmp_buf env_sigill;

void foo() {
        __builtin_longjmp(env_sigill, 1);
}

int main() {
        ...

        for (int i = 0; i &lt; c; ++i) {
                if (__builtin_setjmp(env_sigill)) {
                        goto done;
                } else {
                        foo();
                }

done:;
        }

        ...
}

vs. the same code using the libc setjmp/longjmp functions on a P7 shows that
this builtin implementation is ~4x faster with Altivec enabled and ~7.25x
faster with Altivec disabled. This comparison is somewhat unfair because the
libc version must also save/restore the VSX registers which we don't yet
support.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177666 91177308-0d34-0410-b5e6-96231b3b80d8
</content>
</entry>
<entry>
<title>Add support for spilling VRSAVE on PPC</title>
<updated>2013-03-21T19:03:21Z</updated>
<author>
<name>Hal Finkel</name>
<email>hfinkel@anl.gov</email>
</author>
<published>2013-03-21T19:03:21Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/llvm/commit/?id=10f7f2a222d0e83dc0c33ad506a7686190c2f7a2'/>
<id>urn:sha1:10f7f2a222d0e83dc0c33ad506a7686190c2f7a2</id>
<content type='text'>
Although there is only one Altivec VRSAVE register, it is a member of
a register class, and we need the ability to spill it. Because this
register is normally callee-preserved and handled by special code, this
has never before been necessary. However, this capability will be required by
a forthcoming commit adding SjLj support.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177654 91177308-0d34-0410-b5e6-96231b3b80d8
</content>
</entry>
<entry>
<title>Correct PPC FRAMEADDR lowering using a pseudo-register</title>
<updated>2013-03-21T19:03:19Z</updated>
<author>
<name>Hal Finkel</name>
<email>hfinkel@anl.gov</email>
</author>
<published>2013-03-21T19:03:19Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/llvm/commit/?id=e9cc0a09ae38c87b1b26a44f5e32222ede4f84e6'/>
<id>urn:sha1:e9cc0a09ae38c87b1b26a44f5e32222ede4f84e6</id>
<content type='text'>
The old code for lowering FRAMEADDR tried to replicate the logic in the real
frame-lowering code that determines whether or not the frame pointer (r31) will
be used. When it seemed as though the frame pointer would not be used, the
stack pointer (r1) was used instead. Unfortunately, because the stack size is
not yet known at that point, this does not work. Instead, this change
introduces new always-reserved pseudo-registers (FP and FP8) that are replaced
during prologue insertion with the real frame-pointer register (either r1 or r31).
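
As a toy sketch of the deferral (hypothetical function names and a string-rewriting stand-in for MachineInstr rewriting, not the real PPC backend code): FRAMEADDR is lowered to the reserved pseudo-register immediately, and the real register is substituted only once the final frame layout is known:

```python
def lower_frameaddr():
    # Too early to know the frame size, so emit the FP pseudo-register.
    return ["mr r3, FP"]

def insert_prologue(insns, frame_pointer_used):
    # By prologue insertion the frame layout is final, so the choice
    # between r31 (frame pointer) and r1 (stack pointer) is safe.
    real = "r31" if frame_pointer_used else "r1"
    return [i.replace("FP", real) for i in insns]

insns = lower_frameaddr()
assert insert_prologue(insns, True) == ["mr r3, r31"]
assert insert_prologue(insns, False) == ["mr r3, r1"]
```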

It is important that this intrinsic always return a valid frame address because
it is used by Clang to store the frame address as part of code generation for
__builtin_setjmp.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177653 91177308-0d34-0410-b5e6-96231b3b80d8
</content>
</entry>
<entry>
<title>Add missing mayLoad flag to LHAUX8 and LWAUX.</title>
<updated>2013-03-19T19:53:27Z</updated>
<author>
<name>Ulrich Weigand</name>
<email>ulrich.weigand@de.ibm.com</email>
</author>
<published>2013-03-19T19:53:27Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/llvm/commit/?id=dff4d1522a3a14df3c40c33421e24f59633da67b'/>
<id>urn:sha1:dff4d1522a3a14df3c40c33421e24f59633da67b</id>
<content type='text'>
All pre-increment load patterns need to set the mayLoad flag (since
they don't provide a DAG pattern).

The flag was missing for LHAUX8 and LWAUX; this patch adds it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177431 91177308-0d34-0410-b5e6-96231b3b80d8
</content>
</entry>
<entry>
<title>Rewrite LHAU8 pattern to use standard memory operand.</title>
<updated>2013-03-19T19:52:30Z</updated>
<author>
<name>Ulrich Weigand</name>
<email>ulrich.weigand@de.ibm.com</email>
</author>
<published>2013-03-19T19:52:30Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/llvm/commit/?id=8353d1e0e5fd23bb9b6c11acda8157d728d89223'/>
<id>urn:sha1:8353d1e0e5fd23bb9b6c11acda8157d728d89223</id>
<content type='text'>
As opposed to the pre-increment store patterns, the pre-increment
load patterns were already using standard memory operands, with
the sole exception of LHAU8.

As there's no real reason why LHAU8 should be different here,
this patch simply rewrites the pattern to also use a memri
operand, just like all the other patterns.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177430 91177308-0d34-0410-b5e6-96231b3b80d8
</content>
</entry>
<entry>
<title>Rewrite pre-increment store patterns to use standard memory operands.</title>
<updated>2013-03-19T19:52:04Z</updated>
<author>
<name>Ulrich Weigand</name>
<email>ulrich.weigand@de.ibm.com</email>
</author>
<published>2013-03-19T19:52:04Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/llvm/commit/?id=5882e3d82831710a7ea1fe8de4813350d4eecf05'/>
<id>urn:sha1:5882e3d82831710a7ea1fe8de4813350d4eecf05</id>
<content type='text'>
Currently, pre-increment store patterns are written to use two separate
operands to represent address base and displacement:

  stwu $rS, $ptroff($ptrreg)

This causes problems when implementing the assembler parser, so this
commit changes the patterns to use standard (complex) memory operands
like in all other memory access instruction patterns:

  stwu $rS, $dst

To still match those instructions against the appropriate pre_store
SelectionDAG nodes, the patch uses the new feature that allows a Pat
to match multiple DAG operands against a single (complex) instruction
operand.
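
As a toy sketch of why this helps the assembler parser (a hypothetical parse_memri helper, not the LLVM MC API): with a single complex memri operand, the whole "disp(base)" syntax is handled in one place instead of being split across two unrelated operands:

```python
import re

def parse_memri(text):
    # Parse a "disp(reg)" memory operand into one (disp, base) pair.
    m = re.fullmatch(r"(-?\d+)\((r\d+)\)", text.strip())
    if m is None:
        raise ValueError("not a memri operand: " + text)
    return int(m.group(1)), m.group(2)

# stwu $rS, $dst, with $dst a single memri operand:
assert parse_memri("8(r1)") == (8, "r1")
assert parse_memri("-16(r31)") == (-16, "r31")
```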

Approved by Hal Finkel.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177429 91177308-0d34-0410-b5e6-96231b3b80d8
</content>
</entry>
<entry>
<title>Fix sub-operand size mismatch in tocentry operands.</title>
<updated>2013-03-19T19:50:30Z</updated>
<author>
<name>Ulrich Weigand</name>
<email>ulrich.weigand@de.ibm.com</email>
</author>
<published>2013-03-19T19:50:30Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/llvm/commit/?id=880d82e3dbf8ae6c2babf5943d524bbe25015eba'/>
<id>urn:sha1:880d82e3dbf8ae6c2babf5943d524bbe25015eba</id>
<content type='text'>
The tocentry operand class refers to 64-bit values (it is only used in 64-bit
mode, where iPTR is a 64-bit type), but its sole suboperand is designated as a
32-bit type. This causes a mismatch to be detected at compile-time with the TableGen
patch I'll check in shortly.

To fix this, this commit changes the suboperand to a 64-bit type as well.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177427 91177308-0d34-0410-b5e6-96231b3b80d8
</content>
</entry>
<entry>
<title>Prepare to make r0 an allocatable register on PPC</title>
<updated>2013-03-19T18:51:05Z</updated>
<author>
<name>Hal Finkel</name>
<email>hfinkel@anl.gov</email>
</author>
<published>2013-03-19T18:51:05Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/llvm/commit/?id=a548afc98fd4c61a8dfdd550ba57c37f2cfe3ed9'/>
<id>urn:sha1:a548afc98fd4c61a8dfdd550ba57c37f2cfe3ed9</id>
<content type='text'>
Currently the PPC r0 register is unconditionally reserved. There are two reasons
for this:

 1. r0 is treated specially (as the constant 0) by certain instructions, and so
    cannot be used with those instructions as a regular register.

 2. r0 is used as a temporary register in the CR-register spilling process
    (where, under some circumstances, we require two GPRs).

This change addresses the first reason by introducing a restricted register
class (without r0) for use by those instructions that treat r0 specially. These
register classes have a new pseudo-register, ZERO, which represents the r0-as-0
use. This has the side benefit of making the existing target code simpler (and
easier to understand), and will make it clear to the register allocator that
uses of r0 as 0 don't conflict with real uses of the r0 register.
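
As a toy model of the r0-as-0 special case (hypothetical names and register contents, not real backend code): an instruction like addi reads the constant 0 when r0 appears in its base position, so r0 cannot serve as an ordinary base register there, while any member of the restricted class behaves normally:

```python
def addi_result(regs, base, imm):
    # On PPC, r0 in the base position means the literal value 0,
    # not the contents of r0.
    value = 0 if base == "r0" else regs[base]
    return value + imm

GPRC_NOR0 = {"r3", "r4"}  # restricted class: any member is a safe base

regs = {"r0": 99, "r3": 10}
# Using r0 as a base silently ignores its contents:
assert addi_result(regs, "r0", 5) == 5
# A register from the restricted class behaves as expected:
assert addi_result(regs, "r3", 5) == 15
```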

Once the CR spilling code is improved, we'll be able to allocate r0.

Adding these extra register classes, for some reason unclear to me, causes
requests to the target to copy 32-bit registers to 64-bit registers. The
resulting code seems correct (and causes no test-suite failures), and the new
test case covers this new kind of asymmetric copy.

As r0 is still reserved, no functionality change intended.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177423 91177308-0d34-0410-b5e6-96231b3b80d8
</content>
</entry>
</feed>
