<feed xmlns='http://www.w3.org/2005/Atom'>
<title>llvm/test/Analysis/CostModel/X86, branch testing</title>
<subtitle>http://llvm.org</subtitle>
<id>https://git.amat.us/llvm/atom/test/Analysis/CostModel/X86?h=testing</id>
<link rel='self' href='https://git.amat.us/llvm/atom/test/Analysis/CostModel/X86?h=testing'/>
<link rel='alternate' type='text/html' href='https://git.amat.us/llvm/'/>
<updated>2013-03-20T22:01:10Z</updated>
<entry>
<title>Correct cost model for vector shift on AVX2</title>
<updated>2013-03-20T22:01:10Z</updated>
<author>
<name>Michael Liao</name>
<email>michael.liao@intel.com</email>
</author>
<published>2013-03-20T22:01:10Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/llvm/commit/?id=f74e9bf650d7c40d595d3bb60e3c901e2bccec4b'/>
<id>urn:sha1:f74e9bf650d7c40d595d3bb60e3c901e2bccec4b</id>
<content type='text'>
- After moving the logic that recognizes vector shifts with a scalar amount
  from DAG combining into DAG lowering, we mark all vector shifts as custom
  lowered, even where the vector shift is legal on AVX. As a result, the cost
  model needs special tuning to identify these legal cases.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177586 91177308-0d34-0410-b5e6-96231b3b80d8
</content>
</entry>
<entry>
<title>Optimize sext &lt;4 x i8&gt; and &lt;4 x i16&gt; to &lt;4 x i64&gt;.</title>
<updated>2013-03-19T18:38:27Z</updated>
<author>
<name>Nadav Rotem</name>
<email>nrotem@apple.com</email>
</author>
<published>2013-03-19T18:38:27Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/llvm/commit/?id=b05130e1b20ed17ae9d5ab3351933babd27213e1'/>
<id>urn:sha1:b05130e1b20ed17ae9d5ab3351933babd27213e1</id>
<content type='text'>
Patch by Ahmad, Muhammad T &lt;muhammad.t.ahmad@intel.com&gt;



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177421 91177308-0d34-0410-b5e6-96231b3b80d8
</content>
</entry>
<entry>
<title>X86 cost model: Adjust cost for custom lowered vector multiplies</title>
<updated>2013-03-02T04:02:52Z</updated>
<author>
<name>Arnold Schwaighofer</name>
<email>aschwaighofer@apple.com</email>
</author>
<published>2013-03-02T04:02:52Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/llvm/commit/?id=5f0d9dbdf48a9efe16bfadf88e5335f7b9a8ec3f'/>
<id>urn:sha1:5f0d9dbdf48a9efe16bfadf88e5335f7b9a8ec3f</id>
<content type='text'>
This matters, for example, in the following matrix multiply:

int **mmult(int rows, int cols, int **m1, int **m2, int **m3) {
  int i, j, k, val;
  for (i=0; i&lt;rows; i++) {
    for (j=0; j&lt;cols; j++) {
      val = 0;
      for (k=0; k&lt;cols; k++) {
        val += m1[i][k] * m2[k][j];
      }
      m3[i][j] = val;
    }
  }
  return(m3);
}

Taken from the test-suite benchmark Shootout.

We estimated the cost of the multiply to be 2, but we generate 9 instructions
for it, and the vectorized code ends up quite a bit slower than the scalar
version (48% on my machine).

Also, properly differentiate between AVX1 and AVX2. On AVX1 we still split the
vector into two 128-bit halves and handle the subvector muls as above, with 9
instructions each.
Only on AVX2 will we have a cost of 9 for v4i64.

I changed the test case in test/Transforms/LoopVectorize/X86/avx1.ll to use an
add instead of a mul, because with a mul we no longer vectorize. I verified
with three kernels that the mul is indeed more expensive when vectorized:

for (i ...)
   r += a[i] * 3;
for (i ...)
  m1[i] = m1[i] * 3; // This matches the test case in avx1.ll
and a matrix multiply.

In each case the vectorized version was considerably slower.

radar://13304919

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@176403 91177308-0d34-0410-b5e6-96231b3b80d8
</content>
</entry>
<entry>
<title>Cost model support for lowered math builtins.</title>
<updated>2013-02-28T19:09:33Z</updated>
<author>
<name>Benjamin Kramer</name>
<email>benny.kra@googlemail.com</email>
</author>
<published>2013-02-28T19:09:33Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/llvm/commit/?id=8611d4449a77ca05e808823bc966573a85da00cb'/>
<id>urn:sha1:8611d4449a77ca05e808823bc966573a85da00cb</id>
<content type='text'>
We make the cost of calling libm functions extremely high, because emitting the
calls is expensive and causes spills (on x86), so performance suffers. We still
vectorize important calls such as fabs, and ceilf and friends on SSE4.1.

Differential Revision: http://llvm-reviews.chandlerc.com/D466

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@176287 91177308-0d34-0410-b5e6-96231b3b80d8
</content>
</entry>
<entry>
<title>I optimized the following patterns:</title>
<updated>2013-02-20T12:42:54Z</updated>
<author>
<name>Elena Demikhovsky</name>
<email>elena.demikhovsky@intel.com</email>
</author>
<published>2013-02-20T12:42:54Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/llvm/commit/?id=52981c4b6016d9f0e295e0771ec0a50dd073b4b3'/>
<id>urn:sha1:52981c4b6016d9f0e295e0771ec0a50dd073b4b3</id>
<content type='text'>
 sext &lt;4 x i1&gt; to &lt;4 x i64&gt;
 sext &lt;4 x i8&gt; to &lt;4 x i64&gt;
 sext &lt;4 x i16&gt; to &lt;4 x i64&gt;
 
I run Combine on SIGN_EXTEND_IN_REG and revert the SEXT patterns:
 (sext_in_reg (v4i64 anyext (v4i32 x )), ExtraVT) -&gt; (v4i64 sext (v4i32 sext_in_reg (v4i32 x , ExtraVT)))
 
 The sext_in_reg (v4i32 x) may be lowered to shl+sar operations.
 The "sar" does not exist for 64-bit operations, so lowering sext_in_reg (v4i64 x) has no vector solution.

I also added the cost of these operations to the AVX cost table.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175619 91177308-0d34-0410-b5e6-96231b3b80d8
</content>
</entry>
<entry>
<title>ARM cost model: Address computation in vector mem ops not free</title>
<updated>2013-02-08T14:50:48Z</updated>
<author>
<name>Arnold Schwaighofer</name>
<email>aschwaighofer@apple.com</email>
</author>
<published>2013-02-08T14:50:48Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/llvm/commit/?id=fb55a8fd7c38aa09d9c243d48a8a72d890f36a3d'/>
<id>urn:sha1:fb55a8fd7c38aa09d9c243d48a8a72d890f36a3d</id>
<content type='text'>
Adds a function to target transform info to query the cost of address
computation. The cost model analysis pass now also queries this interface.
The code in LoopVectorize adds the cost of address computation as part of the
memory instruction cost calculation. Only there do we know whether the
instruction will be scalarized or not.
Increase the penalty for inserting into D registers on Swift. This becomes
necessary because we now always assume that address computation has a cost, and
three is a closer value to the architecture.

radar://13097204

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174713 91177308-0d34-0410-b5e6-96231b3b80d8
</content>
</entry>
<entry>
<title>We are not ready to estimate the cost of integer expansions based on the number of parts. This test is too noisy.</title>
<updated>2012-12-23T09:11:07Z</updated>
<author>
<name>Nadav Rotem</name>
<email>nrotem@apple.com</email>
</author>
<published>2012-12-23T09:11:07Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/llvm/commit/?id=f85ec865f0f803273ab38e3b1a19fe185c7e88ac'/>
<id>urn:sha1:f85ec865f0f803273ab38e3b1a19fe185c7e88ac</id>
<content type='text'>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170999 91177308-0d34-0410-b5e6-96231b3b80d8
</content>
</entry>
<entry>
<title>Improve the X86 cost model for loads and stores.</title>
<updated>2012-12-21T01:33:59Z</updated>
<author>
<name>Nadav Rotem</name>
<email>nrotem@apple.com</email>
</author>
<published>2012-12-21T01:33:59Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/llvm/commit/?id=f5637c399711e37287e01f9d9ca9ce7cd2f3d14f'/>
<id>urn:sha1:f5637c399711e37287e01f9d9ca9ce7cd2f3d14f</id>
<content type='text'>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170830 91177308-0d34-0410-b5e6-96231b3b80d8
</content>
</entry>
<entry>
<title>Reverse order of checking SSE level when calculating compare cost, so we check</title>
<updated>2012-12-18T22:57:56Z</updated>
<author>
<name>Jakub Staszak</name>
<email>kubastaszak@gmail.com</email>
</author>
<published>2012-12-18T22:57:56Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/llvm/commit/?id=270bfbd3d1fb42000b23e5747ac7957b0e9fcab8'/>
<id>urn:sha1:270bfbd3d1fb42000b23e5747ac7957b0e9fcab8</id>
<content type='text'>
AVX2 before AVX.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170464 91177308-0d34-0410-b5e6-96231b3b80d8
</content>
</entry>
<entry>
<title>Cost Model: change the default cost of control flow instructions (br / ret / ...) to zero.</title>
<updated>2012-12-05T21:21:26Z</updated>
<author>
<name>Nadav Rotem</name>
<email>nrotem@apple.com</email>
</author>
<published>2012-12-05T21:21:26Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/llvm/commit/?id=0602bb46596485b43fa8b86abbc8485d502025ce'/>
<id>urn:sha1:0602bb46596485b43fa8b86abbc8485d502025ce</id>
<content type='text'>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169423 91177308-0d34-0410-b5e6-96231b3b80d8
</content>
</entry>
</feed>
