Enable SSE4 codegen and pattern matching.

Add some notes to the README.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@46949 91177308-0d34-0410-b5e6-96231b3b80d8
author: Nate Begeman <natebegeman@mac.com> 2008-02-11 04:19:36 +0000
committer: Nate Begeman <natebegeman@mac.com> 2008-02-11 04:19:36 +0000
commit: 14d12caf1d2de9618818646d12b30d647a860817
tree: d7bcb670b24ecb227f91407faf81ac2da765ada0 /lib/Target/X86/README-SSE.txt
parent: a6ed0aa8ec1d374857cf94f56eead7f0b775ac28
1 files changed, 20 insertions, 0 deletions
diff --git a/lib/Target/X86/README-SSE.txt b/lib/Target/X86/README-SSE.txt
index d3f91bfabc..d9a03a3610 100644
--- a/lib/Target/X86/README-SSE.txt
+++ b/lib/Target/X86/README-SSE.txt
@@ -761,3 +761,23 @@ an X86 fxor.  This means that we need to handle this case in the x86 backend
 instead of in target independent code.
 
 //===---------------------------------------------------------------------===//
+
+Non-SSE4 insert into 16 x i8 is atrociously bad.
+
+//===---------------------------------------------------------------------===//
+
+<2 x i64> extract is substantially worse than <2 x f64>, even if the destination
+is memory.
+
+//===---------------------------------------------------------------------===//
+
+SSE4 extract-to-mem ops aren't being pattern matched because of the AssertZext
+sitting between the truncate and the extract.
+
+//===---------------------------------------------------------------------===//
+
+INSERTPS can match any insert (extract, imm1), imm2 for 4 x float, and insert
+any number of 0.0 simultaneously.  Currently we only use it for simple
+insertions.
+
+See comments in LowerINSERT_VECTOR_ELT_SSE4.
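
The INSERTPS note in the patch above relies on how the instruction's 8-bit immediate is laid out. The following is a minimal scalar sketch of the register form, assuming the encoding given in Intel's instruction reference: bits 7:6 select the source element, bits 5:4 select the destination element, and bits 3:0 are a zero mask applied after the insert. The names `vec4f` and `insertps_model` are hypothetical and used only for illustration; this is not the code in LowerINSERT_VECTOR_ELT_SSE4.

```c
#include <stdint.h>

/* Hypothetical 4 x float vector type, for illustration only. */
typedef struct { float f[4]; } vec4f;

/* Scalar model of register-form INSERTPS (an assumption-labelled sketch,
 * not LLVM code):
 *   imm[7:6] = index of the element to read from src   (the "extract")
 *   imm[5:4] = index of the element to overwrite in dst (the "insert")
 *   imm[3:0] = zero mask; each set bit forces that result lane to 0.0
 */
static vec4f insertps_model(vec4f dst, vec4f src, uint8_t imm)
{
    unsigned src_idx = (imm >> 6) & 3u;
    unsigned dst_idx = (imm >> 4) & 3u;
    unsigned zmask   = imm & 0xFu;
    vec4f out = dst;

    /* Copy the selected source lane into the selected destination lane. */
    out.f[dst_idx] = src.f[src_idx];

    /* Apply the zero mask last, so it can also clear the inserted lane. */
    for (unsigned i = 0; i < 4; ++i) {
        if (zmask & (1u << i))
            out.f[i] = 0.0f;
    }
    return out;
}
```

For example, with dst = {1, 2, 3, 4} and src = {5, 6, 7, 8}, an immediate of 0x98 reads source lane 2, writes it to destination lane 1, and zeroes lane 3, giving {1, 7, 3, 0}. A single instruction therefore covers an extract, an insert, and several zeroed lanes, which is the generality the note says is not yet exploited.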