llvm/test/Transforms/SROA, branch stable

http://llvm.org
https://git.amat.us/llvm/atom/test/Transforms/SROA?h=stable
2013-03-14T11:32:24Z

PR14972: SROA vs. GVN exposed a really bad bug in SROA.
Chandler Carruth <chandlerc@gmail.com>, 2013-03-14T11:32:24Z
urn:sha1:41b55f5556d1332934cefa7c14862313eb87fa29

The fundamental problem is that SROA didn't allow for overly wide loads where the bits past the end of the alloca were masked away and the load was sufficiently aligned to ensure there is no risk of page fault, or other trapping behavior. With such widened loads, SROA would delete the load entirely rather than clamping it to the size of the alloca in order to allow mem2reg to fire.

This was exposed by a test case that neatly arranged for GVN to run first, widening certain loads, followed by an inline step, and then SROA which miscompiles the code. However, I see no reason why this hasn't been plaguing us in other contexts. It seems deeply broken.

Diagnosing all of the above took all of 10 minutes of debugging. The really annoying aspect is that fixing this completely breaks the pass. ;]

There was an implicit reliance on the fact that no loads or stores extended past the alloca once we decided to rewrite them in the final stage of SROA. This was used to encode information about whether the loads and stores had been split across multiple partitions of the original alloca. That required threading explicit tracking of whether a *use* of a partition is split across multiple partitions.

Once that was done, another problem arose: we allowed splitting of integer loads and stores iff they were loads and stores to the entire alloca. This is a really arbitrary limitation, and splitting at least some integer loads and stores is crucial to maximize promotion opportunities.

My first attempt was to start removing the restriction entirely, but currently that does Very Bad Things by causing *many* common alloca patterns to be fully decomposed into i8 operations and lots of or-ing together to produce larger integers on demand. The code bloat is terrifying. That is still the right end-goal, but substantial work must be done to either merge partitions or ensure that small i8 values are eagerly merged in some other pass. Sadly, figuring all this out took essentially all the time and effort here.

So the end result is that we allow splitting only when the load or store at least covers the alloca. That ensures widened loads and stores don't hurt SROA, and that we don't rampantly decompose operations more than we have previously.

All of this was already fairly well tested, and so I've just updated the tests to cover the wide load behavior. I can add a test that crafts the pass ordering magic which caused the original PR, but that seems really brittle and to provide little benefit. The fundamental problem is that widened loads should Just Work.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177055 91177308-0d34-0410-b5e6-96231b3b80d8

Rename the test so that we can add additional vectors-of-pointers tests into the same file in the future.
Nadav Rotem <nrotem@apple.com>, 2012-12-18T05:50:54Z
urn:sha1:aaf3b420b7bc35e52501cc0398dcc294040e7523
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170414 91177308-0d34-0410-b5e6-96231b3b80d8

SROA: Replace calls to getScalarSizeInBits to DataLayout's API because getScalarSizeInBits could not handle vectors of pointers.
Nadav Rotem <nrotem@apple.com>, 2012-12-18T05:23:31Z
urn:sha1:e21708e4aaf741d0c9ccb5a5ddc75738fea7b61f
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170412 91177308-0d34-0410-b5e6-96231b3b80d8

Fix another SROA crasher, PR14601.
Chandler Carruth <chandlerc@gmail.com>, 2012-12-17T18:48:07Z
urn:sha1:b0de1e31d11056037c4db3e2ecfe1547e85c3e1c

This was a silly oversight: we weren't pruning allocas which were used by variable-length memory intrinsics from the set that could be widened and promoted as integers. Fix that.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170353 91177308-0d34-0410-b5e6-96231b3b80d8

Teach the rewriting of memcpy calls to support subvector copies.
Chandler Carruth <chandlerc@gmail.com>, 2012-12-17T14:51:24Z
urn:sha1:99a54942ae0fb6fdca03e91b2e492e9738fa4436

This also cleans up a bit of the memcpy call rewriting by sinking some irrelevant code further down and making the call-emitting code a bit more concrete.

Previously, memcpy of a subvector would actually miscompile (!!!) the copy into a single vector element copy. I have no idea how this ever worked. =/ This is the memcpy half of PR14478 which we probably weren't noticing previously because it didn't actually assert.

The rewrite relies on the newly refactored insert- and extractVector functions to do the heavy lifting, and those are the same as used for loads and stores which makes the test coverage a bit more meaningful here.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170338 91177308-0d34-0410-b5e6-96231b3b80d8

Fix a secondary bug I introduced while fixing the first part of PR14478.
Chandler Carruth <chandlerc@gmail.com>, 2012-12-17T14:03:01Z
urn:sha1:8bbff2348d378192b332db38394498d83ed4feeb

The first half of fixing this bug was actually in r170328, but was entirely coincidental. It did however get me to realize the nature of the bug, and adapt the test case to test more interesting behavior. In turn, that uncovered the rest of the bug which I've fixed here.

This should fix two new asserts that showed up in the vectorize nightly tester.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170333 91177308-0d34-0410-b5e6-96231b3b80d8

Fix the first part of PR14478: memset now works.
Chandler Carruth <chandlerc@gmail.com>, 2012-12-17T04:07:37Z
urn:sha1:17c84ea594c6f10cb13c84ebe765b54f234c82ef

PR14478 highlights a serious problem in SROA that simply wasn't being exercised due to a lack of vector input code mixed with C-library function calls. Part of SROA was written carefully to handle subvector accesses via memset and memcpy, but the rewriter never grew support for this. Fixing it required refactoring the subvector access code in other parts of SROA so it could be shared, and then fixing the splat formation logic and using subvector insertion (this patch).

The PR isn't quite fixed yet, as memcpy is still broken in the same way. I'm starting on that series of patches now.

Hopefully this will be enough to bring the bullet benchmark back to life with the bb-vectorizer enabled, but that may require fixing memcpy as well.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170301 91177308-0d34-0410-b5e6-96231b3b80d8

Add a corollary test for PR14572. We got this code path correct already.
Chandler Carruth <chandlerc@gmail.com>, 2012-12-15T09:31:54Z
urn:sha1:d12de955856204db4cabdd9bcabc82c22d0e85f2
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170271 91177308-0d34-0410-b5e6-96231b3b80d8

Relax an overly aggressive assert to fix PR14572.
Chandler Carruth <chandlerc@gmail.com>, 2012-12-15T09:26:06Z
urn:sha1:19820053fe46dbc91c43edb80a693fa6aae09251

The alloca width is based on the alloc size, not the type size.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170270 91177308-0d34-0410-b5e6-96231b3b80d8

Fix typo in test-case.
Jakub Staszak <kubastaszak@gmail.com>, 2012-12-12T20:29:06Z
urn:sha1:728fbdb79610113765304a3967d45daa5a041664
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170015 91177308-0d34-0410-b5e6-96231b3b80d8
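
The PR14972 entry at the top of this log ends with a rule: an integer load or store may be split only when it at least covers the alloca, and an overly wide load is clamped to the alloca's size rather than deleted. The sketch below is a rough Python model of that rule only. It is not SROA's actual code (the pass is written in C++), and the names `Access`, `is_splittable`, and `clamp_to_alloca` are hypothetical, introduced here for illustration. Accesses are assumed to be half-open byte ranges measured from the start of the alloca.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Access:
    """Hypothetical model of an integer load or store.

    Covers the half-open byte range [begin, end), with offsets
    measured from the start of the alloca.
    """
    begin: int
    end: int


def is_splittable(access: Access, alloca_size: int) -> bool:
    """Rule as stated in the commit message: allow splitting only when
    the access covers at least the whole alloca."""
    return access.begin <= 0 and access.end >= alloca_size


def clamp_to_alloca(access: Access, alloca_size: int) -> Access:
    """Trim an overly wide access down to the bytes inside the alloca."""
    return Access(max(access.begin, 0), min(access.end, alloca_size))


# A 3-byte alloca read by a widened 4-byte load: the load covers the
# alloca, so it may be split, and it clamps to bytes [0, 3).
wide_load = Access(0, 4)
# A 2-byte load of the same alloca covers only part of it.
narrow_load = Access(0, 2)
```

Under these assumptions, `is_splittable(wide_load, 3)` holds and `clamp_to_alloca(wide_load, 3)` yields `Access(0, 3)`, while `narrow_load` is left unsplit, which matches the message's stated aim of not decomposing operations more than before.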