linux/arch/tile/lib, branch v3.2.41

arch/tile: add a few #includes and an EXPORT to catch up with kernel changes.

2011-12-03T20:31:41Z

The empty_zero_page[] export is required for ZERO_PAGE() module references. The #includes are due to changes in implicit inclusion, and should of course have been in the sources from the beginning. Signed-off-by: Chris Metcalf

arch/tile: avoid exporting a symbol no longer used by gcc

2011-11-03T20:58:42Z

An earlier Tilera compiler generated calls to an "__ll_mul" function for long long multiplication. Our libgcc supported that as an alias for the normal __muldi3 routine, so we made it available to kernel modules as well. However, for a while now the compiler has internally been generating only the standard __muldi3 symbol, and the version we are giving back to the community does not have the __ll_mul alias, so we are removing it from the kernel too. Signed-off-by: Chris Metcalf

tile: revert change from to in asm files

2011-10-13T12:25:01Z

The 32-bit TILEPro support uses some #defines in for atomic support routines in assembly. To make this more explicit, I've turned those includes into includes of , which should hopefully make it clear that they shouldn't be bombed into in any cleanups. Signed-off-by: Chris Metcalf

atomic: use

2011-07-26T23:49:47Z

This allows us to move duplicated code in (atomic_inc_not_zero() for now) to Signed-off-by: Arun Sharma Reviewed-by: Eric Dumazet Cc: Ingo Molnar Cc: David Miller Cc: Eric Dumazet Acked-by: Mike Frysinger Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds

arch/tile: finish enabling support for TILE-Gx 64-bit chip

2011-05-12T19:52:12Z

This support was partially present in the existing code (look for "__tilegx__" ifdefs) but with this change you can build a working kernel using the TILE-Gx toolchain and ARCH=tilegx. Most of these files are new, generally adding a foo_64.c file where previously there was just a foo_32.c file. The ARCH=tilegx directive redirects to arch/tile, not arch/tilegx, using the existing SRCARCH mechanism in the top-level Makefile. Changes to existing files: - and changed to factor the include of in the common header. - and arch/tile/kernel/compat.c changed to remove the "const" markers I had put on compat_sys_execve() when trying to match some recent similar changes to the non-compat execve. It turns out the compat version wasn't "upgraded" to use const. - and were previously included accidentally, with the 32-bit contents. Now they have the proper 64-bit contents. Finally, I had to hack the existing hacky drivers/input/input-compat.h to add yet another "#ifdef" for INPUT_COMPAT_TEST (same as x86_64). Signed-off-by: Chris Metcalf Acked-by: Dmitry Torokhov [drivers/input]

arch/tile: disable GX prefetcher during cache flush

2011-05-04T18:40:46Z

Otherwise, it's possible to end up with the prefetcher pulling data into cache that the code believes has been flushed. Signed-off-by: Chris Metcalf

arch/tile: allow nonatomic stores to interoperate with fast atomic syscalls

2011-05-04T18:40:07Z

This semantic was already true for atomic operations within the kernel, and this change makes it true for the fast atomic syscalls (__NR_cmpxchg and __NR_atomic_update) as well. Previously, user-space had to use the fast atomic syscalls exclusively to update memory, since raw stores could lose a race with the atomic update code even when the atomic update hadn't actually modified the value. With this change, we no longer write back the value to memory if it hasn't changed. This allows certain types of idioms in user space to work as expected, e.g. "atomic exchange" to acquire a spinlock, followed by a raw store of zero to release the lock. Signed-off-by: Chris Metcalf

arch/tile: fix futex sanitization definition/prototype mismatch

2011-03-20T04:08:21Z

Commit 8d7718aa082aaf30a0b4989e1f04858952f941bc changed "int" to "u32" in the prototypes but not the definition. I missed this when I saw the patch go by on LKML. We cast "u32 *" to "int *" since we are tying into the underlying atomics framework, and atomic_t uses int as its value type. Signed-off-by: Chris Metcalf Reviewed-by: Michel Lespinasse

arch/tile: fix deadlock bugs in rwlock implementation

2011-03-10T21:10:41Z

The first issue fixed in this patch is that pending rwlock write locks could lock out new readers; this could cause a deadlock if a read lock was held on cpu 1, a write lock was then attempted on cpu 2 and was pending, and cpu 1 was interrupted and attempted to re-acquire a read lock. The write lock code was modified to not lock out new readers. The second issue fixed is that there was a narrow race window where a tns instruction had been issued (setting the lock value to "1") and the store instruction to reset the lock value correctly had not yet been issued. In this case, if an interrupt occurred and the same cpu then tried to manipulate the lock, it would find the lock value set to "1" and spin forever, assuming some other cpu was partway through updating it. The fix is to enforce an interrupt critical section around the tns/store pair. In addition, this change now arranges to always validate that after a readlock we have not wrapped around the count of readers, which is only eight bits. Since these changes make the rwlock "fast path" code heavier weight, I decided to move all the rwlock code all out of line, leaving only the conventional spinlock code with fastpath inlines. Since the read_lock and read_trylock implementations ended up very similar, I just expressed read_lock in terms of read_trylock. As part of this change I also eliminate support for the now-obsolete tns_atomic mode. Signed-off-by: Chris Metcalf

arch/tile: support 4KB page size as well as 64KB

2011-03-10T18:17:53Z

The Tilera architecture traditionally supports 64KB page sizes to improve TLB utilization and improve performance when the hardware is being used primarily to run a single application. For more generic server scenarios, it can be beneficial to run with 4KB page sizes, so this commit allows that to be specified (by modifying the arch/tile/include/hv/pagesize.h header). As part of this change, we also re-worked the PTE management slightly so that PTE writes all go through a __set_pte() function where we can do some additional validation. The set_pte_order() function was eliminated since the "order" argument wasn't being used. One bug uncovered was in the PCI DMA code, which wasn't properly flushing the specified range. This was benign with 64KB pages, but with 4KB pages we were getting some larger flushes wrong. The per-cpu memory reservation code also needed updating to conform with the newer percpu stuff; before it always chose 64KB, and that was always correct, but with 4KB granularity we now have to pay closer attention and reserve the amount of memory that will be requested when the percpu code starts allocating. Signed-off-by: Chris Metcalf