linux - Linux kernel source tree

Age	Commit message (Collapse)	Author
2013-06-07	klist: del waiter from klist_remove_waiters before wakeup waitting process	wang, biao
	commit ac5a2962b02f57dea76d314ef2521a2170b28ab6 upstream. There is a race between klist_remove and klist_release. klist_remove uses a local var waiter saved on stack. When klist_release calls wake_up_process(waiter->process) to wake up the waiter, waiter might run immediately and reuse the stack. Then, klist_release calls list_del(&waiter->list) to change previous wait data and cause prior waiter thread corrupt. The patch fixes it against kernel 3.9. Signed-off-by: wang, biao <biao.wang@intel.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-03-04	idr: fix a subtle bug in idr_get_next()	Tejun Heo
	commit 6cdae7416a1c45c2ce105a78187d9b7e8feb9e24 upstream. The iteration logic of idr_get_next() is borrowed mostly verbatim from idr_for_each(). It walks down the tree looking for the slot matching the current ID. If the matching slot is not found, the ID is incremented by the distance of single slot at the given level and repeats. The implementation assumes that during the whole iteration id is aligned to the layer boundaries of the level closest to the leaf, which is true for all iterations starting from zero or an existing element and thus is fine for idr_for_each(). However, idr_get_next() may be given any point and if the starting id hits in the middle of a non-existent layer, increment to the next layer will end up skipping the same offset into it. For example, an IDR with IDs filled between [64, 127] would look like the following. [ 0 64 ... ] /----/ \| \| \| NULL [ 64 ... 127 ] If idr_get_next() is called with 63 as the starting point, it will try to follow down the pointer from 0. As it is NULL, it will then try to proceed to the next slot in the same level by adding the slot distance at that level which is 64 - making the next try 127. It goes around the loop and finds and returns 127 skipping [64, 126]. Note that this bug also triggers in idr_for_each_entry() loop which deletes during iteration as deletions can make layers go away leaving the iteration with unaligned ID into missing layers. Fix it by ensuring proceeding to the next slot doesn't carry over the unaligned offset - ie. use round_up(id + 1, slot_distance) instead of id += slot_distance. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: David Teigland <teigland@redhat.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-10-31	genalloc: stop crashing the system when destroying a pool	Thadeu Lima de Souza Cascardo
	commit eedce141cd2dad8d0cefc5468ef41898949a7031 upstream. The genalloc code uses the bitmap API from include/linux/bitmap.h and lib/bitmap.c, which is based on long values. Both bitmap_set from lib/bitmap.c and bitmap_set_ll, which is the lockless version from genalloc.c, use BITMAP_LAST_WORD_MASK to set the first bits in a long in the bitmap. That one uses (1 << bits) - 1, 0b111, if you are setting the first three bits. This means that the API counts from the least significant bits (LSB from now on) to the MSB. The LSB in the first long is bit 0, then. The same works for the lookup functions. The genalloc code uses longs for the bitmap, as it should. In include/linux/genalloc.h, struct gen_pool_chunk has unsigned long bits[0] as its last member. When allocating the struct, genalloc should reserve enough space for the bitmap. This should be a proper number of longs that can fit the amount of bits in the bitmap. However, genalloc allocates an integer number of bytes that fit the amount of bits, but may not be an integer amount of longs. 9 bytes, for example, could be allocated for 70 bits. This is a problem in itself if the Least Significat Bit in a long is in the byte with the largest address, which happens in Big Endian machines. This means genalloc is not allocating the byte in which it will try to set or check for a bit. This may end up in memory corruption, where genalloc will try to set the bits it has not allocated. In fact, genalloc may not set these bits because it may find them already set, because they were not zeroed since they were not allocated. And that's what causes a BUG when gen_pool_destroy is called and check for any set bits. What really happens is that genalloc uses kmalloc_node with __GFP_ZERO on gen_pool_add_virt. With SLAB and SLUB, this means the whole slab will be cleared, not only the requested bytes. Since struct gen_pool_chunk has a size that is a multiple of 8, and slab sizes are multiples of 8, we get lucky and allocate and clear the right amount of bytes. Hower, this is not the case with SLOB or with older code that did memset after allocating instead of using __GFP_ZERO. So, a simple module as this (running 3.6.0), will cause a crash when rmmod'ed. [root@phantom-lp2 foo]# cat foo.c #include <linux/kernel.h> #include <linux/module.h> #include <linux/init.h> #include <linux/genalloc.h> MODULE_LICENSE("GPL"); MODULE_VERSION("0.1"); static struct gen_pool *foo_pool; static __init int foo_init(void) { int ret; foo_pool = gen_pool_create(10, -1); if (!foo_pool) return -ENOMEM; ret = gen_pool_add(foo_pool, 0xa0000000, 32 << 10, -1); if (ret) { gen_pool_destroy(foo_pool); return ret; } return 0; } static __exit void foo_exit(void) { gen_pool_destroy(foo_pool); } module_init(foo_init); module_exit(foo_exit); [root@phantom-lp2 foo]# zcat /proc/config.gz \| grep SLOB CONFIG_SLOB=y [root@phantom-lp2 foo]# insmod ./foo.ko [root@phantom-lp2 foo]# rmmod foo ------------[ cut here ]------------ kernel BUG at lib/genalloc.c:243! cpu 0x4: Vector: 700 (Program Check) at [c0000000bb0e7960] pc: c0000000003cb50c: .gen_pool_destroy+0xac/0x110 lr: c0000000003cb4fc: .gen_pool_destroy+0x9c/0x110 sp: c0000000bb0e7be0 msr: 8000000000029032 current = 0xc0000000bb0e0000 paca = 0xc000000006d30e00 softe: 0 irq_happened: 0x01 pid = 13044, comm = rmmod kernel BUG at lib/genalloc.c:243! [c0000000bb0e7ca0] d000000004b00020 .foo_exit+0x20/0x38 [foo] [c0000000bb0e7d20] c0000000000dff98 .SyS_delete_module+0x1a8/0x290 [c0000000bb0e7e30] c0000000000097d4 syscall_exit+0x0/0x94 --- Exception: c00 (System Call) at 000000800753d1a0 SP (fffd0b0e640) is in userspace Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Benjamin Gaignard <benjamin.gaignard@stericsson.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-10-13	lib/gcd.c: prevent possible div by 0	Davidlohr Bueso
	commit e96875677fb2b7cb739c5d7769824dff7260d31d upstream. Account for all properties when a and/or b are 0: gcd(0, 0) = 0 gcd(a, 0) = a gcd(0, b) = b Fixes no known problems in current kernels. Signed-off-by: Davidlohr Bueso <dave@gnu.org> Cc: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-06-17	btree: fix tree corruption in btree_get_prev()	Roland Dreier
	commit cbf8ae32f66a9ceb8907ad9e16663c2a29e48990 upstream. The memory the parameter __key points to is used as an iterator in btree_get_prev(), so if we save off a bkey() pointer in retry_key and then assign that to __key, we'll end up corrupting the btree internals when we do eg longcpy(__key, bkey(geo, node, i), geo->keylen); to return the key value. What we should do instead is use longcpy() to copy the key value that retry_key points to __key. This can cause a btree to get corrupted by seemingly read-only operations such as btree_for_each_safe. [akpm@linux-foundation.org: avoid the double longcpy()] Signed-off-by: Roland Dreier <roland@purestorage.com> Acked-by: Joern Engel <joern@logfs.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-04-02	uevent: send events in correct order according to seqnum (v3)	Andrew Vagin
	commit 7b60a18da393ed70db043a777fd9e6d5363077c4 upstream. The queue handling in the udev daemon assumes that the events are ordered. Before this patch uevent_seqnum is incremented under sequence_lock, than an event is send uner uevent_sock_mutex. I want to say that code contained a window between incrementing seqnum and sending an event. This patch locks uevent_sock_mutex before incrementing uevent_seqnum. v2: delete sequence_lock, uevent_seqnum is protected by uevent_sock_mutex v3: unlock the mutex before the goto exit Thanks for Kay for the comments. Signed-off-by: Andrew Vagin <avagin@openvz.org> Tested-By: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2011-11-11	netlink: validate NLA_MSECS length	Johannes Berg
	commit c30bc94758ae2a38a5eb31767c1985c0aae0950b upstream. L2TP for example uses NLA_MSECS like this: policy: [L2TP_ATTR_RECV_TIMEOUT] = { .type = NLA_MSECS, }, code: if (info->attrs[L2TP_ATTR_RECV_TIMEOUT]) cfg.reorder_timeout = nla_get_msecs(info->attrs[L2TP_ATTR_RECV_TIMEOUT]); As nla_get_msecs() is essentially nla_get_u64() plus the conversion to a HZ-based value, this will not properly reject attributes from userspace that aren't long enough and might overrun the message. Add NLA_MSECS to the attribute minlen array to check the size properly. Cc: Thomas Graf <tgraf@suug.ch> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-11-11	kobj_uevent: Ignore if some listeners cannot handle message	Milan Broz
	commit ebf4127cd677e9781b450e44dfaaa1cc595efcaa upstream. kobject_uevent() uses a multicast socket and should ignore if one of listeners cannot handle messages or nobody is listening at all. Easily reproducible when a process in system is cloned with CLONE_NEWNET flag. (See also http://article.gmane.org/gmane.linux.kernel.device-mapper.dm-crypt/5256) Signed-off-by: Milan Broz <mbroz@redhat.com> Acked-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-10-03	XZ: Fix incorrect XZ_BUF_ERROR	Lasse Collin
	commit 9c1f8594df4814ebfd6822ca3c9444fb3445888d upstream. xz_dec_run() could incorrectly return XZ_BUF_ERROR if all of the following was true: - The caller knows how many bytes of output to expect and only provides that much output space. - When the last output bytes are decoded, the caller-provided input buffer ends right before the LZMA2 end of payload marker. So LZMA2 won't provide more output anymore, but it won't know it yet and thus won't return XZ_STREAM_END yet. - A BCJ filter is in use and it hasn't left any unfiltered bytes in the temp buffer. This can happen with any BCJ filter, but in practice it's more likely with filters other than the x86 BCJ. This fixes <https://bugzilla.redhat.com/show_bug.cgi?id=735408> where Squashfs thinks that a valid file system is corrupt. This also fixes a similar bug in single-call mode where the uncompressed size of a block using BCJ + LZMA2 was 0 bytes and caller provided no output space. Many empty .xz files don't contain any blocks and thus don't trigger this bug. This also tweaks a closely related detail: xz_dec_bcj_run() could call xz_dec_lzma2_run() to decode into temp buffer when it was known to be useless. This was harmless although it wasted a minuscule number of CPU cycles. Signed-off-by: Lasse Collin <lasse.collin@tukaani.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-08-15	crypto: Move md5_transform to lib/md5.c	David S. Miller
	We are going to use this for TCP/IP sequence number and fragment ID generation. Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-08-04	XZ: Fix missing <linux/kernel.h> include	Lasse Collin
	commit 81d67439855a7f928d90965d832aa4f2fb677342 upstream. <linux/kernel.h> is needed for min_t. The old version happened to work on x86 because <asm/unaligned.h> indirectly includes <linux/kernel.h>, but it didn't work on ARM. <linux/kernel.h> includes <asm/byteorder.h> so it's not necessary to include it explicitly anymore. Signed-off-by: Lasse Collin <lasse.collin@tukaani.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-07-07	Merge branches 'core-urgent-for-linus', 'perf-urgent-for-linus' and ↵	Linus Torvalds
	'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: debugobjects: Fix boot crash when kmemleak and debugobjects enabled * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: jump_label: Fix jump_label update for modules oprofile, x86: Fix race in nmi handler while starting counters * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: sched: Disable (revert) SCHED_LOAD_SCALE increase sched, cgroups: Fix MIN_SHARES on 64-bit boxen
2011-06-20	debugobjects: Fix boot crash when kmemleak and debugobjects enabled	Marcin Slusarz
	Order of initialization look like this: ... debugobjects kmemleak ...(lots of other subsystems)... workqueues (through early initcall) ... debugobjects use schedule_work for batch freeing of its data and kmemleak heavily use debugobjects, so when it comes to freeing and workqueues were not initialized yet, kernel crashes: BUG: unable to handle kernel NULL pointer dereference at (null) IP: [<ffffffff810854d1>] __queue_work+0x29/0x41a [<ffffffff81085910>] queue_work_on+0x16/0x1d [<ffffffff81085abc>] queue_work+0x29/0x55 [<ffffffff81085afb>] schedule_work+0x13/0x15 [<ffffffff81242de1>] free_object+0x90/0x95 [<ffffffff81242f6d>] debug_check_no_obj_freed+0x187/0x1d3 [<ffffffff814b6504>] ? _raw_spin_unlock_irqrestore+0x30/0x4d [<ffffffff8110bd14>] ? free_object_rcu+0x68/0x6d [<ffffffff8110890c>] kmem_cache_free+0x64/0x12c [<ffffffff8110bd14>] free_object_rcu+0x68/0x6d [<ffffffff810b58bc>] __rcu_process_callbacks+0x1b6/0x2d9 ... because system_wq is NULL. Fix it by checking if workqueues susbystem was initialized before using. Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Tejun Heo <tj@kernel.org> Cc: Dipankar Sarma <dipankar@in.ibm.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: stable@kernel.org Link: http://lkml.kernel.org/r/20110528112342.GA3068@joi.lan Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-06-16	Merge branch 'for-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: AFS: Use i_generation not i_version for the vnode uniquifier AFS: Set s_id in the superblock to the volume name vfs: Fix data corruption after failed write in __block_write_begin() afs: afs_fill_page reads too much, or wrong data VFS: Fix vfsmount overput on simultaneous automount fix wrong iput on d_inode introduced by e6bc45d65d Delay struct net freeing while there's a sysfs instance refering to it afs: fix sget() races, close leak on umount ubifs: fix sget races ubifs: split allocation of ubifs_info into a separate function fix leak in proc_set_super()
2011-06-15	lib/bitmap.c: fix kernel-doc notation	Randy Dunlap
	Fix new kernel-doc warnings in lib/bitmap.c: Warning(lib/bitmap.c:596): No description found for parameter 'buf' Warning(lib/bitmap.c:596): Excess function parameter 'bp' description in '__bitmap_parselist' Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-06-12	Delay struct net freeing while there's a sysfs instance refering to it	Al Viro
	* new refcount in struct net, controlling actual freeing of the memory * new method in kobj_ns_type_operations (->drop_ns()) * ->current_ns() semantics change - it's supposed to be followed by corresponding ->drop_ns(). For struct net in case of CONFIG_NET_NS it bumps the new refcount; net_drop_ns() decrements it and calls net_free() if the last reference has been dropped. Method renamed to ->grab_current_ns(). * old net_free() callers call net_drop_ns() instead. * sysfs_exit_ns() is gone, along with a large part of callchain leading to it; now that the references stored in ->ns[...] stay valid we do not need to hunt them down and replace them with NULL. That fixes problems in sysfs_lookup() and sysfs_readdir(), along with getting rid of sb->s_instances abuse. Note that struct net shutdown logics has not changed - net_cleanup() is called exactly when it used to be called. The only thing postponed by having a sysfs instance refering to that struct net is actual freeing of memory occupied by struct net. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-06-09	Merge branch 'stable/xen-swiotlb.bugfix' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb-2.6 * 'stable/xen-swiotlb.bugfix' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb-2.6: swiotlb: Export swioltb_nr_tbl and utilize it as appropiate.
2011-06-09	vsprintf: Update %pI6c to not compress a single 0	Joe Perches
	RFC 5952 (http://tools.ietf.org/html/rfc5952) mandates that 2 or more consecutive 0's are required before using :: compression. Update ip6_compressed_string to match the RFC and update the http reference as well. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-06-06	swiotlb: Export swioltb_nr_tbl and utilize it as appropiate.	FUJITA Tomonori
	By default the io_tlb_nslabs is set to zero, and gets set to whatever value is passed in via swiotlb_init_with_tbl function. The default value passed in is 64MB. However, if the user provides the 'swiotlb=<nslabs>' the default value is ignored and the value provided by the user is used... Except when the SWIOTLB is used under Xen - there the default value of 64MB is used and the Xen-SWIOTLB has no mechanism to get the 'io_tlb_nslabs' filled out by setup_io_tlb_npages functions. This patch provides a function for the Xen-SWIOTLB to call to see if the io_tlb_nslabs is set and if so use that value. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-06-01	tile: enable CONFIG_BUGVERBOSE	Chris Metcalf
	Trivial config change to enable backtraces on panic. Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
2011-05-28	Merge branch 'rcu/urgent' of ↵	Ingo Molnar
	git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-2.6-rcu into core/urgent
2011-05-26	arch: remove CONFIG_GENERIC_FIND_{NEXT_BIT,BIT_LE,LAST_BIT}	Akinobu Mita
	By the previous style change, CONFIG_GENERIC_FIND_NEXT_BIT, CONFIG_GENERIC_FIND_BIT_LE, and CONFIG_GENERIC_FIND_LAST_BIT are not used to test for existence of find bitops anymore. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Acked-by: Greg Ungerer <gerg@uclinux.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Russell King <linux@arm.linux.org.uk> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-26	bitops: add #ifndef for each of find bitops	Akinobu Mita
	The style that we normally use in asm-generic is to test the macro itself for existence, so in asm-generic, do: #ifndef find_next_zero_bit_le extern unsigned long find_next_zero_bit_le(const void addr, unsigned long size, unsigned long offset); #endif and in the architectures, write static inline unsigned long find_next_zero_bit_le(const void addr, unsigned long size, unsigned long offset) #define find_next_zero_bit_le find_next_zero_bit_le This adds the #ifndef for each of the find bitops in the generic header and source files. Suggested-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Acked-by: Russell King <rmk+kernel@arm.linux.org.uk> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Greg Ungerer <gerg@uclinux.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-26	flex_array: avoid divisions when accessing elements	Jesse Gross
	On most architectures division is an expensive operation and accessing an element currently requires four of them. This performance penalty effectively precludes flex arrays from being used on any kind of fast path. However, two of these divisions can be handled at creation time and the others can be replaced by a reciprocal divide, completely avoiding real divisions on access. [eparis@redhat.com: rebase on top of changes to support 0 len elements] [eparis@redhat.com: initialize part_nr when array fits entirely in base] Signed-off-by: Jesse Gross <jesse@nicira.com> Signed-off-by: Eric Paris <eparis@redhat.com> Cc: Dave Hansen <dave@linux.vnet.ibm.com> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-26	rcu: Fix unpaired rcu_irq_enter() from locking selftests	Frederic Weisbecker
	HARDIRQ_ENTER() maps to irq_enter() which calls rcu_irq_enter(). But HARDIRQ_EXIT() maps to __irq_exit() which doesn't call rcu_irq_exit(). So for every locking selftest that simulates hardirq disabled, we create an imbalance in the rcu extended quiescent state internal state. As a result, after the first missing rcu_irq_exit(), subsequent irqs won't exit dyntick-idle mode after leaving the interrupt handler. This means that RCU won't see the affected CPU as being in an extended quiescent state, resulting in long grace-period delays (as in grace periods extending for hours). To fix this, just use __irq_enter() to simulate the hardirq context. This is sufficient for the locking selftests as we don't need to exit any extended quiescent state or perform any check that irqs normally do when they wake up from idle. As a side effect, this patch makes it possible to restore "rcu: Decrease memory-barrier usage based on semi-formal proof", which eventually helped finding this bug. Reported-and-tested-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stable <stable@kernel.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2011-05-25	Merge git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile	Linus Torvalds
	* git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile: (26 commits) arch/tile: prefer "tilepro" as the name of the 32-bit architecture compat: include aio_abi.h for aio_context_t arch/tile: cleanups for tilegx compat mode arch/tile: allocate PCI IRQs later in boot arch/tile: support signal "exception-trace" hook arch/tile: use better definitions of xchg() and cmpxchg() include/linux/compat.h: coding-style fixes tile: add an RTC driver for the Tilera hypervisor arch/tile: finish enabling support for TILE-Gx 64-bit chip compat: fixes to allow working with tile arch arch/tile: update defconfig file to something more useful tile: do_hardwall_trap: do not play with task->sighand tile: replace mm->cpu_vm_mask with mm_cpumask() tile,mn10300: add device parameter to dma_cache_sync() audit: support the "standard" <asm-generic/unistd.h> arch/tile: clarify flush_buffer()/finv_buffer() function names arch/tile: kernel-related cleanups from removing static page size arch/tile: various header improvements for building drivers arch/tile: disable GX prefetcher during cache flush arch/tile: tolerate disabling CONFIG_BLK_DEV_INITRD ...
2011-05-25	lib: consolidate DEBUG_STACK_USAGE option	Stephen Boyd
	Most arches define CONFIG_DEBUG_STACK_USAGE exactly the same way. Move it to lib/Kconfig.debug so each arch doesn't have to define it. This obviously makes the option generic, but that's fine because the config is already used in generic code. It's not obvious to me that sysrq-P actually does anything caution by keeping the most inclusive wording. Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> Cc: Chris Metcalf <cmetcalf@tilera.com> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Richard Weinberger <richard@nod.at> Acked-by: Mike Frysinger <vapier@gentoo.org> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Hirokazu Takata <takata@linux-m32r.org> Acked-by: Ralf Baechle <ralf@linux-mips.org> Cc: Paul Mackerras <paulus@samba.org> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Chen Liqin <liqin.chen@sunplusct.com> Cc: Lennox Wu <lennox.wu@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-25	lib/genalloc.c: add support for specifying the physical address	Jean-Christophe PLAGNIOL-VILLARD
	So we can specify the virtual address as the base of the pool chunk and then get physical addresses for hardware IP. For example on at91 we will use this on spi, uart or macb Signed-off-by: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com> Cc: Nicolas Ferre <nicolas.ferre@atmel.com> Cc: Patrice VILCHEZ <patrice.vilchez@atmel.com> Cc: Jes Sorensen <jes@wildopensource.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-25	lib: consolidate DEBUG_PER_CPU_MAPS	Stephen Boyd
	DEBUG_PER_CPU_MAPS is used in lib/cpumask.c as well as in inlcude/linux/cpumask.h and thus it has outgrown its use within x86 and powerpc alone. Any arch with SMP support may want to get some more debugging, so make this option generic. Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> Cc: <linux-arch@vger.kernel.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-25	lib: add kstrto*_from_user()	Alexey Dobriyan
	There is quite a lot of code which does copy_from_user() + strict_strto() or simple_strto() combo in slightly different ways. Before doing conversions all over tree, let's get final API correct. Enter kstrtoull_from_user() and friends. Typical code which uses them looks very simple: TYPE val; int rv; rv = kstrtoTYPE_from_user(buf, count, 0, &val); if (rv < 0) return rv; [use val] return count; There is a tiny semantic difference from the plain kstrto*() API -- the latter allows any amount of leading zeroes, while the former copies data into buffer on stack and thus allows leading zeroes as long as it fits into buffer. This shouldn't be a problem for typical usecase "echo 42 > /proc/x". The point is to make reading one integer from userspace _very_ simple and very bug free. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-25	lru_cache: use correct type in sizeof for allocation	Ilia Mirkin
	This has no actual effect, since sizeof(struct hlist_head) == sizeof(struct hlist_head ), but it's still the wrong type to use. The semantic match that finds this problem: // <smpl> @@ type T; identifier x; @@ T x; ... * x = kzalloc(... * sizeof(T) ..., ...); // </smpl> [akpm@linux-foundation.org: use kcalloc()] Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Acked-by: Lars Ellenberg <lars@linbit.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-25	lib/vsprintf.c: fix interaction of kasprintf() and vsnprintf() when using %pV	Jan Beulich
	Otherwise, the warning at the top of vsnprintf() gets triggered by kvasprintf()'s first invocation (with NULL buffer and zero size) of vsnprintf(). Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-25	bitmap, irq: add smp_affinity_list interface to /proc/irq	Mike Travis
	Manually adjusting the smp_affinity for IRQ's becomes unwieldy when the cpu count is large. Setting smp affinity to cpus 256 to 263 would be: echo 000000ff,00000000,00000000,00000000,00000000,00000000,00000000,00000000 > smp_affinity instead of: echo 256-263 > smp_affinity_list Think about what it looks like for cpus around say, 4088 to 4095. We already have many alternate "list" interfaces: /sys/devices/system/cpu/cpuX/indexY/shared_cpu_list /sys/devices/system/cpu/cpuX/topology/thread_siblings_list /sys/devices/system/cpu/cpuX/topology/core_siblings_list /sys/devices/system/node/nodeX/cpulist /sys/devices/pci*/*/local_cpulist Add a companion interface, smp_affinity_list to use cpu lists instead of cpu maps. This conforms to other companion interfaces where both a map and a list interface exists. This required adding a bitmap_parselist_user() function in a manner similar to the bitmap_parse_user() function. [akpm@linux-foundation.org: make __bitmap_parselist() static] Signed-off-by: Mike Travis <travis@sgi.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Jack Steiner <steiner@sgi.com> Cc: Lee Schermerhorn <lee.schermerhorn@hp.com> Cc: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-25	arch, mm: filter disallowed nodes from arch specific show_mem functions	David Rientjes
	Architectures that implement their own show_mem() function did not pass the filter argument to show_free_areas() to appropriately avoid emitting the state of nodes that are disallowed in the current context. This patch now passes the filter argument to show_free_areas() so those nodes are now avoided. This patch also removes the show_free_areas() wrapper around __show_free_areas() and converts existing callers to pass an empty filter. ia64 emits additional information for each node, so skip_free_areas_zone() must be made global to filter disallowed nodes and it is converted to use a nid argument rather than a zone for this use case. Signed-off-by: David Rientjes <rientjes@google.com> Cc: Russell King <linux@arm.linux.org.uk> Cc: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Kyle McMartin <kyle@mcmartin.ca> Cc: Helge Deller <deller@gmx.de> Cc: James Bottomley <jejb@parisc-linux.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-24	Merge branch 'master' of git://git.infradead.org/users/eparis/selinux into ↵	James Morris
	for-linus Conflicts: lib/flex_array.c security/selinux/avc.c security/selinux/hooks.c security/selinux/ss/policydb.c security/smack/smack_lsm.c Manually resolve conflicts. Signed-off-by: James Morris <jmorris@namei.org>
2011-05-23	Merge branch 'for-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (39 commits) b43: fix comment typo reqest -> request Haavard Skinnemoen has left Atmel cris: typo in mach-fs Makefile Kconfig: fix copy/paste-ism for dell-wmi-aio driver doc: timers-howto: fix a typo ("unsgined") perf: Only include annotate.h once in tools/perf/util/ui/browsers/annotate.c md, raid5: Fix spelling error in comment ('Ofcourse' --> 'Of course'). treewide: fix a few typos in comments regulator: change debug statement be consistent with the style of the rest Revert "arm: mach-u300/gpio: Fix mem_region resource size miscalculations" audit: acquire creds selectively to reduce atomic op overhead rtlwifi: don't touch with treewide double semicolon removal treewide: cleanup continuations and remove logging message whitespace ath9k_hw: don't touch with treewide double semicolon removal include/linux/leds-regulator.h: fix syntax in example code tty: fix typo in descripton of tty_termios_encode_baud_rate xtensa: remove obsolete BKL kernel option from defconfig m68k: fix comment typo 'occcured' arch:Kconfig.locks Remove unused config option. treewide: remove extra semicolons ...
2011-05-19	Merge branch 'core-rcu-for-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (78 commits) Revert "rcu: Decrease memory-barrier usage based on semi-formal proof" net,rcu: convert call_rcu(prl_entry_destroy_rcu) to kfree batman,rcu: convert call_rcu(softif_neigh_free_rcu) to kfree_rcu batman,rcu: convert call_rcu(neigh_node_free_rcu) to kfree() batman,rcu: convert call_rcu(gw_node_free_rcu) to kfree_rcu net,rcu: convert call_rcu(kfree_tid_tx) to kfree_rcu() net,rcu: convert call_rcu(xt_osf_finger_free_rcu) to kfree_rcu() net/mac80211,rcu: convert call_rcu(work_free_rcu) to kfree_rcu() net,rcu: convert call_rcu(wq_free_rcu) to kfree_rcu() net,rcu: convert call_rcu(phonet_device_rcu_free) to kfree_rcu() perf,rcu: convert call_rcu(swevent_hlist_release_rcu) to kfree_rcu() perf,rcu: convert call_rcu(free_ctx) to kfree_rcu() net,rcu: convert call_rcu(__nf_ct_ext_free_rcu) to kfree_rcu() net,rcu: convert call_rcu(net_generic_release) to kfree_rcu() net,rcu: convert call_rcu(netlbl_unlhsh_free_addr6) to kfree_rcu() net,rcu: convert call_rcu(netlbl_unlhsh_free_addr4) to kfree_rcu() security,rcu: convert call_rcu(sel_netif_free) to kfree_rcu() net,rcu: convert call_rcu(xps_dev_maps_release) to kfree_rcu() net,rcu: convert call_rcu(xps_map_release) to kfree_rcu() net,rcu: convert call_rcu(rps_map_release) to kfree_rcu() ...
2011-05-19	Merge branch 'core-locking-for-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: seqlock: Don't smp_rmb in seqlock reader spin loop watchdog, hung_task_timeout: Add Kconfig configurable default lockdep: Remove cmpxchg to update nr_chain_hlocks lockdep: Print a nicer description for simple irq lock inversions lockdep: Replace "Bad BFS generated tree" message with something less cryptic lockdep: Print a nicer description for irq inversion bugs lockdep: Print a nicer description for simple deadlocks lockdep: Print a nicer description for normal deadlocks lockdep: Print a nicer description for irq lock inversions
2011-05-19	Merge branch 'core-iommu-for-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-iommu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, gart: Rename pci-gart_64.c to amd_gart_64.c x86/amd-iommu: Use threaded interupt handler arch/x86/kernel/pci-iommu_table.c: Convert sprintf_symbol to %pS x86/amd-iommu: Add support for invalidate_all command x86/amd-iommu: Add extended feature detection x86/amd-iommu: Add ATS enable/disable code x86/amd-iommu: Add flag to indicate IOTLB support x86/amd-iommu: Flush device IOTLB if ATS is enabled x86/amd-iommu: Select PCI_IOV with AMD IOMMU driver PCI: Move ATS declarations in seperate header file dma-debug: print information about leaked entry x86/amd-iommu: Flush all internal TLBs when IOMMUs are enabled x86/amd-iommu: Rename iommu_flush_device x86/amd-iommu: Improve handling of full command buffer x86/amd-iommu: Rename iommu_flush* to domain_flush* x86/amd-iommu: Remove command buffer resetting logic x86/amd-iommu: Cleanup completion-wait handling x86/amd-iommu: Cleanup inv_pages command handling x86/amd-iommu: Move inv-dte command building to own function x86/amd-iommu: Move compl-wait command building to own function
2011-05-19	Merge branch 'for-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-2.6-cm * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-2.6-cm: kmemleak: Initialise kmemleak after debug_objects_mem_init() kmemleak: Select DEBUG_FS unconditionally in DEBUG_KMEMLEAK kmemleak: Do not return a pointer to an object that kmemleak did not get
2011-05-19	Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/upstream-linus	Linus Torvalds
	* 'upstream' of git://git.linux-mips.org/pub/scm/upstream-linus: (48 commits) MIPS: Move arch_get_unmapped_area and gang to new file. MIPS: Cleanup arch_get_unmapped_area MIPS: Octeon: Don't request interrupts for unused IPI mailbox bits. Octeon: Fix interrupt irq settings for performance counters. MIPS: Fix build warnings on defconfigs MIPS: Lemote 2F, Malta: Fix build warning MIPS: Set ELF AT_PLATFORM string for Loongson2 processors MIPS: Set ELF AT_PLATFORM string for BMIPS processors MIPS: Introduce set_elf_platform() helper function MIPS: JZ4740: setup: Autodetect physical memory. MIPS: BCM47xx: Fix MAC address parsing. MIPS: BCM47xx: Extend the filling of SPROM from NVRAM MIPS: BCM47xx: Register SSB fallback sprom callback MIPS: BCM47xx: Extend bcm47xx_fill_sprom with prefix. SSB: Change fallback sprom to callback mechanism. MIPS: Alchemy: Clean up GPIO registers and accessors MIPS: Alchemy: Cleanup DMA addresses MIPS: Alchemy: Rewrite ethernet platform setup MIPS: Alchemy: Rewrite UART setup and constants. MIPS: Alchemy: Convert dbdma.c to syscore_ops ...
2011-05-19	kmemleak: Select DEBUG_FS unconditionally in DEBUG_KMEMLEAK	Catalin Marinas
	In the past DEBUG_FS used to depend on SYSFS and DEBUG_KMEMLEAK selected it conditionally. This is no longer the case, so always select DEBUG_FS via DEBUG_KMEMLEAK. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2011-05-19	MIPS: Enable kmemleak for MIPS	Maxin John
	Signed-off-by: Maxin B. John <maxin.john@gmail.com> To: Catalin Marinas <catalin.marinas@arm.com> Cc: Daniel Baluta <dbaluta@ixiacom.com> Cc: naveen yadav <yad.naveen@gmail.com> Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org Cc: linux-mm@kvack.org Patchwork: https://patchwork.linux-mips.org/patch/2244/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2011-05-19	Add a strtobool function matching semantics of existing in kernel equivalents	Jonathan Cameron
	This is a rename of the usr_strtobool proposal, which was a renamed, relocated and fixed version of previous kstrtobool RFC Signed-off-by: Jonathan Cameron <jic23@cam.ac.uk> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2011-05-19	lib: Add generic binary search function to the kernel.	Tim Abbott
	There a large number hand-coded binary searches in the kernel (run "git grep search \| grep binary" to find many of them). Since in my experience, hand-coding binary searches can be error-prone, it seems worth cleaning this up by providing a generic binary search function. This generic binary search implementation comes from Ksplice. It has the same basic API as the C library bsearch() function. Ksplice uses it in half a dozen places with 4 different comparison functions, and I think our code is substantially cleaner because of this. Signed-off-by: Tim Abbott <tabbott@ksplice.com> Extra-bikeshedding-by: Alan Jenkins <alan-jenkins@tuffmail.co.uk> Extra-bikeshedding-by: André Goddard Rosa <andre.goddard@gmail.com> Extra-bikeshedding-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Alessio Igor Bogani <abogani@kernel.org> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2011-05-12	vsprintf: Turn kptr_restrict off by default	Ingo Molnar
	kptr_restrict has been triggering bugs in apps such as perf, and it also makes the system less useful by default, so turn it off by default. This is how we generally handle security features that remove functionality, such as firewall code or SELinux - they have to be configured and activated from user-space. Distributions can turn kptr_restrict on again via this line in /etc/sysctrl.conf: kernel.kptr_restrict = 1 ( Also mark the variable __read_mostly while at it, as it's typically modified only once per bootup, or not at all. ) Signed-off-by: Ingo Molnar <mingo@elte.hu> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-10	Merge branches 'dma-debug/next', 'amd-iommu/command-cleanups', ↵	Joerg Roedel
	'amd-iommu/ats' and 'amd-iommu/extended-features' into iommu/2.6.40 Conflicts: arch/x86/include/asm/amd_iommu_types.h arch/x86/kernel/amd_iommu.c arch/x86/kernel/amd_iommu_init.c
2011-05-05	rcu: Enable DEBUG_OBJECTS_RCU_HEAD from !PREEMPT	Mathieu Desnoyers
	The prohibition of DEBUG_OBJECTS_RCU_HEAD from !PREEMPT was due to the fixup actions. So just produce a warning from !PREEMPT. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2011-05-05	rcu: Remove conditional compilation for RCU CPU stall warnings	Paul E. McKenney
	The RCU CPU stall warnings can now be controlled using the rcu_cpu_stall_suppress boot-time parameter or via the same parameter from sysfs. There is therefore no longer any reason to have kernel config parameters for this feature. This commit therefore removes the RCU_CPU_STALL_DETECTOR and RCU_CPU_STALL_DETECTOR_RUNNABLE kernel config parameters. The RCU_CPU_STALL_TIMEOUT parameter remains to allow the timeout to be tuned and the RCU_CPU_STALL_VERBOSE parameter remains to allow task-stall information to be suppressed if desired. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2011-05-04	audit: support the "standard" <asm-generic/unistd.h>	Chris Metcalf
	Many of the syscalls mentioned in the audit code are not present for architectures that implement only the "standard" set of Linux syscalls (e.g. openat, but not open, etc.). This change adds proper #ifdefs for all those syscalls. Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>