aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2014-04-14selinux: correctly label /proc inodes in use before the policy is loadedPaul Moore
commit f64410ec665479d7b4b77b7519e814253ed0f686 upstream. This patch is based on an earlier patch by Eric Paris, he describes the problem below: "If an inode is accessed before policy load it will get placed on a list of inodes to be initialized after policy load. After policy load we call inode_doinit() which calls inode_doinit_with_dentry() on all inodes accessed before policy load. In the case of inodes in procfs that means we'll end up at the bottom where it does: /* Default to the fs superblock SID. */ isec->sid = sbsec->sid; if ((sbsec->flags & SE_SBPROC) && !S_ISLNK(inode->i_mode)) { if (opt_dentry) { isec->sclass = inode_mode_to_security_class(...) rc = selinux_proc_get_sid(opt_dentry, isec->sclass, &sid); if (rc) goto out_unlock; isec->sid = sid; } } Since opt_dentry is null, we'll never call selinux_proc_get_sid() and will leave the inode labeled with the label on the superblock. I believe a fix would be to mimic the behavior of xattrs. Look for an alias of the inode. If it can't be found, just leave the inode uninitialized (and pick it up later) if it can be found, we should be able to call selinux_proc_get_sid() ..." On a system exhibiting this problem, you will notice a lot of files in /proc with the generic "proc_t" type (at least the ones that were accessed early in the boot), for example: # ls -Z /proc/sys/kernel/shmmax | awk '{ print $4 " " $5 }' system_u:object_r:proc_t:s0 /proc/sys/kernel/shmmax However, with this patch in place we see the expected result: # ls -Z /proc/sys/kernel/shmmax | awk '{ print $4 " " $5 }' system_u:object_r:sysctl_kernel_t:s0 /proc/sys/kernel/shmmax Cc: Eric Paris <eparis@redhat.com> Signed-off-by: Paul Moore <pmoore@redhat.com> Acked-by: Eric Paris <eparis@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-04-14pps: Fix a use-after free bug when unregistering a source.George Spelvin
commit d953e0e837e65ecc1ddaa4f9560f7925878a0de6 upstream. Remove the cdev from the system (with cdev_del) *before* deallocating it (in pps_device_destruct, called via kobject_put from device_destroy). Also prevent deallocating a device with open file handles. A better long-term fix is probably to remove the cdev from the pps_device entirely, and instead have all devices reference one global cdev. Then the deallocation ordering becomes simpler. But that's more complex and invasive change, so we leave that for later. Signed-off-by: George Spelvin <linux@horizon.com> Acked-by: Rodolfo Giometti <giometti@enneenne.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Cc: Qiang Huang <h.huangqiang@huawei.com> Cc: Li Zefan <lizefan@huawei.com> Cc: Jianguo Wu <wujianguo@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-04-14pps: Use pps_lookup_dev to reduce ldisc couplingGeorge Spelvin
commit 03a7ffe4e542310838bac70ef85acc17536b6d7c upstream. Now that N_TTY uses tty->disc_data for its private data, 'subclass' ldiscs cannot use ->disc_data for their own private data. (This is a regression is v3.8-rc1) Use pps_lookup_dev to associate the tty with the pps source instead. This fixes a crashing regression in 3.8-rc1. Signed-off-by: George Spelvin <linux@horizon.com> Acked-by: Rodolfo Giometti <giometti@enneenne.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Cc: Qiang Huang <h.huangqiang@huawei.com> Cc: Li Zefan <lizefan@huawei.com> Cc: Jianguo Wu <wujianguo@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-04-14pps: Add pps_lookup_dev() functionGeorge Spelvin
commit 513b032c98b4b9414aa4e9b4a315cb1bf0380101 upstream. The PPS serial line discipline wants to attach a PPS device to a tty without changing the tty code to add a struct pps_device * pointer. Since the number of PPS devices in a typical system is generally very low (n=1 is by far the most common), it's practical to search the entire list of allocated pps devices. (We capture the timestamp before the lookup, so the timing isn't affected.) It is a bit ugly that this function, which is part of the in-kernel PPS API, has to be in pps.c as opposed to kapi,c, but that's not something that affects users. Signed-off-by: George Spelvin <linux@horizon.com> Acked-by: Rodolfo Giometti <giometti@enneenne.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Cc: Qiang Huang <h.huangqiang@huawei.com> Cc: Li Zefan <lizefan@huawei.com> Cc: Jianguo Wu <wujianguo@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-04-14idr: idr_for_each_entry() macroPhilipp Reisner
commit 9749f30f1a387070e6e8351f35aeb829eacc3ab6 upstream. Inspired by the list_for_each_entry() macro Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Cc: Qiang Huang <h.huangqiang@huawei.com> Cc: Li Zefan <lizefan@huawei.com> Cc: Jianguo Wu <wujianguo@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-04-14ipc, msg: fix message length check for negative valuesMathias Krause
commit 4e9b45a19241354daec281d7a785739829b52359 upstream. On 64 bit systems the test for negative message sizes is bogus as the size, which may be positive when evaluated as a long, will get truncated to an int when passed to load_msg(). So a long might very well contain a positive value but when truncated to an int it would become negative. That in combination with a small negative value of msg_ctlmax (which will be promoted to an unsigned type for the comparison against msgsz, making it a big positive value and therefore make it pass the check) will lead to two problems: 1/ The kmalloc() call in alloc_msg() will allocate a too small buffer as the addition of alen is effectively a subtraction. 2/ The copy_from_user() call in load_msg() will first overflow the buffer with userland data and then, when the userland access generates an access violation, the fixup handler copy_user_handle_tail() will try to fill the remainder with zeros -- roughly 4GB. That almost instantly results in a system crash or reset. ,-[ Reproducer (needs to be run as root) ]-- | #include <sys/stat.h> | #include <sys/msg.h> | #include <unistd.h> | #include <fcntl.h> | | int main(void) { | long msg = 1; | int fd; | | fd = open("/proc/sys/kernel/msgmax", O_WRONLY); | write(fd, "-1", 2); | close(fd); | | msgsnd(0, &msg, 0xfffffff0, IPC_NOWAIT); | | return 0; | } '--- Fix the issue by preventing msgsz from getting truncated by consistently using size_t for the message length. This way the size checks in do_msgsnd() could still be passed with a negative value for msg_ctlmax but we would fail on the buffer allocation in that case and error out. Also change the type of m_ts from int to size_t to avoid similar nastiness in other code paths -- it is used in similar constructs, i.e. signed vs. unsigned checks. It should never become negative under normal circumstances, though. Setting msg_ctlmax to a negative value is an odd configuration and should be prevented. As that might break existing userland, it will be handled in a separate commit so it could easily be reverted and reworked without reintroducing the above described bug. Hardening mechanisms for user copy operations would have catched that bug early -- e.g. checking slab object sizes on user copy operations as the usercopy feature of the PaX patch does. Or, for that matter, detect the long vs. int sign change due to truncation, as the size overflow plugin of the very same patch does. [akpm@linux-foundation.org: fix i386 min() warnings] Signed-off-by: Mathias Krause <minipli@googlemail.com> Cc: Pax Team <pageexec@freemail.hu> Cc: Davidlohr Bueso <davidlohr@hp.com> Cc: Brad Spengler <spender@grsecurity.net> Cc: Manfred Spraul <manfred@colorfullife.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> [bwh: Backported to 3.2: - Adjust context - Drop changes to alloc_msg() and copy_msg(), which don't exist] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Cc: Qiang Huang <h.huangqiang@huawei.com> Cc: Li Zefan <lizefan@huawei.com> Cc: Jianguo Wu <wujianguo@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-04-14compiler/gcc4: Add quirk for 'asm goto' miscompilation bugIngo Molnar
commit 3f0116c3238a96bc18ad4b4acefe4e7be32fa861 upstream. Fengguang Wu, Oleg Nesterov and Peter Zijlstra tracked down a kernel crash to a GCC bug: GCC miscompiles certain 'asm goto' constructs, as outlined here: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58670 Implement a workaround suggested by Jakub Jelinek. Reported-and-tested-by: Fengguang Wu <fengguang.wu@intel.com> Reported-by: Oleg Nesterov <oleg@redhat.com> Reported-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Suggested-by: Jakub Jelinek <jakub@redhat.com> Reviewed-by: Richard Henderson <rth@twiddle.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk> [hq: Backported to 3.4: Adjust context] Signed-off-by: Qiang Huang <h.huangqiang@huawei.com> Cc: Li Zefan <lizefan@huawei.com> Cc: Jianguo Wu <wujianguo@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-04-14compiler-gcc.h: Add gcc-recommended GCC_VERSION macroDaniel Santos
commit 3f3f8d2f48acfd8ed3b8e6b7377935da57b27b16 upstream. Throughout compiler*.h, many version checks are made. These can be simplified by using the macro that gcc's documentation recommends. However, my primary reason for adding this is that I need bug-check macros that are enabled at certain gcc versions and it's cleaner to use this macro than the tradition method: #if __GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ => 2) If you add patch level, it gets this ugly: #if __GNUC__ > 4 || (__GNUC__ == 4 && (__GNUC_MINOR__ > 2 || \ __GNUC_MINOR__ == 2 __GNUC_PATCHLEVEL__ >= 1)) As opposed to: #if GCC_VERSION >= 40201 While having separate headers for gcc 3 & 4 eliminates some of this verbosity, they can still be cleaned up by this. See also: http://gcc.gnu.org/onlinedocs/cpp/Common-Predefined-Macros.html Signed-off-by: Daniel Santos <daniel.santos@pobox.com> Acked-by: Borislav Petkov <bp@alien8.de> Acked-by: David Rientjes <rientjes@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Joe Perches <joe@perches.com> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Cc: Qiang Huang <h.huangqiang@huawei.com> Cc: Li Zefan <lizefan@huawei.com> Cc: Jianguo Wu <wujianguo@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-04-14workqueue: cond_resched() after processing each work itemTejun Heo
commit b22ce2785d97423846206cceec4efee0c4afd980 upstream. If !PREEMPT, a kworker running work items back to back can hog CPU. This becomes dangerous when a self-requeueing work item which is waiting for something to happen races against stop_machine. Such self-requeueing work item would requeue itself indefinitely hogging the kworker and CPU it's running on while stop_machine would wait for that CPU to enter stop_machine while preventing anything else from happening on all other CPUs. The two would deadlock. Jamie Liu reports that this deadlock scenario exists around scsi_requeue_run_queue() and libata port multiplier support, where one port may exclude command processing from other ports. With the right timing, scsi_requeue_run_queue() can end up requeueing itself trying to execute an IO which is asked to be retried while another device has an exclusive access, which in turn can't make forward progress due to stop_machine. Fix it by invoking cond_resched() after executing each work item. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Jamie Liu <jamieliu@google.com> References: http://thread.gmane.org/gmane.linux.kernel/1552567 [bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Cc: Qiang Huang <h.huangqiang@huawei.com> Cc: Li Zefan <lizefan@huawei.com> Cc: Jianguo Wu <wujianguo@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-04-14printk: Fix rq->lock vs logbuf_lock unlock lock inversionBu, Yitian
commit dbda92d16f8655044e082930e4e9d244b87fde77 upstream. commit 07354eb1a74d1 ("locking printk: Annotate logbuf_lock as raw") reintroduced a lock inversion problem which was fixed in commit 0b5e1c5255 ("printk: Release console_sem after logbuf_lock"). This happened probably when fixing up patch rejects. Restore the ordering and unlock logbuf_lock before releasing console_sem. Signed-off-by: ybu <ybu@qti.qualcomm.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/E807E903FE6CBE4D95E420FBFCC273B827413C@nasanexd01h.na.qualcomm.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de> [bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Cc: Qiang Huang <h.huangqiang@huawei.com> Cc: Li Zefan <lizefan@huawei.com> Cc: Jianguo Wu <wujianguo@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-04-14audit: wait_for_auditd() should use TASK_UNINTERRUPTIBLEOleg Nesterov
commit f000cfdde5de4fc15dead5ccf524359c07eadf2b upstream. audit_log_start() does wait_for_auditd() in a loop until audit_backlog_wait_time passes or audit_skb_queue has a room. If signal_pending() is true this becomes a busy-wait loop, schedule() in TASK_INTERRUPTIBLE won't block. Thanks to Guy for fully investigating and explaining the problem. (akpm: that'll cause the system to lock up on a non-preemptible uniprocessor kernel) (Guy: "Our customer was in fact running a uniprocessor machine, and they reported a system hang.") Signed-off-by: Oleg Nesterov <oleg@redhat.com> Reported-by: Guy Streeter <streeter@redhat.com> Cc: Eric Paris <eparis@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> [bwh: Backported to 3.2: adjust context, indentation] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Cc: Qiang Huang <h.huangqiang@huawei.com> Cc: Li Zefan <lizefan@huawei.com> Cc: Jianguo Wu <wujianguo@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-04-14idr: fix top layer handlingTejun Heo
commit 326cf0f0f308933c10236280a322031f0097205d upstream. Most functions in idr fail to deal with the high bits when the idr tree grows to the maximum height. * idr_get_empty_slot() stops growing idr tree once the depth reaches MAX_IDR_LEVEL - 1, which is one depth shallower than necessary to cover the whole range. The function doesn't even notice that it didn't grow the tree enough and ends up allocating the wrong ID given sufficiently high @starting_id. For example, on 64 bit, if the starting id is 0x7fffff01, idr_get_empty_slot() will grow the tree 5 layer deep, which only covers the 30 bits and then proceed to allocate as if the bit 30 wasn't specified. It ends up allocating 0x3fffff01 without the bit 30 but still returns 0x7fffff01. * __idr_remove_all() will not remove anything if the tree is fully grown. * idr_find() can't find anything if the tree is fully grown. * idr_for_each() and idr_get_next() can't iterate anything if the tree is fully grown. Fix it by introducing idr_max() which returns the maximum possible ID given the depth of tree and replacing the id limit checks in all affected places. As the idr_layer pointer array pa[] needs to be 1 larger than the maximum depth, enlarge pa[] arrays by one. While this plugs the discovered issues, the whole code base is horrible and in desparate need of rewrite. It's fragile like hell, Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> [bwh: Backported to 3.2: - Adjust context - s/MAX_IDR_LEVEL/MAX_LEVEL/; s/MAX_IDR_SHIFT/MAX_ID_SHIFT/ - Drop change to idr_alloc()] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Cc: Qiang Huang <h.huangqiang@huawei.com> Cc: Li Zefan <lizefan@huawei.com> Cc: Jianguo Wu <wujianguo@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-04-14proc: pid/status: show all supplementary groupsArtem Bityutskiy
commit 8d238027b87e654be552eabdf492042a34c5c300 upstream. We display a list of supplementary group for each process in /proc/<pid>/status. However, we show only the first 32 groups, not all of them. Although this is rare, but sometimes processes do have more than 32 supplementary groups, and this kernel limitation breaks user-space apps that rely on the group list in /proc/<pid>/status. Number 32 comes from the internal NGROUPS_SMALL macro which defines the length for the internal kernel "small" groups buffer. There is no apparent reason to limit to this value. This patch removes the 32 groups printing limit. The Linux kernel limits the amount of supplementary groups by NGROUPS_MAX, which is currently set to 65536. And this is the maximum count of groups we may possibly print. Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com> Acked-by: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> [bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Cc: Qiang Huang <h.huangqiang@huawei.com> Cc: Li Zefan <lizefan@huawei.com> Cc: Jianguo Wu <wujianguo@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-04-03Linux 3.4.86v3.4.86Greg Kroah-Hartman
2014-04-03netfilter: nf_conntrack_dccp: fix skb_header_pointer API usagesDaniel Borkmann
commit b22f5126a24b3b2f15448c3f2a254fc10cbc2b92 upstream. Some occurences in the netfilter tree use skb_header_pointer() in the following way ... struct dccp_hdr _dh, *dh; ... skb_header_pointer(skb, dataoff, sizeof(_dh), &dh); ... where dh itself is a pointer that is being passed as the copy buffer. Instead, we need to use &_dh as the forth argument so that we're copying the data into an actual buffer that sits on the stack. Currently, we probably could overwrite memory on the stack (e.g. with a possibly mal-formed DCCP packet), but unintentionally, as we only want the buffer to be placed into _dh variable. Fixes: 2bc780499aa3 ("[NETFILTER]: nf_conntrack: add DCCP protocol support") Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-04-03x86: fix boot on uniprocessor systemsArtem Fetishev
commit 825600c0f20e595daaa7a6dd8970f84fa2a2ee57 upstream. On x86 uniprocessor systems topology_physical_package_id() returns -1 which causes rapl_cpu_prepare() to leave rapl_pmu variable uninitialized which leads to GPF in rapl_pmu_init(). See arch/x86/kernel/cpu/perf_event_intel_rapl.c. It turns out that physical_package_id and core_id can actually be retreived for uniprocessor systems too. Enabling them also fixes rapl_pmu code. Signed-off-by: Artem Fetishev <artem_fetishev@epam.com> Cc: Stephane Eranian <eranian@google.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-04-03Input: synaptics - add manual min/max quirk for ThinkPad X240Hans de Goede
commit 8a0435d958fb36d93b8df610124a0e91e5675c82 upstream. This extends Benjamin Tissoires manual min/max quirk table with support for the ThinkPad X240. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-04-03Input: synaptics - add manual min/max quirkBenjamin Tissoires
commit 421e08c41fda1f0c2ff6af81a67b491389b653a5 upstream. The new Lenovo Haswell series (-40's) contains a new Synaptics touchpad. However, these new Synaptics devices report bad axis ranges. Under Windows, it is not a problem because the Windows driver uses RMI4 over SMBus to talk to the device. Under Linux, we are using the PS/2 fallback interface and it occurs the reported ranges are wrong. Of course, it would be too easy to have only one range for the whole series, each touchpad seems to be calibrated in a different way. We can not use SMBus to get the actual range because I suspect the firmware will switch into the SMBus mode and stop talking through PS/2 (this is the case for hybrid HID over I2C / PS/2 Synaptics touchpads). So as a temporary solution (until RMI4 land into upstream), start a new list of quirks with the min/max manually set. Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-04-03ext4: atomically set inode->i_flags in ext4_set_inode_flags()Theodore Ts'o
commit 00a1a053ebe5febcfc2ec498bd894f035ad2aa06 upstream. Use cmpxchg() to atomically set i_flags instead of clearing out the S_IMMUTABLE, S_APPEND, etc. flags and then setting them from the EXT4_IMMUTABLE_FL, EXT4_APPEND_FL flags, since this opens up a race where an immutable file has the immutable flag cleared for a brief window of time. Reported-by: John Sullivan <jsrhbz@kanargh.force9.co.uk> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-04-03staging: speakup: Prefix externally-visible symbolsSamuel Thibault
commit ca2beaf84d9678c12b17d92623f0e90829d6ca13 upstream. This prefixes all externally-visible symbols of speakup with "spk_". Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org> Cc: Kamal Mostafa <kamal@canonical.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-30Linux 3.4.85v3.4.85Greg Kroah-Hartman
2014-03-30ipc/msg: fix race around refcountKonstantin Khlebnikov
[fixed differently in 6062a8dc0517bce23e3c2f7d2fea5e22411269a3 upstream.] In older kernels (before v3.10) ipc_rcu_hdr->refcount was non-atomic int. There was possuble double-free bug: do_msgsnd() calls ipc_rcu_putref() under msq->q_perm->lock and RCU, while freequeue() calls it while it holds only 'rw_mutex', so there is no sinchronization between them. Two function decrements '2' non-atomically, they both can get '0' as result. do_msgsnd() freequeue() msq = msg_lock_check(ns, msqid); ... ipc_rcu_getref(msq); msg_unlock(msq); schedule(); (caller locks spinlock) expunge_all(msq, -EIDRM); ss_wakeup(&msq->q_senders, 1); msg_rmid(ns, msq); msg_unlock(msq); ipc_lock_by_ptr(&msq->q_perm); ipc_rcu_putref(msq); ipc_rcu_putref(msq); < both may get get --(...)->refcount == 0 > This patch locks ipc_lock and RCU around ipc_rcu_putref in freequeue. ( RCU protects memory for spin_unlock() ) Similar bugs might be in other users of ipc_rcu_putref(). In the mainline this has been fixed in v3.10 indirectly in commmit 6062a8dc0517bce23e3c2f7d2fea5e22411269a3 ("ipc,sem: fine grained locking for semtimedop") by Rik van Riel. That commit optimized locking and converted refcount into atomic. I'm not sure that anybody should care about this bug: it's very-very unlikely and no longer exists in actual mainline. I've found this just by looking into the code, probably this never happens in real life. Signed-off-by: Konstantin Khlebnikov <k.khlebnikov@samsung.com>
2014-03-30xhci: Fix resume issues on Renesas chips in Samsung laptopsSarah Sharp
commit 1aa9578c1a9450fb21501c4f549f5b1edb557e6d upstream. Don Zickus <dzickus@redhat.com> writes: Some co-workers of mine bought Samsung laptops that had mostly usb3 ports. Those ports did not resume correctly (the driver would timeout communicating and fail). This led to frustration as suspend/resume is a common use for laptops. Poking around, I applied the reset on resume quirk to this chipset and the resume started working. Reloading the xhci_hcd module had been the temporary workaround. Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com> Reported-by: Don Zickus <dzickus@redhat.com> Tested-by: Prarit Bhargava <prarit@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-30KVM: VMX: fix use after free of vmx->loaded_vmcsMarcelo Tosatti
commit 26a865f4aa8e66a6d94958de7656f7f1b03c6c56 upstream. After free_loaded_vmcs executes, the "loaded_vmcs" structure is kfreed, and now vmx->loaded_vmcs points to a kfreed area. Subsequent free_loaded_vmcs then attempts to manipulate vmx->loaded_vmcs. Switch the order to avoid the problem. https://bugzilla.redhat.com/show_bug.cgi?id=1047892 Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Cc: Josh Boyer <jwboyer@fedoraproject.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-30KVM: MMU: handle invalid root_hpa at __direct_mapMarcelo Tosatti
commit 989c6b34f6a9480e397b170cc62237e89bf4fdb9 upstream. It is possible for __direct_map to be called on invalid root_hpa (-1), two examples: 1) try_async_pf -> can_do_async_pf -> vmx_interrupt_allowed -> nested_vmx_vmexit 2) vmx_handle_exit -> vmx_interrupt_allowed -> nested_vmx_vmexit Then to load_vmcs12_host_state and kvm_mmu_reset_context. Check for this possibility, let fault exception be regenerated. BZ: https://bugzilla.redhat.com/show_bug.cgi?id=924916 Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Cc: Josh Boyer <jwboyer@fedoraproject.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-30Input: elantech - improve clickpad detectionHans de Goede
commit c15bdfd5b9831e4cab8cfc118243956e267dd30e upstream. The current assumption in the elantech driver that hw version 3 touchpads are never clickpads and hw version 4 touchpads are always clickpads is wrong. There are several bug reports for this, ie: https://bugzilla.redhat.com/show_bug.cgi?id=1030802 http://superuser.com/questions/619582/right-elantech-touchpad-button-not-working-in-linux I've spend a couple of hours wading through various bugzillas, launchpads and forum posts to create a list of fw-versions and capabilities for different laptop models to find a good method to differentiate between clickpads and versions with separate hardware buttons. Which shows that a device being a clickpad is reliable indicated by bit 12 being set in the fw_version. I've included the gathered list inside the driver, so that we've this info at hand if we need to revisit this later. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Cc: Josh Boyer <jwboyer@fedoraproject.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-30ARM: move outer_cache declaration out of ifdefRob Herring
commit 0b53c11d533a8f6688d73fad0baf67dd08ec1b90 upstream. Move the outer_cache declaration of the CONFIG_OUTER_CACHE ifdef so that outer_cache can be used inside IS_ENABLED condition. Signed-off-by: Rob Herring <rob.herring@calxeda.com> Cc: Russell King <linux@arm.linux.org.uk> Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-30i7300_edac: Fix device reference countJean Delvare
commit 75135da0d68419ef8a925f4c1d5f63d8046e314d upstream. pci_get_device() decrements the reference count of "from" (last argument) so when we break off the loop successfully we have only one device reference - and we don't know which device we have. If we want a reference to each device, we must take them explicitly and let the pci_get_device() walk complete to avoid duplicate references. This is serious, as over-putting device references will cause the device to eventually disappear. Without this fix, the kernel crashes after a few insmod/rmmod cycles. Tested on an Intel S7000FC4UR system with a 7300 chipset. Signed-off-by: Jean Delvare <jdelvare@suse.de> Link: http://lkml.kernel.org/r/20140224111656.09bbb7ed@endymion.delvare Cc: Mauro Carvalho Chehab <m.chehab@samsung.com> Cc: Doug Thompson <dougthompson@xmission.com> Cc: stable@vger.kernel.org Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-30p54: clamp properly instead of just truncatingDan Carpenter
commit 608cfbe4abaf76e9d732efd7ed1cfa3998163d91 upstream. The call to clamp_t() first truncates the variable signed 8 bit and as a result, the actual clamp is a no-op. Fixes: 0d78156eef1d ('p54: improve site survey') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-30deb-pkg: Fix cross-building linux-headers packageBen Hutchings
commit f8ce239dfc7ba9add41d9ecdc5e7810738f839fa upstream. builddeb generates a control file that says the linux-headers package can only be built for the build system primary architecture. This breaks cross-building configurations. We should use $debarch for this instead. Since $debarch is not yet set when generating the control file, set Architecture: any and use control file variables to fill in the description. Fixes: cd8d60a20a45 ('kbuild: create linux-headers package in deb-pkg') Reported-and-tested-by: "Niew, Sh." <shniew@gmail.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Michal Marek <mmarek@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-30x86: bpf_jit: support negative offsetsAlexei Starovoitov
commit fdfaf64e75397567257e1051931f9a3377360665 upstream. Commit a998d4342337 claimed to introduce negative offset support to x86 jit, but it couldn't be working, since at the time of the execution of LD+ABS or LD+IND instructions via call into bpf_internal_load_pointer_neg_helper() the %edx (3rd argument of this func) had junk value instead of access size in bytes (1 or 2 or 4). Store size into %edx instead of %ecx (what original commit intended to do) Fixes: a998d4342337 ("bpf jit: Let the x86 jit handle negative offsets") Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Cc: Jan Seiffert <kaffeemonster@googlemail.com> Cc: Eric Dumazet <edumazet@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-30iwlwifi: Complete backport of "iwlwifi: always copy first 16 bytes of commands"Ben Hutchings
Linux 3.4.83 included an incomplete backport of commit 8a964f44e01ad3bbc208c3e80d931ba91b9ea786 ('iwlwifi: always copy first 16 bytes of commands') which causes a regression for this driver. This is the missing piece. Reported-by: Andreas Sturmlechner <andreas.sturmlechner@gmail.com> Cc: Johannes Berg <johannes.berg@intel.com> Cc: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Cc: Jianguo Wu <wujianguo@huawei.com> Cc: Andres Bertens <abertensu@yahoo.com> Tested-by: Andreas Sturmlechner <andreas.sturmlechner@gmail.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2014-03-30libceph: resend all writes after the osdmap loses the full flagJosh Durgin
commit 9a1ea2dbff11547a8e664f143c1ffefc586a577a upstream. With the current full handling, there is a race between osds and clients getting the first map marked full. If the osd wins, it will return -ENOSPC to any writes, but the client may already have writes in flight. This results in the client getting the error and propagating it up the stack. For rbd, the block layer turns this into EIO, which can cause corruption in filesystems above it. To avoid this race, osds are being changed to drop writes that came from clients with an osdmap older than the last osdmap marked full. In order for this to work, clients must resend all writes after they encounter a full -> not full transition in the osdmap. osds will wait for an updated map instead of processing a request from a client with a newer map, so resent writes will not be dropped by the osd unless there is another not full -> full transition. This approach requires both osds and clients to be fixed to avoid the race. Old clients talking to osds with this fix may hang instead of returning EIO and potentially corrupting an fs. New clients talking to old osds have the same behavior as before if they encounter this race. Fixes: http://tracker.ceph.com/issues/6938 Reviewed-by: Sage Weil <sage@inktank.com> Signed-off-by: Josh Durgin <josh.durgin@inktank.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-30ALSA: compress: Pass through return value of open ops callbackCharles Keepax
commit 749d32237bf39e6576dd95bfdf24e4378e51716c upstream. The snd_compr_open function would always return 0 even if the compressed ops open function failed, obviously this is incorrect. Looks like this was introduced by a small typo in: commit a0830dbd4e42b38aefdf3fb61ba5019a1a99ea85 ALSA: Add a reference counter to card instance This patch returns the value from the compressed op as it should. Signed-off-by: Charles Keepax <ckeepax@opensource.wolfsonmicro.com> Acked-by: Vinod Koul <vinod.koul@intel.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-23Linux 3.4.84v3.4.84Greg Kroah-Hartman
2014-03-23jiffies: Avoid undefined behavior from signed overflowPaul E. McKenney
commit 5a581b367b5df0531265311fc681c2abd377e5e6 upstream. According to the C standard 3.4.3p3, overflow of a signed integer results in undefined behavior. This commit therefore changes the definitions of time_after(), time_after_eq(), time_after64(), and time_after_eq64() to avoid this undefined behavior. The trick is that the subtraction is done using unsigned arithmetic, which according to 6.2.5p9 cannot overflow because it is defined as modulo arithmetic. This has the added (though admittedly quite small) benefit of shortening four lines of code by four characters each. Note that the C standard considers the cast from unsigned to signed to be implementation-defined, see 6.3.1.3p3. However, on a two's-complement system, an implementation that defines anything other than a reinterpretation of the bits is free to come to me, and I will be happy to act as a witness for its being committed to an insane asylum. (Although I have nothing against saturating arithmetic or signals in some cases, these things really should not be the default when compiling an operating-system kernel.) Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: John Stultz <john.stultz@linaro.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Kevin Easton <kevin@guarana.org> [ paulmck: Included time_after64() and time_after_eq64(), as suggested by Eric Dumazet, also fixed commit message.] Reviewed-by: Josh Triplett <josh@joshtriplett.org> Ruchi Kandoi <kandoiruchi@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-23ALSA: oxygen: modify adjust_dg_dac_routing functionRoman Volkov
commit 1f91ecc14deea9461aca93273d78871ec4d98fcd upstream. When selecting the audio output destinations (headphones, FP headphones, multichannel output), the channel routing should be changed depending on what destination selected. Also unnecessary I2S channels are digitally muted. This function called when the user selects the destination in the ALSA mixer. Signed-off-by: Roman Volkov <v1ron@mail.ru> Signed-off-by: Clemens Ladisch <clemens@ladisch.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-23Btrfs: fix data corruption when reading/updating compressed extentsFilipe David Borba Manana
commit a2aa75e18a21b21952dc6daa9bac7c9f4426f81f upstream. When using a mix of compressed file extents and prealloc extents, it is possible to fill a page of a file with random, garbage data from some unrelated previous use of the page, instead of a sequence of zeroes. A simple sequence of steps to get into such case, taken from the test case I made for xfstests, is: _scratch_mkfs _scratch_mount "-o compress-force=lzo" $XFS_IO_PROG -f -c "pwrite -S 0x06 -b 18670 266978 18670" $SCRATCH_MNT/foobar $XFS_IO_PROG -c "falloc 26450 665194" $SCRATCH_MNT/foobar $XFS_IO_PROG -c "truncate 542872" $SCRATCH_MNT/foobar $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/foobar This results in the following file items in the fs tree: item 4 key (257 INODE_ITEM 0) itemoff 15879 itemsize 160 inode generation 6 transid 6 size 542872 block group 0 mode 100600 item 5 key (257 INODE_REF 256) itemoff 15863 itemsize 16 inode ref index 2 namelen 6 name: foobar item 6 key (257 EXTENT_DATA 0) itemoff 15810 itemsize 53 extent data disk byte 0 nr 0 gen 6 extent data offset 0 nr 24576 ram 266240 extent compression 0 item 7 key (257 EXTENT_DATA 24576) itemoff 15757 itemsize 53 prealloc data disk byte 12849152 nr 241664 gen 6 prealloc data offset 0 nr 241664 item 8 key (257 EXTENT_DATA 266240) itemoff 15704 itemsize 53 extent data disk byte 12845056 nr 4096 gen 6 extent data offset 0 nr 20480 ram 20480 extent compression 2 item 9 key (257 EXTENT_DATA 286720) itemoff 15651 itemsize 53 prealloc data disk byte 13090816 nr 405504 gen 6 prealloc data offset 0 nr 258048 The on disk extent at offset 266240 (which corresponds to 1 single disk block), contains 5 compressed chunks of file data. Each of the first 4 compress 4096 bytes of file data, while the last one only compresses 3024 bytes of file data. Therefore a read into the file region [285648 ; 286720[ (length = 4096 - 3024 = 1072 bytes) should always return zeroes (our next extent is a prealloc one). The solution here is the compression code path to zero the remaining (untouched) bytes of the last page it uncompressed data into, as the information about how much space the file data consumes in the last page is not known in the upper layer fs/btrfs/extent_io.c:__do_readpage(). In __do_readpage we were correctly zeroing the remainder of the page but only if it corresponds to the last page of the inode and if the inode's size is not a multiple of the page size. This would cause not only returning random data on reads, but also permanently storing random data when updating parts of the region that should be zeroed. For the example above, it means updating a single byte in the region [285648 ; 286720[ would store that byte correctly but also store random data on disk. A test case for xfstests follows soon. Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com> Signed-off-by: Chris Mason <clm@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-23SCSI: storvsc: NULL pointer dereference fixAles Novak
commit b12bb60d6c350b348a4e1460cd68f97ccae9822e upstream. If the initialization of storvsc fails, the storvsc_device_destroy() causes NULL pointer dereference. storvsc_bus_scan() scsi_scan_target() __scsi_scan_target() scsi_probe_and_add_lun(hostdata=NULL) scsi_alloc_sdev(hostdata=NULL) sdev->hostdata = hostdata now the host allocation fails __scsi_remove_device(sdev) calls sdev->host->hostt->slave_destroy() == storvsc_device_destroy(sdev) access of sdev->hostdata->request_mempool Signed-off-by: Ales Novak <alnovak@suse.cz> Signed-off-by: Thomas Abraham <tabraham@suse.com> Reviewed-by: Jiri Kosina <jkosina@suse.cz> Acked-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-23SCSI: qla2xxx: Poll during initialization for ISP25xx and ISP83xxGiridhar Malavali
commit b77ed25c9f8402e8b3e49e220edb4ef09ecfbb53 upstream. Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com> Signed-off-by: Saurav Kashyap <saurav.kashyap@qlogic.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-23SCSI: isci: correct erroneous for_each_isci_host macroLukasz Dorau
commit c59053a23d586675c25d789a7494adfdc02fba57 upstream. In the first place, the loop 'for' in the macro 'for_each_isci_host' (drivers/scsi/isci/host.h:314) is incorrect, because it accesses the 3rd element of 2 element array. After the 2nd iteration it executes the instruction: ihost = to_pci_info(pdev)->hosts[2] (while the size of the 'hosts' array equals 2) and reads an out of range element. In the second place, this loop is incorrectly optimized by GCC v4.8 (see http://marc.info/?l=linux-kernel&m=138998871911336&w=2). As a result, on platforms with two SCU controllers, the loop is executed more times than it can be (for i=0,1 and 2). It causes kernel panic during entering the S3 state and the following oops after 'rmmod isci': BUG: unable to handle kernel NULL pointer dereference at (null) IP: [<ffffffff8131360b>] __list_add+0x1b/0xc0 Oops: 0000 [#1] SMP RIP: 0010:[<ffffffff8131360b>] [<ffffffff8131360b>] __list_add+0x1b/0xc0 Call Trace: [<ffffffff81661b84>] __mutex_lock_slowpath+0x114/0x1b0 [<ffffffff81661c3f>] mutex_lock+0x1f/0x30 [<ffffffffa03e97cb>] sas_disable_events+0x1b/0x50 [libsas] [<ffffffffa03e9818>] sas_unregister_ha+0x18/0x60 [libsas] [<ffffffffa040316e>] isci_unregister+0x1e/0x40 [isci] [<ffffffffa0403efd>] isci_pci_remove+0x5d/0x100 [isci] [<ffffffff813391cb>] pci_device_remove+0x3b/0xb0 [<ffffffff813fbf7f>] __device_release_driver+0x7f/0xf0 [<ffffffff813fc8f8>] driver_detach+0xa8/0xb0 [<ffffffff813fbb8b>] bus_remove_driver+0x9b/0x120 [<ffffffff813fcf2c>] driver_unregister+0x2c/0x50 [<ffffffff813381f3>] pci_unregister_driver+0x23/0x80 [<ffffffffa04152f8>] isci_exit+0x10/0x1e [isci] [<ffffffff810d199b>] SyS_delete_module+0x16b/0x2d0 [<ffffffff81012a21>] ? do_notify_resume+0x61/0xa0 [<ffffffff8166ce29>] system_call_fastpath+0x16/0x1b The loop has been corrected. This patch fixes kernel panic during entering the S3 state and the above oops. Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com> Reviewed-by: Maciej Patelczyk <maciej.patelczyk@intel.com> Tested-by: Lukasz Dorau <lukasz.dorau@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-23SCSI: isci: fix reset timeout handlingDan Williams
commit ddfadd7736b677de2d4ca2cd5b4b655368c85a7a upstream. Remove an erroneous BUG_ON() in the case of a hard reset timeout. The reset timeout handler puts the port into the "awaiting link-up" state. The timeout causes the device to be disconnected and we need to be in the awaiting link-up state to re-connect the port. The BUG_ON() made the incorrect assumption that resets never timeout and we always complete the reset in the "resetting" state. Testing this patch also uncovered that libata continues to attempt to reset the port long after the driver has torn down the context. Once the driver has committed to abandoning the link it must indicate to libata that recovery ends by returning -ENODEV from ->lldd_I_T_nexus_reset(). Acked-by: Lukasz Dorau <lukasz.dorau@intel.com> Reported-by: David Milburn <dmilburn@redhat.com> Reported-by: Xun Ni <xun.ni@intel.com> Tested-by: Xun Ni <xun.ni@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-23can: flexcan: flexcan_open(): fix error path if flexcan_chip_start() failsMarc Kleine-Budde
commit 7e9e148af01ef388efb6e2490805970be4622792 upstream. If flexcan_chip_start() in flexcan_open() fails, the interrupt is not freed, this patch adds the missing cleanup. Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-23vmxnet3: fix building without CONFIG_PCI_MSIArnd Bergmann
commit 0a8d8c446b5429d15ff2d48f46e00d8a08552303 upstream. Since commit d25f06ea466e "vmxnet3: fix netpoll race condition", the vmxnet3 driver fails to build when CONFIG_PCI_MSI is disabled, because it unconditionally references the vmxnet3_msix_rx() function. To fix this, use the same #ifdef in the caller that exists around the function definition. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Neil Horman <nhorman@tuxdriver.com> Cc: Shreyas Bhatewara <sbhatewara@vmware.com> Cc: "VMware, Inc." <pv-drivers@vmware.com> Cc: "David S. Miller" <davem@davemloft.net> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-23vmxnet3: fix netpoll race conditionNeil Horman
commit d25f06ea466ea521b563b76661180b4e44714ae6 upstream. vmxnet3's netpoll driver is incorrectly coded. It directly calls vmxnet3_do_poll, which is the driver internal napi poll routine. As the netpoll controller method doesn't block real napi polls in any way, there is a potential for race conditions in which the netpoll controller method and the napi poll method run concurrently. The result is data corruption causing panics such as this one recently observed: PID: 1371 TASK: ffff88023762caa0 CPU: 1 COMMAND: "rs:main Q:Reg" #0 [ffff88023abd5780] machine_kexec at ffffffff81038f3b #1 [ffff88023abd57e0] crash_kexec at ffffffff810c5d92 #2 [ffff88023abd58b0] oops_end at ffffffff8152b570 #3 [ffff88023abd58e0] die at ffffffff81010e0b #4 [ffff88023abd5910] do_trap at ffffffff8152add4 #5 [ffff88023abd5970] do_invalid_op at ffffffff8100cf95 #6 [ffff88023abd5a10] invalid_op at ffffffff8100bf9b [exception RIP: vmxnet3_rq_rx_complete+1968] RIP: ffffffffa00f1e80 RSP: ffff88023abd5ac8 RFLAGS: 00010086 RAX: 0000000000000000 RBX: ffff88023b5dcee0 RCX: 00000000000000c0 RDX: 0000000000000000 RSI: 00000000000005f2 RDI: ffff88023b5dcee0 RBP: ffff88023abd5b48 R8: 0000000000000000 R9: ffff88023a3b6048 R10: 0000000000000000 R11: 0000000000000002 R12: ffff8802398d4cd8 R13: ffff88023af35140 R14: ffff88023b60c890 R15: 0000000000000000 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 #7 [ffff88023abd5b50] vmxnet3_do_poll at ffffffffa00f204a [vmxnet3] #8 [ffff88023abd5b80] vmxnet3_netpoll at ffffffffa00f209c [vmxnet3] #9 [ffff88023abd5ba0] netpoll_poll_dev at ffffffff81472bb7 The fix is to do as other drivers do, and have the poll controller call the top half interrupt handler, wh