2013-10-05  dm snapshot: workaround for a false positive lockdep warning  (Mikulas Patocka)
commit 5ea330a75bd86b2b2a01d7b85c516983238306fb upstream. The kernel reports a lockdep warning if a snapshot is invalidated because it runs out of space. The lockdep warning was triggered by commit 0976dfc1d0cd80a4e9dfaf87bd87 ("workqueue: Catch more locking problems with flush_work()") in v3.5. The warning is a false positive. The real cause of the warning is that the lockdep engine treats different instances of md->lock as a single lock. This patch is a workaround - we use flush_workqueue instead of flush_work. This code path is not performance sensitive (it is called only on initialization or invalidation), thus it doesn't matter that we flush the whole workqueue. The real fix for the problem would be to teach the lockdep engine to treat different instances of md->lock as separate locks. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Acked-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
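A minimal sketch of the workaround pattern described above (the workqueue, work item and handler names are illustrative, not the dm code):

    #include <linux/workqueue.h>

    static struct workqueue_struct *snap_wq;     /* created with alloc_workqueue() elsewhere */
    static struct work_struct invalidate_work;   /* only ever queued on snap_wq */

    static void wait_for_invalidation(void)
    {
            /*
             * flush_work(&invalidate_work) trips the false-positive lockdep
             * report described above; flushing the whole queue avoids it and
             * is acceptable because this path runs only on initialization or
             * invalidation.
             */
            flush_workqueue(snap_wq);
    }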
2013-10-05  driver core: Fix use after free of dev->parent in device_shutdown  (Benson Leung)
commit f123db8e9d6c84c863cb3c44d17e61995dc984fb upstream. The put_device(dev) at the bottom of the loop of device_shutdown may result in the dev being cleaned up. In device_create_release, the dev is kfreed. However, device_shutdown attempts to use the dev pointer again after put_device by referring to dev->parent. Copy the parent pointer instead to avoid this condition. This bug was found on Chromium OS's chromeos-3.8, which is based on v3.8.11. See bug report : https://code.google.com/p/chromium/issues/detail?id=297842 This can easily be reproduced when shutting down with hidraw devices that report battery condition. Two examples are the HP Bluetooth Mouse X4000b and the Apple Magic Mouse. For example, with the magic mouse : The dev in question is "hidraw0" dev->parent is "magicmouse" In the course of the shutdown for this device, the input event cleanup calls a put on hidraw0, decrementing its reference count. When we finally get to put_device(dev) in device_shutdown, kobject_cleanup is called and device_create_release does kfree(dev). dev->parent is no longer valid, and we may crash in put_device(dev->parent). This change should be applied on any kernel with this change : d1c6c030fcec6f860d9bb6c632a3ebe62e28440b Signed-off-by: Benson Leung <bleung@chromium.org> Reviewed-by: Ming Lei <ming.lei@canonical.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
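The shape of the fix, sketched as a standalone helper (simplified from the device_shutdown() loop described above; the real code also handles bus-level shutdown callbacks):

    #include <linux/device.h>

    static void shutdown_one_device(struct device *dev)
    {
            /*
             * Copy (and pin) the parent before the final put_device(dev):
             * that put may free dev, so dev->parent must not be read afterwards.
             */
            struct device *parent = get_device(dev->parent);

            if (dev->driver && dev->driver->shutdown)
                    dev->driver->shutdown(dev);

            put_device(dev);
            put_device(parent);     /* both calls tolerate NULL */
    }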
2013-10-05  usb/core/devio.c: Don't reject control message to endpoint with wrong direction bit  (Kurt Garloff)
commit 831abf76643555a99b80a3b54adfa7e4fa0a3259 upstream. Trying to read data from the Pegasus Technologies NoteTaker (0e20:0101) [1] with the Windows App (EasyNote) works natively but fails when Windows is running under KVM (and the USB device handed to KVM). The reason is a USB control message
  usb 4-2.2: control urb: bRequestType=22 bRequest=09 wValue=0200 wIndex=0001 wLength=0008
This goes to endpoint address 0x01 (wIndex); however, endpoint address 0x01 does not exist. There is an endpoint 0x81 though (same number, but other direction); the app may have meant that endpoint instead. The kernel thus rejects the IO and we see the failure. Apparently, Linux is more strict here than Windows ... we can't change the Win app easily, so that's a problem. It seems that the Win app/driver is buggy here and the driver does not behave fully according to the USB HID class spec that it claims to belong to. The device seems to happily deal with that though (and seems to not really care about this value much). So the question is whether the Linux kernel should filter here. Rejecting has the risk that somewhat non-compliant userspace apps/drivers (most likely in a virtual machine) are prevented from working. Not rejecting has the risk of confusing an overly sensitive device with such a transfer. Given the fact that Windows does not filter, this risk is rather small though. The patch makes the kernel more tolerant: If the endpoint address in wIndex does not exist, but an endpoint with toggled direction bit does, it will let the transfer through. (It does NOT change the message.) With the attached patch, the app in Windows in KVM works.
  usb 4-2.2: check_ctrlrecip: process 13073 (qemu-kvm) requesting ep 01 but needs 81
I suspect this will mostly affect apps in virtual environments, as on Linux the apps would have been adapted to the stricter handling of the kernel. I have done that for mine [2]. [1] http://www.pegatech.com/ [2] https://sourceforge.net/projects/notetakerpen/ Signed-off-by: Kurt Garloff <kurt@garloff.de> Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
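A rough sketch of the relaxed check (ep_present() is a hypothetical stand-in for devio.c's internal endpoint lookup):

    #include <linux/usb.h>
    #include <linux/errno.h>

    static bool ep_present(struct usb_device *udev, unsigned int ep);  /* hypothetical */

    static int check_ctrlrecip_sketch(struct usb_device *udev, unsigned int ep)
    {
            if (ep_present(udev, ep))
                    return 0;
            /* tolerate a wrong direction bit: same endpoint number, other direction */
            if (ep_present(udev, ep ^ USB_DIR_IN)) {
                    dev_info(&udev->dev, "requesting ep %02x but needs %02x\n",
                             ep, ep ^ USB_DIR_IN);
                    return 0;       /* let the transfer through unchanged */
            }
            return -ENOENT;
    }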
2013-10-05  usb: dwc3: add support for Merrifield  (David Cohen)
commit 85601f8cf67c56a561a6dd5e130e65fdc179047d upstream. Add PCI id for Intel Merrifield Signed-off-by: David Cohen <david.a.cohen@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-05  usb: dwc3: pci: add support for BayTrail  (Heikki Krogerus)
commit b62cd96de3161dfb125a769030eec35a4cab3d3a upstream. Add PCI id for Intel BayTrail. Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Signed-off-by: Felipe Balbi <balbi@ti.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-05  fsl/usb: Resolve PHY_CLK_VLD instability issue for ULPI phy  (Ramneek Mehresh)
commit ad1260e9fbf768d6bed227d9604ebee76a84aae3 upstream. For controller versions greater than 1.6, setting the ULPI_PHY_CLK_SEL bit when the USB_EN bit is already set causes instability issues with the PHY_CLK_VLD bit. So USB_EN is set only for IP controller versions below 1.6 before setting the ULPI_PHY_CLK_SEL bit. Signed-off-by: Ramneek Mehresh <ramneek.mehresh@freescale.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-05  USB: Fix breakage in ffs_fs_mount()  (Al Viro)
commit 2606b28aabd7dea1766c23a105e1124c95409c96 upstream. There's a bunch of failure exits in ffs_fs_mount() with seriously broken recovery logics. Most of that appears to stem from misunderstanding of the ->kill_sb() semantics; unlike ->put_super() it is called for *all* superblocks of given type, no matter how (in)complete the setup had been. ->put_super() is called only if ->s_root is not NULL; any failure prior to setting ->s_root will have the call of ->put_super() skipped. ->kill_sb(), OTOH, awaits every superblock that has come from sget(). Current behaviour of ffs_fs_mount(): We have struct ffs_sb_fill_data data on stack there. We do ffs_dev = functionfs_acquire_dev_callback(dev_name); and store that in data.private_data. Then we call mount_nodev(), passing it ffs_sb_fill() as a callback. That will either fail outright, or manage to call ffs_sb_fill(). There we allocate an instance of struct ffs_data, slap the value of ffs_dev (picked from data.private_data) into ffs->private_data and overwrite data.private_data by storing ffs into an overlapping member (data.ffs_data). Then we store ffs into sb->s_fs_info and attempt to set the rest of the things up (root inode, root dentry, then create /ep0 there). Any of those might fail. Should that happen, we get ffs_fs_kill_sb() called before mount_nodev() returns. If mount_nodev() fails for any reason whatsoever, we proceed to functionfs_release_dev_callback(data.ffs_data); That's broken in a lot of ways. Suppose the thing has failed in allocation of e.g. root inode or dentry. We have functionfs_release_dev_callback(ffs); ffs_data_put(ffs); done by ffs_fs_kill_sb() (ffs accessed via sb->s_fs_info), followed by functionfs_release_dev_callback(ffs); from ffs_fs_mount() (via data.ffs_data). Note that the second functionfs_release_dev_callback() has every chance to be done to freed memory. Suppose we fail *before* root inode allocation. What happens then? ffs_fs_kill_sb() doesn't do anything to ffs (it's either not called at all, or it doesn't have a pointer to ffs stored in sb->s_fs_info). And functionfs_release_dev_callback(data.ffs_data); is called by ffs_fs_mount(), but here we are in nasal daemon country - we are reading from a member of union we'd never stored into. In practice, we'll get what we used to store into the overlapping field, i.e. ffs_dev. And then we get screwed, since we treat it (struct gfs_ffs_obj * in disguise, returned by functionfs_acquire_dev_callback()) as struct ffs_data *, pick what would've been ffs_data ->private_data from it (*well* past the actual end of the struct gfs_ffs_obj - struct ffs_data is much bigger) and poke in whatever it points to. FWIW, there's a minor leak on top of all that in case if ffs_sb_fill() fails on kstrdup() - ffs is obviously forgotten. The thing is, there is no point in playing all those games with union. Just allocate and initialize ffs_data *before* calling mount_nodev() and pass a pointer to it via data.ffs_data. And once it's stored in sb->s_fs_info, clear data.ffs_data, so that ffs_fs_mount() knows that it doesn't need to kill the sucker manually - from that point on we'll have it done by ->kill_sb(). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Michal Nazarewicz <mina86@mina86.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
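A condensed sketch of the corrected flow (error handling trimmed; ffs_data_new() is an assumed allocation helper, and ffs_sb_fill() is assumed to clear data.ffs_data once it has stored ffs in sb->s_fs_info):

    #include <linux/fs.h>
    #include <linux/err.h>

    static struct dentry *ffs_fs_mount_sketch(struct file_system_type *t, int flags,
                                              const char *dev_name, void *opts)
    {
            struct ffs_sb_fill_data data = { };
            struct ffs_data *ffs;
            struct dentry *rv;

            ffs = ffs_data_new();                   /* allocate *before* mount_nodev() */
            if (!ffs)
                    return ERR_PTR(-ENOMEM);
            ffs->private_data = functionfs_acquire_dev_callback(dev_name);

            data.ffs_data = ffs;                    /* plain pointer, no overlapping union */
            rv = mount_nodev(t, flags, &data, ffs_sb_fill);
            if (IS_ERR(rv) && data.ffs_data) {
                    /* ffs_sb_fill() never took ownership, so ->kill_sb() won't clean up */
                    functionfs_release_dev_callback(data.ffs_data);
                    ffs_data_put(data.ffs_data);
            }
            return rv;
    }

Once ffs sits in sb->s_fs_info, every later failure is ffs_fs_kill_sb()'s problem, which removes both the double release and the read through the never-written union member.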
2013-10-05  USB: UHCI: accept very late isochronous URBs  (Alan Stern)
commit bef073b067a7b1874a6b381e0035bb0516d71a77 upstream. Commit 24f531371de1 (USB: EHCI: accept very late isochronous URBs) changed the isochronous API provided by ehci-hcd. URBs submitted too late, so that the time slots for all their packets have already expired, are no longer rejected outright. Instead the submission is accepted, and the URB completes normally with a -EXDEV error for each packet. This is what client drivers expect. This patch implements the same policy in uhci-hcd. It should be applied to all kernels containing commit c44b225077bb (UHCI: implement new semantics for URB_ISO_ASAP). Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-05  USB: OHCI: accept very late isochronous URBs  (Alan Stern)
commit a8693424c751b8247ee19bd8b857f1d4f432b972 upstream. Commit 24f531371de1 (USB: EHCI: accept very late isochronous URBs) changed the isochronous API provided by ehci-hcd. URBs submitted too late, so that the time slots for all their packets have already expired, are no longer rejected outright. Instead the submission is accepted, and the URB completes normally with a -EXDEV error for each packet. This is what client drivers expect. This patch implements the same policy in ohci-hcd. The change is more complicated than it was in ehci-hcd, because ohci-hcd doesn't scan for isochronous completions in the same way as ehci-hcd does. Rather, it depends on the hardware adding completed TDs to a "done queue". Some OHCI controllers don't handle this properly when a TD's time slot has already expired, so we have to avoid adding such TDs to the schedule in the first place. As a result, if the URB was submitted too late then none of its TDs will get put on the schedule, so none of them will end up on the done queue, so the driver will never realize that the URB should be completed. To solve this problem, the patch adds one to urb_priv->td_cnt for such URBs, making it larger than urb_priv->length (td_cnt already gets set to the number of TD's that had to be skipped because their slots have expired). Each time an URB is given back, the finish_urb() routine looks to see if urb_priv->td_cnt for the next URB on the same endpoint is marked in this way. If so, it gives back the next URB right away. This should be applied to all kernels containing commit 815fa7b91761 (USB: OHCI: fix logic for scheduling isochronous URBs). Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-05  xhci: Fix race between ep halt and URB cancellation  (Florian Wolter)
commit 526867c3ca0caa2e3e846cb993b0f961c33c2abb upstream. The halted state of an endpoint cannot be cleared over CLEAR_HALT from a user process, because the stopped_td variable was overwritten in the handle_stopped_endpoint() function. So the xhci_endpoint_reset() function will refuse the reset and communication with the device cannot run over this endpoint. https://bugzilla.kernel.org/show_bug.cgi?id=60699 Signed-off-by: Florian Wolter <wolly84@web.de> Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com> Cc: Jonghwan Choi <jhbird.choi@samsung.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-05  USB: fix PM config symbol in uhci-hcd, ehci-hcd, and xhci-hcd  (Alan Stern)
commit f875fdbf344b9fde207f66b392c40845dd7e5aa6 upstream. Since uhci-hcd, ehci-hcd, and xhci-hcd support runtime PM, the .pm field in their pci_driver structures should be protected by CONFIG_PM rather than CONFIG_PM_SLEEP. The corresponding change has already been made for ohci-hcd. Without this change, controllers won't do runtime suspend if system suspend or hibernation isn't enabled. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> CC: Sarah Sharp <sarah.a.sharp@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-05  xhci: Fix oops happening after address device timeout  (Mathias Nyman)
commit 284d20552461466b04d6bfeafeb1c47a8891b591 upstream. When a command times out, the command ring is first aborted, and then stopped. If the command ring is empty when it is stopped, the stop event will point to the next command, which is not yet set. xHCI tries to handle this next event, often causing an oops. Don't handle command completion events on a stopped command ring if the ring is empty. This patch should be backported to kernels as old as 3.7 that contain the commit b92cc66c047ff7cf587b318fe377061a353c120f "xHCI: add aborting command ring function" Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Reported-by: Giovanni <giovanni.nervi@yahoo.com> Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-05  xhci: Ensure a command structure points to the correct trb on the command ring  (Mathias Nyman)
commit ec7e43e2d98173483866fe2e4e690143626b659c upstream. If a command on the command ring needs to be cancelled before it is handled, it can be turned into a no-op operation when the ring is stopped. We want to store the command ring enqueue pointer in the command structure when the command is enqueued, for the cancellation case. Some commands used to store the command ring dequeue pointer instead of the enqueue pointer (these often worked because enqueue happens to equal dequeue quite often). Other commands correctly used the enqueue pointer but did not check if it pointed to a valid trb or a link trb; this caused, for example, the stop endpoint command to time out in xhci_stop_device() in about 2% of suspend/resume cases. This should also solve some weird behavior happening in command cancellation cases. This patch is based on a patch submitted by Sarah Sharp to linux-usb, but then forgotten: http://marc.info/?l=linux-usb&m=136269803207465&w=2 This patch should be backported to kernels as old as 3.7, that contain the commit b92cc66c047ff7cf587b318fe377061a353c120f "xHCI: add aborting command ring function" Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-05  drm/i915/tv: clear adjusted_mode.flags  (Daniel Vetter)
commit 1062b81598bc00e2f6620e6f3788f8f8df2f01e7 upstream. The native TV encoder has its own flags to adjust sync modes and enabled interlaced modes which are totally irrelevant for the adjusted mode. This worked out nicely since the input modes used by both the load detect code and reported in the ->get_modes callbacks all have no flags set, and we also don't fill out any of them in the ->get_config callback. This changed with the additional sanitation done with
  commit 2960bc9cceecb5d556ce1c07656a6609e2f7e8b0
  Author: Imre Deak <imre.deak@intel.com>
  Date:   Tue Jul 30 13:36:32 2013 +0300
      drm/i915: make user mode sync polarity setting explicit
since now the "no flags at all" state wouldn't fit through core code any more. So fix this up again by explicitly clearing the flags in the ->compute_config callback. Aside: We have zero checking in place to make sure that the requested mode is indeed the right input mode we want for the selected TV mode. So we'll happily fall over if userspace tries to pull us. But that's definitely work for a different patch series. So just add a FIXME comment for now. Reported-by: Knut Petersen <Knut_Petersen@t-online.de> Cc: Knut Petersen <Knut_Petersen@t-online.de> Cc: Imre Deak <imre.deak@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Tested-by: Knut Petersen <Knut_Petersen@t-online.de> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-05  staging: vt6656: [BUG] iwctl_siwencodeext return if device not open  (Malcolm Priestley)
commit 5e8c3d3e41b0bf241e830a1ee0752405adecc050 upstream. Don't allow entry to iwctl_siwencodeext if device not open. This fixes a race condition where wpa supplicant/network manager enters the function when the device is already closed. Signed-off-by: Malcolm Priestley <tvboxspy@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-05  staging: vt6656: [BUG] main_usb.c oops on device_close move flag earlier.  (Malcolm Priestley)
commit e3eb270fab7734427dd8171a93e4946fe28674bc upstream. The vt6656 is prone to resetting on the usb bus. It seems there is a race condition and wpa supplicant is trying to open the device via iw_handlers before it's actually closed, at a stage where the buffers are being removed. The device is no longer considered open when the buffers are being removed. So move the ~DEVICE_FLAGS_OPENED flag clear to before freeing the device buffers. Signed-off-by: Malcolm Priestley <tvboxspy@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-05  ARM: 7837/3: fix Thumb-2 bug in AES assembler code  (Ard Biesheuvel)
commit 40190c85f427dcfdbab5dbef4ffd2510d649da1f upstream. Patch 638591c enabled building the AES assembler code in Thumb2 mode. However, this code used arithmetic involving PC rather than adr{l} instructions to generate PC-relative references to the lookup tables, and this needs to take into account the different PC offset when running in Thumb mode. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-05  serial: pch_uart: fix tty-kref leak in dma-rx path  (Johan Hovold)
commit 19b85cfb190eb9980eaf416bff96aef4159a430e upstream. Fix tty-kref leak when tty_buffer_request_room fails in the dma-rx path. Note that the tty ref isn't really needed anymore, but as the leak has always been there, fixing it before removing it should make it easier to backport the fix. Signed-off-by: Johan Hovold <jhovold@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-05  serial: pch_uart: fix tty-kref leak in rx-error path  (Johan Hovold)
commit fc0919c68cb2f75bb1af759315f9d7e2a9443c28 upstream. Fix tty-kref leak introduced by commit 384e301e ("pch_uart: fix a deadlock when pch_uart as console") which never put its tty reference. Signed-off-by: Johan Hovold <jhovold@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-05  serial: tegra: fix tty-kref leak  (Johan Hovold)
commit cfd29aa0e81b791985e8428e6507e80e074e6730 upstream. Fix potential tty-kref leak in stop_rx path. Signed-off-by: Johan Hovold <jhovold@gmail.com> Tested-by: Stephen Warren <swarren@nvidia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-05  tty: Fix SIGTTOU not sent with tcflush()  (Peter Hurley)
commit 5cec7bf699c61d14f0538345076480bb8c8ebfbb upstream. Commit e7f3880cd9b98c5bf9391ae7acdec82b75403776 ("tty: Fix recursive deadlock in tty_perform_flush()") introduced a regression where tcflush() does not generate SIGTTOU for background process groups. Make sure ioctl(TCFLSH) calls tty_check_change() when invoked from the line discipline. Reported-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
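The shape of the fix as a sketch (do_flush_sketch() is a hypothetical flush helper; the point is the tty_check_change() call on the line-discipline path):

    #include <linux/tty.h>

    static int do_flush_sketch(struct tty_struct *tty, unsigned long arg);  /* hypothetical */

    static int ldisc_tcflsh_sketch(struct tty_struct *tty, unsigned long arg)
    {
            /* sends SIGTTOU to a background process group and bails out */
            int retval = tty_check_change(tty);

            if (retval)
                    return retval;
            return do_flush_sketch(tty, arg);
    }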
2013-10-05  mei: cancel stall timers in mei_reset  (Alexander Usyskin)
commit 4a704575cc1afb3b848f096778fa9b8d7b3d5813 upstream. Unset init_clients_timer and amthif_stall_timers in mei_reset in order to cancel timer ticking and hence avoid recursive reset calls. Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com> Signed-off-by: Tomas Winkler <tomas.winkler@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-05  mei: bus: stop wait for read during cl state transition  (Tomas Winkler)
commit e2b31644e999e8bfe3efce880fb32840299abf41 upstream. The bus layer omitted a check for a client state transition while waiting for read completion. The client state transition may occur, for example, as a result of a firmware-initiated reset. Add a mei_cl_is_transitioning wrapper to reduce the code repetition. Signed-off-by: Tomas Winkler <tomas.winkler@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-05  mei: make me client counters less error prone  (Tomas Winkler)
commit 1aee351a739153529fbb98ee461777b2abd5e1c9 upstream.
  1. u8 counters are prone to hard to detect overflow: make them unsigned long to match bit_ functions argument type
  2. don't check me_clients_num for negativity, it is unsigned.
  3. init all the me client counters from one place
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-05  x86, efi: Don't map Boot Services on i386  (Josh Boyer)
commit 700870119f49084da004ab588ea2b799689efaf7 upstream. Add patch to fix 32bit EFI service mapping (rhbz 726701) Multiple people are reporting hitting the following WARNING on i386,
  WARNING: at arch/x86/mm/ioremap.c:102 __ioremap_caller+0x3d3/0x440()
  Modules linked in:
  Pid: 0, comm: swapper Not tainted 3.9.0-rc7+ #95
  Call Trace:
   [<c102b6af>] warn_slowpath_common+0x5f/0x80
   [<c1023fb3>] ? __ioremap_caller+0x3d3/0x440
   [<c1023fb3>] ? __ioremap_caller+0x3d3/0x440
   [<c102b6ed>] warn_slowpath_null+0x1d/0x20
   [<c1023fb3>] __ioremap_caller+0x3d3/0x440
   [<c106007b>] ? get_usage_chars+0xfb/0x110
   [<c102d937>] ? vprintk_emit+0x147/0x480
   [<c1418593>] ? efi_enter_virtual_mode+0x1e4/0x3de
   [<c102406a>] ioremap_cache+0x1a/0x20
   [<c1418593>] ? efi_enter_virtual_mode+0x1e4/0x3de
   [<c1418593>] efi_enter_virtual_mode+0x1e4/0x3de
   [<c1407984>] start_kernel+0x286/0x2f4
   [<c1407535>] ? repair_env_string+0x51/0x51
   [<c1407362>] i386_start_kernel+0x12c/0x12f
Due to the workaround described in commit 916f676f8 ("x86, efi: Retain boot service code until after switching to virtual mode") EFI Boot Service regions are mapped for a period during boot. Unfortunately, with the limited size of the i386 direct kernel map it's possible that some of the Boot Service regions will not be directly accessible, which causes them to be ioremap()'d, triggering the above warning as the regions are marked as E820_RAM in the e820 memmap. There are currently only two situations where we need to map EFI Boot Service regions,
  1. To workaround the firmware bug described in 916f676f8
  2. To access the ACPI BGRT image
but since we haven't seen an i386 implementation that requires either, this simple fix should suffice for now. [ Added to changelog - Matt ] Reported-by: Bryan O'Donoghue <bryan.odonoghue.lkml@nexus-software.ie> Acked-by: Tom Zanussi <tom.zanussi@intel.com> Acked-by: Darren Hart <dvhart@linux.intel.com> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Matthew Garrett <mjg59@srcf.ucam.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Josh Boyer <jwboyer@redhat.com> Signed-off-by: Matt Fleming <matt.fleming@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-05  tools lib lk: Uninclude linux/magic.h in debugfs.c  (Vinson Lee)
commit ce7eebe5c3deef8e19c177c24ee75843256e69ca upstream. The compilation only looks for linux/magic.h from the default include paths, which do not include the source tree. This results in a build error if linux/magic.h is not available or not installed. For example, this build error occurs on CentOS 5.
  $ make -C tools/lib/lk V=1
  [...]
  gcc -o debugfs.o -c -ggdb3 -Wall -Wextra -std=gnu99 -Werror -O6 -D_FORTIFY_SOURCE=2 -Wbad-function-cast -Wdeclaration-after-statement -Wformat-security -Wformat-y2k -Winit-self -Wmissing-declarations -Wmissing-prototypes -Wnested-externs -Wno-system-headers -Wold-style-definition -Wpacked -Wredundant-decls -Wshadow -Wstrict-aliasing=3 -Wstrict-prototypes -Wswitch-default -Wswitch-enum -Wundef -Wwrite-strings -Wformat -fPIC -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64 debugfs.c
  debugfs.c:8:25: error: linux/magic.h: No such file or directory
The only symbol from linux/magic.h needed by debugfs.c is DEBUGFS_MAGIC, and that is already defined in debugfs.h. linux/magic.h isn't providing any extra symbols and can be unincluded. This is similar to the approach taken by perf, which has its own magic.h wrapper at tools/perf/util/include/linux/magic.h Signed-off-by: Vinson Lee <vlee@twitter.com> Acked-by: Borislav Petkov <bp@suse.de> Cc: Borislav Petkov <bp@suse.de> Cc: Vinson Lee <vlee@freedesktop.org> Link: http://lkml.kernel.org/r/1379546200-17028-1-git-send-email-vlee@freedesktop.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
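For reference, the constant in question is small enough to carry locally; a guarded fallback of this shape (not necessarily the wrapper's exact contents) keeps the file independent of linux/magic.h:

    /* debugfs superblock magic, "dbg " in ASCII; normally provided by linux/magic.h */
    #ifndef DEBUGFS_MAGIC
    #define DEBUGFS_MAGIC 0x64626720
    #endif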
2013-10-05  x86/reboot: Add quirk to make Dell C6100 use reboot=pci automatically  (Masoud Sharbiani)
commit 4f0acd31c31f03ba42494c8baf6c0465150e2621 upstream. Dell PowerEdge C6100 machines fail to completely reboot about 20% of the time. Signed-off-by: Masoud Sharbiani <msharbiani@twitter.com> Signed-off-by: Vinson Lee <vlee@twitter.com> Cc: Robin Holt <holt@sgi.com> Cc: Russell King <rmk+kernel@arm.linux.org.uk> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Link: http://lkml.kernel.org/r/1379717947-18042-1-git-send-email-vlee@freedesktop.org Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-05  bcache: Fix flushes in writeback mode  (Kent Overstreet)
commit c0f04d88e46d14de51f4baebb6efafb7d59e9f96 upstream. In writeback mode, when we get a cache flush we need to make sure we issue a flush to the backing device. The code for sending down an extra flush was wrong - by cloning the bio we were probably getting flags that didn't make sense for a bare flush, and also the old code was firing for FUA bios, for which we don't need to send a flush to the backing device. This was causing data corruption somehow - the mechanism was never determined, but this patch fixes it for the users that were seeing it. Signed-off-by: Kent Overstreet <kmo@daterainc.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-05  bcache: Fix for handling overlapping extents when reading in a btree node  (Kent Overstreet)
commit 84786438ed17978d72eeced580ab757e4da8830b upstream. btree_sort_fixup() was overly clever, because it was trying to avoid pulling a key off the btree iterator in more than one place. This led to a really obscure bug where we'd break early from the loop in btree_sort_fixup() if the current key overlapped with keys in more than one older set, and the next key it overlapped with was zero size. Signed-off-by: Kent Overstreet <kmo@daterainc.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-05  bcache: Fix a shrinker deadlock  (Kent Overstreet)
commit a698e08c82dfb9771e0bac12c7337c706d729b6d upstream. GFP_NOIO means we could be getting called recursively - mca_alloc() -> mca_data_alloc() - definitely can't use mutex_lock(bucket_lock) then. Whoops. Signed-off-by: Kent Overstreet <kmo@daterainc.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-05  bcache: Fix a dumb CPU spinning bug in writeback  (Kent Overstreet)
commit 79e3dab90d9f826ceca67c7890e048ac9169de49 upstream. schedule_timeout() != schedule_timeout_uninterruptible() Signed-off-by: Kent Overstreet <kmo@daterainc.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-05  bcache: Fix a flush/fua performance bug  (Kent Overstreet)
commit 1394d6761b6e9e15ee7c632a6d48791188727b40 upstream. bch_journal_meta() was missing the flush to make the journal write actually go down (instead of waiting up to journal_delay_ms)... Whoops Signed-off-by: Kent Overstreet <kmo@daterainc.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-05  bcache: Fix a writeback performance regression  (Kent Overstreet)
commit c2a4f3183a1248f615a695fbd8905da55ad11bba upstream. Background writeback works by scanning the btree for dirty data and adding those keys into a fixed size buffer, then for each dirty key in the keybuf writing it to the backing device. When read_dirty() finishes and it's time to scan for more dirty data, we need to wait for the outstanding writeback IO to finish - they still take up slots in the keybuf (so that foreground writes can check for them to avoid races) - without that wait, we'll continually rescan when we'll be able to add at most a key or two to the keybuf, and that takes locks that starves foreground IO. Doh. Signed-off-by: Kent Overstreet <kmo@daterainc.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-05  bcache: Fix for when no journal entries are found  (Kent Overstreet)
commit c426c4fd46f709ade2bddd51c5738729c7ae1db5 upstream. The journal replay code didn't handle this case, causing it to go into an infinite loop... Signed-off-by: Kent Overstreet <kmo@daterainc.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-05  bcache: Strip endline when writing the label through sysfs  (Gabriel de Perthuis)
commit aee6f1cfff3ce240eb4b43b41ca466b907acbd2e upstream. sysfs attributes with unusual characters have crappy failure modes in Squeeze (udev 164); later versions of udev are unaffected. This should make these characters more unusual. Signed-off-by: Gabriel de Perthuis <g2p.code@gmail.com> Signed-off-by: Kent Overstreet <kmo@daterainc.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-05  bcache: Fix a dumb journal discard bug  (Kent Overstreet)
commit 6d9d21e35fbfa2934339e96934f862d118abac23 upstream. That switch statement was obviously wrong, leading to some sort of weird spinning on rare occasion with discards enabled... Signed-off-by: Kent Overstreet <kmo@daterainc.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-05  sysv: Add forgotten superblock lock init for v7 fs  (Lubomir Rintel)
commit 49475555848d396a0c78fb2f8ecceb3f3f263ef1 upstream. The superblock lock that replaced lock_super()/unlock_super() was left uninitialized for the Seventh Edition UNIX filesystem in the following commit (3.7): c07cb01 sysv: drop lock/unlock super Signed-off-by: Lubomir Rintel <lkundrak@v3.sk> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-05  block: Fix bio_copy_data()  (Kent Overstreet)
commit 2f6cf0de0281d210061ce976f2d42d246adc75bb upstream. The memcpy() in bio_copy_data() was using the wrong offset vars, leading to data corruption in weird unusual setups. Signed-off-by: Kent Overstreet <kmo@daterainc.com> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
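An illustration of the bug class with a generic bvec-to-bvec copy (not the block layer's exact code): each side of the memcpy() must use its own page offset.

    #include <linux/bio.h>
    #include <linux/highmem.h>
    #include <linux/string.h>

    static void copy_one_bvec(struct bio_vec *dst_bv, struct bio_vec *src_bv,
                              unsigned int bytes)
    {
            void *dst_p = kmap_atomic(dst_bv->bv_page);
            void *src_p = kmap_atomic(src_bv->bv_page);

            /* mixing up dst/src bv_offset here silently corrupts data */
            memcpy(dst_p + dst_bv->bv_offset,
                   src_p + src_bv->bv_offset,
                   bytes);

            kunmap_atomic(src_p);
            kunmap_atomic(dst_p);
    }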
2013-10-01  Linux 3.10.14 [tag: v3.10.14]  (Greg Kroah-Hartman)
2013-10-01  netfilter: ipset: Fix serious failure in CIDR tracking  (Oliver Smith)
commit 2cf55125c64d64cc106e204d53b107094762dfdf upstream. This fixes a serious bug affecting all hash types with a net element - specifically, if a CIDR value is deleted such that none of the same size exist any more, all larger (less-specific) values will then fail to match. Adding back any prefix with a CIDR equal to or more specific than the one deleted will fix it. Steps to reproduce:
  ipset -N test hash:net
  ipset -A test 1.1.0.0/16
  ipset -A test 2.2.2.0/24
  ipset -T test 1.1.1.1      #1.1.1.1 IS in set
  ipset -D test 2.2.2.0/24
  ipset -T test 1.1.1.1      #1.1.1.1 IS NOT in set
This is due to the fact that the nets counter was unconditionally decremented prior to the iteration that shifts up the entries. Now, we first check if there is a proceeding entry and if not, decrement it and return. Otherwise, we proceed to iterate and then zero the last element, which, in most cases, will already be zero. Signed-off-by: Oliver Smith <oliver@8.c.9.b.0.7.4.0.1.0.0.2.ip6.arpa> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-01  rpc: let xdr layer allocate gssproxy receive pages  (J. Bruce Fields)
commit d4a516560fc96a9d486a9939bcb567e3fdce8f49 upstream. In theory the linux cred in a gssproxy reply can include up to NGROUPS_MAX data, 256K of data. In the common case we expect it to be shorter. So do as the nfsv3 ACL code does and let the xdr code allocate the pages as they come in, instead of allocating a lot of pages that won't typically be used. Tested-by: Simo Sorce <simo@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-01  rpc: fix huge kmalloc's in gss-proxy  (J. Bruce Fields)
commit 9dfd87da1aeb0fd364167ad199f40fe96a6a87be upstream. The reply to a gssproxy can include up to NGROUPS_MAX gid's, which will take up more than a page. We therefore need to allocate an array of pages to hold the reply instead of trying to allocate a single huge buffer. Tested-by: Simo Sorce <simo@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-01  rpc: comment on linux_cred encoding, treat all as unsigned  (J. Bruce Fields)
commit 6a36978e6931e6601be586eb313375335f2cfaa3 upstream. The encoding of linux creds is a bit confusing. Also: I think in practice it doesn't really matter whether we treat any of these things as signed or unsigned, but unsigned seems more straightforward: uid_t/gid_t are unsigned and it simplifies the ngroups overflow check. Tested-by: Simo Sorce <simo@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-01  rpc: clean up decoding of gssproxy linux creds  (J. Bruce Fields)
commit 778e512bb1d3315c6b55832248cd30c566c081d7 upstream. We can use the normal coding infrastructure here. Two minor behavior changes: - we're assuming no wasted space at the end of the linux cred. That seems to match gss-proxy's behavior, and I can't see why it would need to do differently in the future. - NGROUPS_MAX check added: note groups_alloc doesn't do this, this is the caller's responsibility. Tested-by: Simo Sorce <simo@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-01  cfq: explicitly use 64bit divide operation for 64bit arguments  (Anatol Pomozov)
commit f3cff25f05f2ac29b2ee355e611b0657482f6f1d upstream. 'samples' is a 64-bit operand, but do_div()'s second parameter is 32-bit. do_div() silently truncates the high 32 bits and the calculated result is invalid. If the low 32 bits of 'samples' are zero, do_div() produces a kernel crash. Signed-off-by: Anatol Pomozov <anatol.pomozov@gmail.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk> Cc: Jonghwan Choi <jhbird.choi@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
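A sketch of the distinction (the helper below is illustrative, not the cfq code):

    #include <linux/types.h>
    #include <linux/math64.h>

    static u64 average_sketch(u64 total, u64 samples)
    {
            if (!samples)
                    return 0;
            /*
             * do_div(total, samples) takes a 32-bit divisor, so it would use
             * only the low 32 bits of 'samples': wrong results, and a
             * divide-by-zero when those bits happen to be all zero.
             */
            return div64_u64(total, samples);
    }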
2013-10-01  bio-integrity: Fix use of bs->bio_integrity_pool after free  (Bjorn Helgaas)
commit adbe6991efd36104ac9eaf751993d35eaa7f493a upstream. This fixes a copy and paste error introduced by 9f060e2231 ("block: Convert integrity to bvec_alloc_bs()"). Found by Coverity (CID 1020654). Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Kent Overstreet <koverstreet@google.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Cc: Jonghwan Choi <jhbird.choi@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-01  perf tools: Handle JITed code in shared memory  (Andi Kleen)
commit 89365e6c9ad4c0e090e4c6a4b67a3ce319381d89 upstream. Need to check for /dev/zero. Most likely more strings are missing too. Signed-off-by: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1366848182-30449-1-git-send-email-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Vinson Lee <vlee@freedesktop.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-01  mm: fix aio performance regression for database caused by THP  (Khalid Aziz)
commit 7cb2ef56e6a8b7b368b2e883a0a47d02fed66911 upstream. I am working with a tool that simulates oracle database I/O workload. This tool (orion to be specific - <http://docs.oracle.com/cd/E11882_01/server.112/e16638/iodesign.htm#autoId24>) allocates hugetlbfs pages using shmget() with SHM_HUGETLB flag. It then does aio into these pages from flash disks using various common block sizes used by database. I am looking at performance with two of the most common block sizes - 1M and 64K. aio performance with these two block sizes plunged after Transparent HugePages was introduced in the kernel. Here are performance numbers:
              pre-THP      2.6.39       3.11-rc5
  1M read     8384 MB/s    5629 MB/s    6501 MB/s
  64K read    7867 MB/s    4576 MB/s    4251 MB/s
I have narrowed the performance impact down to the overheads introduced by THP in __get_page_tail() and put_compound_page() routines. perf top shows >40% of cycles being spent in these two routines. Every time direct I/O to hugetlbfs pages starts, kernel calls get_page() to grab a reference to the pages and calls put_page() when I/O completes to put the reference away. THP introduced significant amount of locking overhead to get_page() and put_page() when dealing with compound pages because hugepages can be split underneath get_page() and put_page(). It added this overhead irrespective of whether it is dealing with hugetlbfs pages or transparent hugepages. This resulted in 20%-45% drop in aio performance when using hugetlbfs pages. Since hugetlbfs pages can not be split, there is no reason to go through all the locking overhead for these pages from what I can see. I added code to __get_page_tail() and put_compound_page() to bypass all the locking code when working with hugetlbfs pages. This improved performance significantly. Performance numbers with this patch:
              pre-THP      3.11-rc5     3.11-rc5 + Patch
  1M read     8384 MB/s    6501 MB/s    8371 MB/s
  64K read    7867 MB/s    4251 MB/s    6510 MB/s
Performance with 64K read is still lower than what it was before THP, but still a 53% improvement. It does mean there is more work to be done but I will take a 53% improvement for now. Please take a look at the following patch and let me know if it looks reasonable. [akpm@linux-foundation.org: tweak comments] Signed-off-by: Khalid Aziz <khalid.aziz@oracle.com> Cc: Pravin B Shelar <pshelar@nicira.com> Cc: Christoph Lameter <cl@linux.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Rik van Riel <riel@redhat.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
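A very loose sketch of the shape of the fast path (take_tail_reference() is hypothetical; the real tail-page refcounting in __get_page_tail() and put_compound_page() is considerably more involved):

    #include <linux/mm.h>
    #include <linux/hugetlb.h>

    static void take_tail_reference(struct page *page);  /* hypothetical plain refcount bump */

    static bool get_tail_fast_path(struct page *page)
    {
            struct page *head = compound_head(page);

            /* hugetlbfs pages can never be split, so skip the compound_lock() dance */
            if (PageHuge(head)) {
                    take_tail_reference(page);
                    return true;
            }
            return false;   /* THP: fall back to the locked slow path */
    }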
2013-10-01  audit: fix endless wait in audit_log_start()  (Konstantin Khlebnikov)
commit 8ac1c8d5deba65513b6a82c35e89e73996c8e0d6 upstream. After commit 829199197a43 ("kernel/audit.c: avoid negative sleep durations") audit emitters will block forever if the userspace daemon cannot handle the backlog. After the timeout the waiting loop turns into a busy loop and runs until the daemon dies or returns back to work. This is a minimal patch for that bug. Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org> Cc: Luiz Capitulino <lcapitulino@redhat.com> Cc: Richard Guy Briggs <rgb@redhat.com> Cc: Eric Paris <eparis@redhat.com> Cc: Chuck Anderson <chuck.anderson@oracle.com> Cc: Dan Duval <dan.duval@oracle.com> Cc: Dave Kleikamp <dave.kleikamp@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Jonghwan Choi <jhbird.choi@samsung.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-01  udf: Refuse RW mount of the filesystem instead of making it RO  (Jan Kara)
commit e729eac6f65e11c5f03b09adcc84bd5bcb230467 upstream. Refuse RW mount of a udf filesystem. So far we just silently changed it to a RO mount, but when the media is writeable, the block layer won't notice this change and thus will think the device is used RW and will block the eject button of the drive. That is unexpected by users because for non-writeable media the eject button works just fine. The userspace mount(8) command handles this just fine and retries mounting with MS_RDONLY set, so userspace shouldn't see any regression. Plus any tool mounting udf is likely confronted with the case of read-only media, where the block layer already refuses to mount the filesystem without MS_RDONLY set, so our behavior shouldn't be anything new for it. Reported-by: Hui Wang <hui.wang@canonical.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
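The behaviour change, sketched with a made-up media check (the real code decides this from the UDF revision and the block device's write capability):

    #include <linux/fs.h>
    #include <linux/errno.h>

    static bool udf_media_writable(struct super_block *sb);  /* hypothetical check */

    static int udf_check_rw_mount(struct super_block *sb)
    {
            /* previously: sb->s_flags |= MS_RDONLY;  (silent downgrade) */
            if (!(sb->s_flags & MS_RDONLY) && !udf_media_writable(sb))
                    return -EACCES;   /* refuse; mount(8) retries read-only itself */
            return 0;
    }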