linux - Linux kernel source tree

Age	Commit message (Collapse)	Author
2009-04-27	Linux 2.6.29.2v2.6.29.2	Chris Wright

2009-04-27	Bonding: fix zero address hole bug in arp_ip_target list	Brian Haley
	upstream commit: 5a31bec014449dc9ca994e4c1dbf2802b7ca458a Fix a zero address hole bug in the bonding arp_ip_target list that was causing the bond to ignore ARP replies (bugz 13006). Instead of just setting the array entry to zero, we now copy any additional entries down one slot, putting the zero entry at the end. With this change we can now have all the loops that walk the array stop when they hit a zero since there will be no addresses after it. Changes are based in part on code fragment provided in kernel: bugzilla 13006: http://bugzilla.kernel.org/show_bug.cgi?id=13006 by Steve Howard <steve@astutenetworks.com> Signed-off-by: Brian Haley <brian.haley@hp.com> Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2009-04-27	skge: fix occasional BUG during MTU change	Michal Schmidt
	upstream commit: d119b3927994e3d620d6adb0dd1ea6bf24427875 The BUG_ON(skge->tx_ring.to_use != skge->tx_ring.to_clean) in skge_up() was sometimes observed when setting MTU. skge_down() disables the TX queue, but then reenables it by mistake via skge_tx_clean(). Fix it by moving the waking of the queue from skge_tx_clean() to the other caller. And to make sure start_xmit is not in progress on another CPU, skge_down() should call netif_tx_disable(). The bug was reported to me by Jiri Jilek whose Debian system sometimes failed to boot. He tested the patch and the bug did not happen anymore. Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2009-04-27	scsi: mpt: suppress debugobjects warning	Eric Paris
	upstream commit: b298cecb3deddf76d60022473a57f1cb776cbdcd Addresses http://bugzilla.kernel.org/show_bug.cgi?id=13133 ODEBUG: object is on stack, but not annotated ------------[ cut here ]------------ WARNING: at lib/debugobjects.c:253 __debug_object_init+0x1f3/0x276() Hardware name: VMware Virtual Platform Modules linked in: mptspi(+) mptscsih mptbase scsi_transport_spi ext3 jbd mbcache Pid: 540, comm: insmod Not tainted 2.6.28-mm1 #2 Call Trace: [<c042c51c>] warn_slowpath+0x74/0x8a [<c0469600>] ? start_critical_timing+0x96/0xb7 [<c060c8ea>] ? _spin_unlock_irqrestore+0x2f/0x3c [<c0446fad>] ? trace_hardirqs_off_caller+0x18/0xaf [<c044704f>] ? trace_hardirqs_off+0xb/0xd [<c060c8ea>] ? _spin_unlock_irqrestore+0x2f/0x3c [<c042cb84>] ? release_console_sem+0x1a5/0x1ad [<c05013e6>] __debug_object_init+0x1f3/0x276 [<c0501494>] debug_object_init+0x13/0x17 [<c0433c56>] init_timer+0x10/0x1a [<e08e5b54>] mpt_config+0x1c1/0x2b7 [mptbase] [<e08e3b82>] ? kmalloc+0x8/0xa [mptbase] [<e08e3b82>] ? kmalloc+0x8/0xa [mptbase] [<e08e6fa2>] mpt_do_ioc_recovery+0x950/0x1212 [mptbase] [<c04496c2>] ? __lock_acquire+0xa69/0xacc [<c060c8f1>] ? _spin_unlock_irqrestore+0x36/0x3c [<c060c3af>] ? _spin_unlock_irq+0x22/0x26 [<c04f2d8b>] ? string+0x2b/0x76 [<c04f310e>] ? vsnprintf+0x338/0x7b3 [<c04496c2>] ? __lock_acquire+0xa69/0xacc [<c060c8ea>] ? _spin_unlock_irqrestore+0x2f/0x3c [<c04496c2>] ? __lock_acquire+0xa69/0xacc [<c044897d>] ? debug_check_no_locks_freed+0xeb/0x105 [<c060c8f1>] ? _spin_unlock_irqrestore+0x36/0x3c [<c04488bc>] ? debug_check_no_locks_freed+0x2a/0x105 [<c0446b8c>] ? lock_release_holdtime+0x43/0x48 [<c043f742>] ? up_read+0x16/0x29 [<c05076f8>] ? pci_get_slot+0x66/0x72 [<e08e89ca>] mpt_attach+0x881/0x9b1 [mptbase] [<e091c8e5>] mptspi_probe+0x11/0x354 [mptspi] Noticing that every caller of mpt_config has its CONFIGPARMS struct declared on the stack and thus the &pCfg->timer is always on the stack I changed init_timer() to init_timer_on_stack() and it seems to have shut up..... Cc: "Moore, Eric Dean" <Eric.Moore@lsil.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Thomas Gleixner <tglx@linutronix.de> Acked-by: "Desai, Kashyap" <Kashyap.Desai@lsi.com> Cc: <stable@kernel.org> [2.6.29.x] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2009-04-27	hugetlbfs: return negative error code for bad mount option	Akinobu Mita
	upstream commit: c12ddba09394c60e1120e6997794fa6ed52da884 This fixes the following BUG: # mount -o size=MM -t hugetlbfs none /huge hugetlbfs: Bad value 'MM' for mount option 'size=MM' ------------[ cut here ]------------ kernel BUG at fs/super.c:996! Due to BUG_ON(!mnt->mnt_sb); in vfs_kern_mount(). Also, remove unused #include <linux/quotaops.h> Cc: William Irwin <wli@holomorphy.com> Cc: <stable@kernel.org> Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2009-04-27	NFS: Fix the XDR iovec calculation in nfs3_xdr_setaclargs	Trond Myklebust
	upstream commit: 8340437210390676f687633a80e3748c40885dc8 Commit ae46141ff08f1965b17c531b571953c39ce8b9e2 (NFSv3: Fix posix ACL code) introduces a bug in the calculation of the XDR header iovec. In the case where we are inlining the acls, we need to adjust the length of the iovec req->rq_svec, in addition to adjusting the total buffer length. Tested-by: Leonardo Chiquitto <leonardo.lists@gmail.com> Tested-by: Suresh Jayaraman <sjayaraman@suse.de> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2009-04-27	gso: Fix support for linear packets	Herbert Xu
	upstream commit: 2f181855a0b3c2b39314944add7b41c15647cf86 When GRO/frag_list support was added to GSO, I made an error which broke the support for segmenting linear GSO packets (GSO packets are normally non-linear in the payload). These days most of these packets are constructed by the tun driver, which prefers to allocate linear memory if possible. This is fixed in the latest kernel, but for 2.6.29 and earlier it is still the norm. Therefore this bug causes failures with GSO when used with tun in 2.6.29. Reported-by: James Huang <jamesclhuang@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2009-04-27	agp: zero pages before sending to userspace	Shaohua Li
	upstream commit: 59de2bebabc5027f93df999d59cc65df591c3e6e CVE-2009-1192 AGP pages might be mapped into userspace finally, so the pages should be set to zero before userspace can use it. Otherwise there is potential information leakage. Signed-off-by: Shaohua Li <shaohua.li@intel.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2009-04-27	virtio: fix suspend when using virtio_balloon	Marcelo Tosatti
	upstream commit: 84a139a985300901dfad99bd93c7345d180af860 Break out of wait_event_interruptible() if freezing has been requested, in the vballoon thread. Without this change vballoon refuses to stop and the system can't suspend. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: stable@kernel.org Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2009-04-27	Revert "console ASCII glyph 1:1 mapping"	Samuel Thibault
	upstream commit: c0b7988200a82290287c6f4cd49585007f73175a This reverts commit 1c55f18717304100a5f624c923f7cb6511b4116d. Ingo Brueckl was assuming that reverting to 1:1 mapping for chars >= 128 was not useful, but it happens to be: due to the limitations of the Linux console, when a blind user wants to read BIG5 on it, he has no other way than loading a font without SFM and let the 1:1 mapping permit the screen reader to get the BIG5 encoding. Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2009-04-27	Input: gameport - fix attach driver code	Dmitry Torokhov
	upstream commit: 4ced8e7cb990a2c3bbf0ac7f27b35c890e7ce895 The commit 6902c0bead4ce266226fc0c5b3828b850bdc884a that moved driver registration out of kgameportd thread was incomplete and did not add the code necessary to actually attach driver to already registered devices, rectify that. Signed-off-by: Dmitry Torokhov <dtor@mail.ru> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2009-04-27	x86, PAT: Remove page granularity tracking for vm_insert_pfn maps	Pallipadi, Venkatesh
	upstream commit: 4b065046273afa01ec8e3de7da407e8d3599251d This change resolves the problem of too many single page entries in pat_memtype_list and "freeing invalid memtype" errors with i915, reported here: http://marc.info/?l=linux-kernel&m=123845244713183&w=2 Remove page level granularity track and untrack of vm_insert_pfn. memtype tracking at page granularity does not scale and cleaner approach would be for the driver to request a type for a bigger IO address range or PCI io memory range for that device, either at mmap time or driver init time and just use that type during vm_insert_pfn. This patch just removes the track/untrack of vm_insert_pfn. That means we will be in same state as 2.6.28, with respect to these APIs. Newer APIs for the drivers to request a memtype for a bigger region is coming soon. [ Impact: fix Xorg startup warnings and hangs ] Reported-by: Arkadiusz Miskiewicz <a.miskiewicz@gmail.com> Tested-by: Arkadiusz Miskiewicz <a.miskiewicz@gmail.com> Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> LKML-Reference: <20090408223716.GC3493@linux-os.sc.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2009-04-27	KVM: is_long_mode() should check for EFER.LMA	Amit Shah
	upstream commit: 41d6af119206e98764b4ae6d264d63acefcf851e is_long_mode currently checks the LongModeEnable bit in EFER instead of the LongModeActive bit. This is wrong, but we survived this till now since it wasn't triggered. This breaks guests that go from long mode to compatibility mode. This is noticed on a solaris guest and fixes bug #1842160 Signed-off-by: Amit Shah <amit.shah@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2009-04-27	KVM: VMX: Update necessary state when guest enters long mode	Amit Shah
	upstream commit: 401d10dee083bda281f2fdcdf654080313ba30ec setup_msrs() should be called when entering long mode to save the shadow state for the 64-bit guest state. Using vmx_set_efer() in enter_lmode() removes some duplicated code and also ensures we call setup_msrs(). We can safely pass the value of shadow_efer to vmx_set_efer() as no other bits in the efer change while enabling long mode (guest first sets EFER.LME, then sets CR0.PG which causes a vmexit where we activate long mode). With this fix, is_long_mode() can check for EFER.LMA set instead of EFER.LME and 5e23049e86dd298b72e206b420513dbc3a240cd9 can be reverted. Signed-off-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2009-04-27	KVM: fix kvm_vm_ioctl_deassign_device	Weidong Han
	upstream commit: 4a906e49f103c2e544148a209ba1db316510799f only need to set assigned_dev_id for deassignment, use match->flags to judge and deassign it. Acked-by: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Weidong Han <weidong.han@intel.com> Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2009-04-27	KVM: MMU: handle compound pages in kvm_is_mmio_pfn	Joerg Roedel
	upstream commit: fc5659c8c6b6c4e02ac354b369017c1bf231f347 The function kvm_is_mmio_pfn is called before put_page is called on a page by KVM. This is a problem when when this function is called on some struct page which is part of a compund page. It does not test the reserved flag of the compound page but of the struct page within the compount page. This is a problem when KVM works with hugepages allocated at boot time. These pages have the reserved bit set in all tail pages. Only the flag in the compount head is cleared. KVM would not put such a page which results in a memory leak. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Acked-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2009-04-27	KVM: Reset PIT irq injection logic when the PIT IRQ is unmasked	Avi Kivity
	upstream commit: 4780c65904f0fc4e312ee2da9383eacbe04e61ea While the PIT is masked the guest cannot ack the irq, so the reinject logic will never allow the interrupt to be injected. Fix by resetting the reinjection counters on unmask. Unbreaks Xen. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2009-04-27	KVM: Interrupt mask notifiers for ioapic	Avi Kivity
	upstream commit: 75858a84a6207f5e60196f6bbd18fde4250e5759 Allow clients to request notifications when the guest masks or unmasks a particular irq line. This complements irq ack notifications, as the guest will not ack an irq line that is masked. Currently implemented for the ioapic only. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2009-04-27	KVM: Add CONFIG_HAVE_KVM_IRQCHIP	Avi Kivity
	upstream commit: 5d9b8e30f543a9f21a968a4cda71e8f6d1c66a61 Two KVM archs support irqchips and two don't. Add a Kconfig item to make selecting between the two models easier. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2009-04-27	KVM: Fix missing smp tlb flush in invlpg	Andrea Arcangeli
	upstream commit: 4539b35881ae9664b0e2953438dd83f5ee02c0b4 When kvm emulates an invlpg instruction, it can drop a shadow pte, but leaves the guest tlbs intact. This can cause memory corruption when swapping out. Without this the other cpu can still write to a freed host physical page. tlb smp flush must happen if rmap_remove is called always before mmu_lock is released because the VM will take the mmu_lock before it can finally add the page to the freelist after swapout. mmu notifier makes it safe to flush the tlb after freeing the page (otherwise it would never be safe) so we can do a single flush for multiple sptes invalidated. Cc: stable@kernel.org Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Acked-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com> [mtosatti: backport to 2.6.29] Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2009-04-27	USB: usb-storage: augment unusual_devs entry for Simple Tech/Datafab	Alan Stern
	upstream commit: e4813eec8d47c8299d968bd5349dc881fa481c26 This patch (as1227) adds the MAX_SECTORS_64 flag to the unusual_devs entry for the Simple Tech/Datafab controller. This fixes Bugzilla #12882. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Reported-and-tested-by: binbin <binbinsh@gmail.com> Cc: stable <stable@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2009-04-27	USB: fix oops in cdc-wdm in case of malformed descriptors	Oliver Neukum
	upstream commit: e13c594f3a1fc2c78e7a20d1a07974f71e4b448f cdc-wdm needs to ignore extremely malformed descriptors. Signed-off-by: Oliver Neukum <oliver@neukum.org> Cc: stable <stable@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2009-04-27	USB: ftdi_sio: add vendor/project id for JETI specbos 1201 spectrometer	Peter Korsgaard
	upstream commit: ae27d84351f1f3568118318a8c40ff3a154bd629 Signed-off-by: Peter Korsgaard <jacmet@sunsite.dk> Cc: stable <stable@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2009-04-27	usb gadget: fix ethernet link reports to ethtool	Jonathan McDowell
	upstream commit: 237e75bf1e558f7330f8deb167fa3116405bef2c The g_ether USB gadget driver currently decides whether or not there's a link to report back for eth_get_link based on if the USB link speed is set. The USB gadget speed is however often set even before the device is enumerated. It seems more sensible to only report a "link" if we're actually connected to a host that wants to talk to us. The patch below does this for me - tested with the PXA27x UDC driver. Signed-off-by: Jonathan McDowell <noodles@earth.li> Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Cc: stable <stable@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2009-04-27	x86: disable X86_PTRACE_BTS for now	Ingo Molnar
	upstream commit: d45b41ae8da0f54aec0eebcc6f893ba5f22a1e8e Oleg Nesterov found a couple of races in the ptrace-bts code and fixes are queued up for it but they did not get ready in time for the merge window. We'll merge them in v2.6.31 - until then mark the feature as CONFIG_BROKEN. There's no user-space yet making use of this so it's not a big issue. Cc: <stable@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> [chrisw: trivial 2.6.29 backport] Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2009-04-27	SCSI: sg: fix q->queue_lock on scsi_error_handler path	FUJITA Tomonori
	upstream commit: 015640edb1f346e0b2eda703587c4cd1c310ec1d sg_rq_end_io() is called via rq->end_io. In some rare cases, sg_rq_end_io calls blk_put_request/blk_rq_unmap_user (when a program issuing a command has gone before the command completion; e.g. by interrupting a program issuing a command before the command completes). We can't call blk_put_request/blk_rq_unmap_user in interrupt so the commit c96952ed7031e7c576ecf90cf95b8ec099d5295a uses execute_in_process_context(). The problem is that scsi_error_handler() calls rq->end_io too. We can't call blk_put_request/blk_rq_unmap_user too in this path (we hold q->queue_lock). To avoid the above problem, in these rare cases, this patch always uses schedule_work() instead of execute_in_process_context(). Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Acked-by: Douglas Gilbert <dgilbert@interlog.com> Cc: Stable Tree <stable@kernel.org> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2009-04-27	SCSI: sg: avoid blk_put_request/blk_rq_unmap_user in interrupt	FUJITA Tomonori
	upstream commit: c96952ed7031e7c576ecf90cf95b8ec099d5295a This fixes the following oops: http://marc.info/?l=linux-kernel&m=123316111415677&w=2 You can reproduce this bug by interrupting a program before a sg response completes. This leads to the special sg state (the orphan state), then sg calls blk_put_request in interrupt (rq->end_io). The above bug report shows the recursive lock problem because sg calls blk_put_request in interrupt. We could call __blk_put_request here instead however we also need to handle blk_rq_unmap_user here, which can't be called in interrupt too. In the orphan state, we don't need to care about the data transfer (the program revoked the command) so adding 'just free the resource' mode to blk_rq_unmap_user is a possible option. I prefer to avoid complicating the blk mapping API when possible. I change the orphan state to call sg_finish_rem_req via execute_in_process_context. We hold sg_fd->kref so sg_fd doesn't go away until keventd_wq finishes our work. copy_from_user/to_user fails so blk_rq_unmap_user just frees the resource without the data transfer. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2009-04-27	SCSI: sg: fix races with ioctl(SG_IO)	Tony Battersby
	upstream commit: a2dd3b4cea335713b58996bb07b3abcde1175f47 sg_io_owned needs to be set before the command is sent to the midlevel; otherwise, a quickly-completing command may cause a different CPU to see "srp->done == 1 && !srp->sg_io_owned", which would lead to incorrect behavior. Check srp->done and set srp->orphan while holding rq_list_lock to prevent races with sg_rq_end_io(). There is no need to check sfp->closed from read/write/ioctl/poll/etc. since the kernel guarantees that this won't happen. The usefulness of sg_srp_done() was questionable before; now it is definitely not needed. Signed-off-by: Tony Battersby <tonyb@cybernetics.com> Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2009-04-27	SCSI: sg: fix races during device removal	Tony Battersby
	upstream commit: c6517b7942fad663cc1cf3235cbe4207cf769332 sg has the following problems related to device removal: * opening a sg fd races with removing a device * closing a sg fd races with removing a device * /proc/scsi/sg/* access races with removing a device * command completion races with removing a device * command completion races with closing a sg fd * can rmmod sg with active commands These problems can cause kernel oopses, memory-use-after-free, or double-free errors. This patch fixes these problems by using krefs to manage the lifetime of sg_device and sg_fd. Each command submitted to the midlevel holds a reference to sg_fd until the completion callback. This ensures that sg_fd doesn't go away if the fd is closed with commands still outstanding. sg_fd gets the reference of sg_device (with scsi_device) and also makes sure that the sg module doesn't go away. /proc/scsi/sg/* functions don't play nicely with krefs because they give information about sg_fds which have been closed but not yet freed due to still having outstanding commands and sg_devices which have been removed but not yet freed due to still being referenced by one or more sg_fds. To deal with this safely without removing functionality, /proc functions now access sg_device and sg_fd while holding a lock instead of using kref_get()/kref_put(). Signed-off-by: Tony Battersby <tonyb@cybernetics.com> Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> [chrisw: big for -stable, helps fix real bug, and made it through rc2 upstream] Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2009-04-27	mm: pass correct mm when growing stack	Hugh Dickins
	upstream commit: 05fa199d45c54a9bda7aa3ae6537253d6f097aa9 Tetsuo Handa reports seeing the WARN_ON(current->mm == NULL) in security_vm_enough_memory(), when do_execve() is touching the target mm's stack, to set up its args and environment. Yes, a UMH_NO_WAIT or UMH_WAIT_PROC call_usermodehelper() spawns an mm-less kernel thread to do the exec. And in any case, that vm_enough_memory check when growing stack ought to be done on the target mm, not on the execer's mm (though apart from the warning, it only makes a slight tweak to OVERCOMMIT_NEVER behaviour). Reported-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> Signed-off-by: Hugh Dickins <hugh@veritas.com> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2009-04-27	pata_hpt37x: fix HPT370 DMA timeouts	Sergei Shtylyov
	upstream commit: 265b7215aed36941620b65ecfff516200fb190c1 The libata driver has copied the code from the IDE driver which caused a post 2.4.18 regression on many HPT370[A] chips -- DMA stopped to work completely, only causing timeouts. Now remove hpt370_bmdma_start() for good... Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2009-04-27	hpt366: fix HPT370 DMA timeouts	Sergei Shtylyov
	upstream commit: c018f1ee5cf81e58b93d9e93a2ee39cad13dc1ac The big driver change in 2.4.19-rc1 introduced a regression for many HPT370[A] chips -- DMA stopped to work completely, only causing endless timeouts... The culprit has been identified (at last!): it turned to be the code resetting the DMA state machine before each transfer. Stop doing it now as this counter- measure has clearly caused more harm than good. This should fix the kernel.org bug #7703. Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2009-04-27	powerpc: Fix data-corrupting bug in __futex_atomic_op	Paul Mackerras
	upstream commit: 306a82881b14d950d59e0b59a55093a07d82aa9a Richard Henderson pointed out that the powerpc __futex_atomic_op has a bug: it will write the wrong value if the stwcx. fails and it has to retry the lwarx/stwcx. loop, since 'oparg' will have been overwritten by the result from the first time around the loop. This happens because it uses the same register for 'oparg' (an input) as it uses for the result. This fixes it by using separate registers for 'oparg' and 'ret'. Cc: stable@kernel.org Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2009-04-27	ALSA: hda - Fix the cmd cache keys for amp verbs	Takashi Iwai
	upstream commit: fcad94a4c71c36a05f4d5c6dcb174534b4e0b136 Fix the key value generation for get/set amp verbs. The upper bits of the parameter have to be combined with the verb value to be unique for each direction/index of amp access. This fixes the resume problem on some hardwares like Macbook after the channel mode is changed. Tested-by: Johannes Berg <johannes@sipsolutions.net> Cc: <stable@kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2009-04-27	sfc: Match calls to netif_napi_add() and netif_napi_del()	Ben Hutchings
	upstream commit: 718cff1eec595ce6ab0635b8160a51ee37d9268d sfc could call netif_napi_add() multiple times for the same napi_struct, corrupting the list of napi_structs for the associated device and leading to a busy-loop on device removal. Move the call to netif_napi_add() and add a call to netif_napi_del() in the obvious places. [bhutchings: backport to 2.6.29] Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2009-04-27	tty: Fix leak in ti-usb	Alan Cox
	upstream commit: cf5450930db0ae308584e5361f3345e0ff73e643 If the ti-usb adapter returns an zero data length frame (which happens) then we leak a kref. Found by Christoph Mair <christoph.mair@gmail.com> who proposed a patch. The patch here is different as Christoph's patch didn't work for the case where tty = NULL and data arrived but Christoph did all the hard work chasing it down. Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2009-04-27	spi: spi_write_then_read() bugfixes	David Brownell
	upstream commit: bdff549ebeff92b1a6952e5501caf16a6f8898c8 The "simplify spi_write_then_read()" patch included two regressions from the 2.6.27 behaviors: - The data it wrote out during the (full duplex) read side of the transfer was not zeroed. - It fails completely on half duplex hardware, such as Microwire and most "3-wire" SPI variants. So, revert that patch. A revised version should be submitted at some point, which can get the speedup on standard hardware (full duplex) without breaking on less-capable half-duplex stuff. Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Cc: <stable@kernel.org> [2.6.28.x, 2.6.29.x] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2009-04-27	add some long-missing capabilities to fs_mask	Serge E. Hallyn
	upstream commit: 0ad30b8fd5fe798aae80df6344b415d8309342cc When POSIX capabilities were introduced during the 2.1 Linux cycle, the fs mask, which represents the capabilities which having fsuid==0 is supposed to grant, did not include CAP_MKNOD and CAP_LINUX_IMMUTABLE. However, before capabilities the privilege to call these did in fact depend upon fsuid==0. This patch introduces those capabilities into the fsmask, restoring the old behavior. See the thread starting at http://lkml.org/lkml/2009/3/11/157 for reference. Note that if this fix is deemed valid, then earlier kernel versions (2.4 and 2.2) ought to be fixed too. Changelog: [Mar 23] Actually delete old CAP_FS_SET definition... [Mar 20] Updated against J. Bruce Fields's patch Reported-by: Igor Zhbanov <izh1979@gmail.com> Signed-off-by: Serge E. Hallyn <serue@us.ibm.com> Cc: stable@kernel.org Cc: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2009-04-27	hrtimer: fix rq->lock inversion (again)	Peter Zijlstra
	upstream commit: 7f1e2ca9f04b02794597f60e7b1d43f0a1317939 It appears I inadvertly introduced rq->lock recursion to the hrtimer_start() path when I delegated running already expired timers to softirq context. This patch fixes it by introducing a __hrtimer_start_range_ns() method that will not use raise_softirq_irqoff() but __raise_softirq_irqoff() which avoids the wakeup. It then also changes schedule() to check for pending softirqs and do the wakeup then, I'm not quite sure I like this last bit, nor am I convinced its really needed. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: paulus@samba.org LKML-Reference: <20090313112301.096138802@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu> Tested-by: Mikael Pettersson <mikpe@it.uu.se> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2009-04-27	x86: fix broken irq migration logic while cleaning up multiple vectors	Suresh Siddha
	upstream commit: 68a8ca593fac82e336a792226272455901fa83df Impact: fix spurious IRQs During irq migration, we send a low priority interrupt to the previous irq destination. This happens in non interrupt-remapping case after interrupt starts arriving at new destination and in interrupt-remapping case after modifying and flushing the interrupt-remapping table entry caches. This low priority irq cleanup handler can cleanup multiple vectors, as multiple irq's can be migrated at almost the same time. While there will be multiple invocations of irq cleanup handler (one cleanup IPI for each irq migration), first invocation of the cleanup handler can potentially cleanup more than one vector (as the first invocation can see the requests for more than vector cleanup). When we cleanup multiple vectors during the first invocation of the smp_irq_move_cleanup_interrupt(), other vectors that are to be cleanedup can still be pending in the local cpu's IRR (as smp_irq_move_cleanup_interrupt() runs with interrupts disabled). When we are ready to unhook a vector corresponding to an irq, check if that vector is registered in the local cpu's IRR. If so skip that cleanup and do a self IPI with the cleanup vector, so that we give a chance to service the pending vector interrupt and then cleanup that vector allocation once we execute the lowest priority handler. This fixes spurious interrupts seen when migrating multiple vectors at the same time. [ This is apparently possible even on conventional xapic, although to the best of our knowledge it has never been seen. The stable maintainers may wish to consider this one for -stable. ] [suresh: backport to 2.6.29] Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: stable@kernel.org Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2009-04-27	sched: do not count frozen tasks toward load	Nathan Lynch
	upstream commit: e3c8ca8336707062f3f7cb1cd7e6b3c753baccdd Freezing tasks via the cgroup freezer causes the load average to climb because the freezer's current implementation puts frozen tasks in uninterruptible sleep (D state). Some applications which perform job-scheduling functions consult the load average when making decisions. If a cgroup is frozen, the load average does not provide a useful measure of the system's utilization to such applications. This is especially inconvenient if the job scheduler employs the cgroup freezer as a mechanism for preempting low priority jobs. Contrast this with using SIGSTOP for the same purpose: the stopped tasks do not count toward system load. Change task_contributes_to_load() to return false if the task is frozen. This results in /proc/loadavg behavior that better meets users' expectations. Signed-off-by: Nathan Lynch <ntl@pobox.com> Acked-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Nigel Cunningham <nigel@tuxonice.net> Tested-by: Nigel Cunningham <nigel@tuxonice.net> Cc: <stable@kernel.org> Cc: containers@lists.linux-foundation.org Cc: linux-pm@lists.linux-foundation.org Cc: Matt Helsley <matthltc@us.ibm.com> LKML-Reference: <20090408194512.47a99b95@manatee.lan> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2009-04-27	dm kcopyd: fix callback race	Mikulas Patocka
	upstream commit: 340cd44451fb0bfa542365e6b4b565bbd44836e2 If the thread calling dm_kcopyd_copy is delayed due to scheduling inside split_job/segment_complete and the subjobs complete before the loop in split_job completes, the kcopyd callback could be invoked from the thread that called dm_kcopyd_copy instead of the kcopyd workqueue. dm_kcopyd_copy -> split_job -> segment_complete -> job->fn() Snapshots depend on the fact that callbacks are called from the singlethreaded kcopyd workqueue and expect that there is no racing between individual callbacks. The racing between callbacks can lead to corruption of exception store and it can also mean that exception store callbacks are called twice for the same exception - a likely reason for crashes reported inside pending_complete() / remove_exception(). This patch fixes two problems: 1. job->fn being called from the thread that submitted the job (see above). - Fix: hand over the completion callback to the kcopyd thread. 2. job->fn(read_err, write_err, job->context); in segment_complete reports the error of the last subjob, not the union of all errors. - Fix: pass job->write_err to the callback to report all error bits (it is done already in run_complete_job) Cc: stable@kernel.org Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2009-04-27	dm kcopyd: prepare for callback race fix	Mikulas Patocka
	upstream commit: 73830857bca6f6c9dbd48e906daea50bea42d676 Use a variable in segment_complete() to point to the dm_kcopyd_client struct and only release job->pages in run_complete_job() if any are defined. These changes are needed by the next patch. Cc: stable@kernel.org Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2009-04-27	posix-timers: fix RLIMIT_CPU && setitimer(CPUCLOCK_PROF)	Oleg Nesterov
	upstream commit: 8f2e586567b1bad72dac7c3810fe9a2ef7117506 update_rlimit_cpu() tries to optimize out set_process_cpu_timer() in case when we already have CPUCLOCK_PROF timer which should expire first. But it uses cputime_lt() instead of cputime_gt(). Test case: int main(void) { struct itimerval it = { .it_value = { .tv_sec = 1000 }, }; assert(!setitimer(ITIMER_PROF, &it, NULL)); struct rlimit rl = { .rlim_cur = 1, .rlim_max = 1, }; assert(!setrlimit(RLIMIT_CPU, &rl)); for (;;) ; return 0; } Without this patch, the task is not killed as RLIMIT_CPU demands. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Peter Lojkin <ia6432@inbox.ru> Cc: Roland McGrath <roland@redhat.com> Cc: stable@kernel.org LKML-Reference: <20090327000610.GA10108@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2009-04-27	posix-timers: fix RLIMIT_CPU && fork()	Oleg Nesterov
	upstream commit: 6279a751fe096a21dc7704e918d570d3ff06e769 See http://bugzilla.kernel.org/show_bug.cgi?id=12911 copy_signal() copies signal->rlim, but RLIMIT_CPU is "lost". Because posix_cpu_timers_init_group() sets cputime_expires.prof_exp = 0 and thus fastpath_timer_check() returns false unless we have other expired cpu timers. Change copy_signal() to set cputime_expires.prof_exp if we have RLIMIT_CPU. Also, set cputimer.running = 1 in that case. This is not strictly necessary, but imho makes sense. Reported-by: Peter Lojkin <ia6432@inbox.ru> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Peter Lojkin <ia6432@inbox.ru> Cc: Roland McGrath <roland@redhat.com> Cc: stable@kernel.org LKML-Reference: <20090327000607.GA10104@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2009-04-27	posixtimers, sched: Fix posix clock monotonicity	Hidetoshi Seto
	upstream commit: c5f8d99585d7b5b7e857fabf8aefd0174903a98c Impact: Regression fix (against clock_gettime() backwarding bug) This patch re-introduces a couple of functions, task_sched_runtime and thread_group_sched_runtime, which was once removed at the time of 2.6.28-rc1. These functions protect the sampling of thread/process clock with rq lock. This rq lock is required not to update rq->clock during the sampling. i.e. The clock_gettime() may return ((accounted runtime before update) + (delta after update)) that is less than what it should be. v2 -> v3: - Rename static helper function __task_delta_exec() to do_task_delta_exec() since -tip tree already has a __task_delta_exec() of different version. v1 -> v2: - Revises comments of function and patch description. - Add note about accuracy of thread group's runtime. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: stable@kernel.org [2.6.28.x][2.6.29.x] LKML-Reference: <49D1CC93.4080401@jp.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2009-04-27	cap_prctl: don't set error to 0 at 'no_change'	Serge E. Hallyn
	upstream commit: 5bf37ec3e0f5eb79f23e024a7fbc8f3557c087f0 One-liner: capsh --print is broken without this patch. In certain cases, cap_prctl returns error > 0 for success. However, the 'no_change' label was always setting error to 0. As a result, for example, 'prctl(CAP_BSET_READ, N)' would always return 0. It should return 1 if a process has N in its bounding set (as by default it does). I'm keeping the no_change label even though it's now functionally the same as 'error'. Signed-off-by: Serge Hallyn <serue@us.ibm.com> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <jmorris@namei.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2009-04-27	SCSI: libiscsi: fix iscsi pool error path	Jean Delvare
	upstream commit: fd6e1c14b73dbab89cb76af895d5612e4a8b5522 Le lundi 30 mars 2009, Chris Wright a écrit : > q->queue could be ERR_PTR(-ENOMEM) which will break unwinding > on error. Make iscsi_pool_free more defensive. > Making the freeing of q->queue dependent on q->pool being set looks really weird (although it is correct at the moment. But this seems to be fixable in a much simpler way. With the benefit that only the error case is slowed down. In both cases we have a problem if q->queue contains an error value but it's not -ENOMEM. Apparently this can't happen today, but it doesn't feel right to assume this will always be true. Maybe it's the right time to fix this as well. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> [chrisw: this is a fixlet to f474a37b, also in -stable] Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2009-04-27	SCSI: libiscsi: fix iscsi pool error path	Jean Delvare
	upstream commit: f474a37bc48667595b5653a983b635c95ed82a3b Memory freeing in iscsi_pool_free() looks wrong to me. Either q->pool can be NULL and this should be tested before dereferencing it, or it can't be NULL and it shouldn't be tested at all. As far as I can see, the only case where q->pool is NULL is on early error in iscsi_pool_init(). One possible way to fix the bug is thus to not call iscsi_pool_free() in this case (nothing needs to be freed anyway) and then we can get rid of the q->pool check. Signed-off-by: Jean Delvare <jdelvare@suse.de> Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2009-04-27	sparc64: Fix bug in ("sparc64: Flush TLB before releasing pages.")	David Miller
	[ No upstream commit, this regression was added only to 2.6.29.1 ] Unfortunately I merged an earlier version of commit b6816b706138c3870f03115071872cad824f90b4 ("sparc64: Flush TLB before releasing pages.") than what I actually tested and merged upstream. Simply diffing asm/tlb_64.h in Linus's tree vs. what ended up in 2.6.29.1 confirms this. Sync things up to fix BUG() triggers some users are seeing. Reported-by: Dennis Gilmore <dennis@ausil.us> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>