linux - Linux kernel source tree

Age	Commit message (Collapse)	Author
2011-05-02	qlcnic: limit skb frags for non tso packet	Amit Kumar Salecha
	commit 91a403caf0f26c71ce4407fd235b2d6fb225fba9 upstream. Machines are getting deadlock in four node cluster environment. All nodes are accessing (find /gfs2 -depth -print\|cpio -ocv > /dev/null) 200 GB storage on a GFS2 filesystem. This result in memory fragmentation and driver receives 18 frags for 1448 byte packets. For non tso packet, fw drops the tx request, if it has >14 frags. Fixing it by pulling extra frags. Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-05-02	p54: Initialize extra_len in p54_tx_80211	Jason Conti
	commit a6756da9eace8b4af73e9dea43f1fc2889224c94 upstream. This patch fixes a very serious off-by-one bug in the driver, which could leave the device in an unresponsive state. The problem was that the extra_len variable [used to reserve extra scratch buffer space for the firmware] was left uninitialized. Because p54_assign_address later needs the value to reserve additional space, the resulting frame could be to big for the small device's memory window and everything would immediately come to a grinding halt. Reference: https://bugs.launchpad.net/bugs/722185 Acked-by: Christian Lamparter <chunkeey@googlemail.com> Signed-off-by: Jason Conti <jason.conti@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-05-02	block, blk-sysfs: Fix an err return path in blk_register_queue()	Liu Yuan
	commit ed5302d3c25006a9edc7a7fbea97a30483f89ef7 upstream. We do not call blk_trace_remove_sysfs() in err return path if kobject_add() fails. This path fixes it. Signed-off-by: Liu Yuan <tailai.ly@taobao.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-05-02	ath: add missing regdomain pair 0x5c mapping	Christian Lamparter
	commit bd39a274fb7b43374c797bafdb7f506598f36f77 upstream. Joe Culler reported a problem with his AR9170 device: > ath: EEPROM regdomain: 0x5c > ath: EEPROM indicates we should expect a direct regpair map > ath: invalid regulatory domain/country code 0x5c > ath: Invalid EEPROM contents It turned out that the regdomain 'APL7_FCCA' was not mapped yet. According to Luis R. Rodriguez [Atheros' engineer] APL7 maps to FCC_CTL and FCCA maps to FCC_CTL as well, so the attached patch should be correct. Reported-by: Joe Culler <joe.culler@gmail.com> Acked-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: Christian Lamparter <chunkeey@googlemail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-05-02	netxen: limit skb frags for non tso packet	amit salecha
	commit c968bdf6912cad6d0fc63d7037cc1c870604a808 upstream. Machines are getting deadlock in four node cluster environment. All nodes are accessing (find /gfs2 -depth -print\|cpio -ocv > /dev/null) 200 GB storage on a GFS2 filesystem. This result in memory fragmentation and driver receives 18 frags for 1448 byte packets. For non tso packet, fw drops the tx request, if it has >14 frags. Fixing it by pulling extra frags. Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-05-02	ath9k_hw: fix stopping rx DMA during resets	Felix Fietkau
	commit 5882da02e9d9089b7e8c739f3e774aaeeff8b7ba upstream. During PHY errors, the MAC can sometimes fail to enter an idle state on older hardware (before AR9380) after an rx stop has been requested. This typically shows up in the kernel log with messages like these: ath: Could not stop RX, we could be confusing the DMA engine when we start RX up ------------[ cut here ]------------ WARNING: at drivers/net/wireless/ath/ath9k/recv.c:504 ath_stoprecv+0xcc/0xf0 [ath9k]() Call Trace: [<8023f0e8>] dump_stack+0x8/0x34 [<80075050>] warn_slowpath_common+0x78/0xa4 [<80075094>] warn_slowpath_null+0x18/0x24 [<80d66d60>] ath_stoprecv+0xcc/0xf0 [ath9k] [<80d642cc>] ath_set_channel+0xbc/0x270 [ath9k] [<80d65254>] ath_radio_disable+0x4a4/0x7fc [ath9k] When this happens, the state that the MAC enters is easy to identify and does not result in bogus DMA traffic, however to ensure a working state after a channel change, the hardware should still be reset. This patch adds detection for this specific MAC state, after which the above warnings completely disappear in my tests. Signed-off-by: Felix Fietkau <nbd@openwrt.org> Cc: Kyungwan Nam <Kyungwan.Nam@Atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-04-21	Linux 2.6.38.4v2.6.38.4	Greg Kroah-Hartman

2011-04-21	ip: ip_options_compile() resilient to NULL skb route	Eric Dumazet
	commit c65353daf137dd41f3ede3baf62d561fca076228 upstream. Scot Doyle demonstrated ip_options_compile() could be called with an skb without an attached route, using a setup involving a bridge, netfilter, and forged IP packets. Let's make ip_options_compile() and ip_options_rcv_srr() a bit more robust, instead of changing bridge/netfilter code. With help from Hiroaki SHIMODA. Reported-by: Scot Doyle <lkml@scotdoyle.com> Tested-by: Scot Doyle <lkml@scotdoyle.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Stephen Hemminger <shemminger@vyatta.com> Acked-by: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-04-21	bridge: reset IPCB in br_parse_ip_options	Eric Dumazet
	commit f8e9881c2aef1e982e5abc25c046820cd0b7cf64 upstream. Commit 462fb2af9788a82 (bridge : Sanitize skb before it enters the IP stack), missed one IPCB init before calling ip_options_compile() Thanks to Scot Doyle for his tests and bug reports. Reported-by: Scot Doyle <lkml@scotdoyle.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com> Acked-by: Bandan Das <bandan.das@stratus.com> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Cc: Jan Lübbe <jluebbe@debian.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-04-21	perf tool: Fix gcc 4.6.0 issues	Kyle McMartin
	commit fb7d0b3cefb80a105f7fd26bbc62e0cbf9192822 upstream. GCC 4.6.0 in Fedora rawhide turned up some compile errors in tools/perf due to the -Werror=unused-but-set-variable flag. I've gone through and annotated some of the assignments that had side effects (ie: return value from a function) with the __used annotation, and in some cases, just removed unused code. In a few cases, we were assigning something useful, but not using it in later parts of the function. kyle@dreadnought:~/src% gcc --version gcc (GCC) 4.6.0 20110122 (Red Hat 4.6.0-0.3) Cc: Ingo Molnar <mingo@redhat.com> LKML-Reference: <20110124161304.GK27353@bombadil.infradead.org> Signed-off-by: Kyle McMartin <kyle@redhat.com> [ committer note: Fixed up the annotation fixes, as that code moved recently ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> [Backported to 2.6.38.2 by deleting unused but set variables] Signed-off-by: Thomas Meyer <thomas@m3y3r.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-04-21	Bluetooth: Fix HCI_RESET command synchronization	Gustavo F. Padovan
	commit f630cf0d5434e3923e1b8226ffa2753ead6b0ce5 upstream. We can't send new commands before a cmd_complete for the HCI_RESET command shows up. Reported-by: Mikko Vinni <mmvinni@yahoo.com> Reported-by: Justin P. Mattock <justinmattock@gmail.com> Reported-by: Ed Tomlinson <edt@aei.ca> Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi> Tested-by: Justin P. Mattock <justinmattock@gmail.com> Tested-by: Mikko Vinni <mmvinni@yahoo.com> Tested-by: Ed Tomlinson <edt@aei.ca>
2011-04-21	radeon: Fix KMS CP writeback on big endian machines.	Michel Dänzer
	commit dc66b325f161bb651493c7d96ad44876b629cf6a upstream. This is necessary even with PCI(e) GART, and it makes writeback work even with AGP on my PowerBook. Might still be unreliable with older revisions of UniNorth and other AGP bridges though. Signed-off-by: Michel Dänzer <daenzer@vmware.com> Reviewed-by: Alex Deucher <alex.deucher@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-04-21	USB: Fix unplug of device with active streams	Matthew Wilcox
	commit b214f191d95ba4b5a35aebd69cd129cf7e3b1884 upstream. If I unplug a device while the UAS driver is loaded, I get an oops in usb_free_streams(). This is because usb_unbind_interface() calls usb_disable_interface() which calls usb_disable_endpoint() which sets ep_out and ep_in to NULL. Then the UAS driver calls usb_pipe_endpoint() which returns a NULL pointer and passes an array of NULL pointers to usb_free_streams(). I think the correct fix for this is to check for the NULL pointer in usb_free_streams() rather than making the driver check for this situation. My original patch for this checked for dev->state == USB_STATE_NOTATTACHED, but the call to usb_disable_interface() is conditional, so not all drivers would want this check. Note from Sarah Sharp: This patch does avoid a potential dereference, but the real fix (which will be implemented later) is to set the .soft_unbind flag in the usb_driver structure for the UAS driver, and all drivers that allocate streams. The driver should free any streams when it is unbound from the interface. This avoids leaking stream rings in the xHCI driver when usb_disable_interface() is called. This should be queued for stable trees back to 2.6.35. Signed-off-by: Matthew Wilcox <willy@linux.intel.com> Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-04-21	USB: xhci - also free streams when resetting devices	Dmitry Torokhov
	commit 2dea75d96ade3c7cd2bfe73f99c7b3291dc3d03a upstream. Currently, when resetting a device, xHCI driver disables all but one endpoints and frees their rings, but leaves alone any streams that might have been allocated. Later, when users try to free allocated streams, we oops in xhci_setup_no_streams_ep_input_ctx() because ep->ring is NULL. Let's free not only rings but also stream data as well, so that calling free_streams() on a device that was reset will be safe. This should be queued for stable trees back to 2.6.35. Reviewed-by: Micah Elizabeth Scott <micah@vmware.com> Signed-off-by: Dmitry Torokhov <dtor@vmware.com> Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-04-21	USB: xhci - fix math in xhci_get_endpoint_interval()	Dmitry Torokhov
	commit dfa49c4ad120a784ef1ff0717168aa79f55a483a upstream. When parsing exponent-expressed intervals we subtract 1 from the value and then expect it to match with original + 1, which is highly unlikely, and we end with frequent spew: usb 3-4: ep 0x83 - rounding interval to 512 microframes Also, parsing interval for fullspeed isochronous endpoints was incorrect - according to USB spec they use exponent-based intervals (but xHCI spec claims frame-based intervals). I trust USB spec more, especially since USB core agrees with it. This should be queued for stable kernels back to 2.6.31. Reviewed-by: Micah Elizabeth Scott <micah@vmware.com> Signed-off-by: Dmitry Torokhov <dtor@vmware.com> Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-04-21	USB: xhci - fix unsafe macro definitions	Dmitry Torokhov
	commit 5a6c2f3ff039154872ce597952f8b8900ea0d732 upstream. Macro arguments used in expressions need to be enclosed in parenthesis to avoid unpleasant surprises. This should be queued for kernels back to 2.6.31 Signed-off-by: Dmitry Torokhov <dtor@vmware.com> Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-04-21	USB: fix formatting of SuperSpeed endpoints in /proc/bus/usb/devices	Dmitry Torokhov
	commit 2868a2b1ba8f9c7f6c4170519ebb6c62934df70e upstream. Isochronous and interrupt SuperSpeed endpoints use the same mechanisms for decoding bInterval values as HighSpeed ones so adjust the code accordingly. Also bandwidth reservation for SuperSpeed matches highspeed, not low/full speed. Signed-off-by: Dmitry Torokhov <dtor@vmware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-04-21	USB: EHCI: unlink unused QHs when the controller is stopped	Alan Stern
	commit 94ae4976e253757e9b03a44d27d41b20f1829d80 upstream. This patch (as1458) fixes a problem affecting ultra-reliable systems: When hardware failover of an EHCI controller occurs, the data structures do not get released correctly. This is because the routine responsible for removing unused QHs from the async schedule assumes the controller is running properly (the frame counter is used in determining how long the QH has been idle) -- but when a failover causes the controller to be electronically disconnected from the PCI bus, obviously it stops running. The solution is simple: Allow scan_async() to remove a QH from the async schedule if it has been idle for long enough _or_ if the controller is stopped. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Reported-and-Tested-by: Dan Duval <dan.duval@stratus.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-04-21	usb: qcserial add missing errorpath kfrees	Steven Hardy
	commit cb62d65f966146a39fdde548cb474dacf1d00fa5 upstream. There are two -ENODEV error paths in qcprobe where the allocated private data is not freed, this patch adds the two missing kfrees to avoid leaking memory on the error path Signed-off-by: Steven Hardy <shardy@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-04-21	usb: qcserial avoid pointing to freed memory	Steven Hardy
	commit 99ab3f9e4eaec35fd2d7159c31b71f17f7e613e3 upstream. Rework the qcprobe logic such that serial->private is not set when qcprobe exits with -ENODEV, otherwise serial->private will point to freed memory on -ENODEV Signed-off-by: Steven Hardy <shardy@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-04-21	usb: Fix qcserial memory leak on rmmod	Steven Hardy
	commit 10c9ab15d6aee153968d150c05b3ee3df89673de upstream. qcprobe function allocates serial->private but this is never freed, this patch adds a new function qc_release() which frees serial->private, after calling usb_wwan_release Signed-off-by: Steven Hardy <shardy@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-04-21	powerpc/perf_event: Skip updating kernel counters if register value shrinks	Eric B Munson
	commit 86c74ab317c1ef4d37325e0d7ca8a01a796b0bd7 upstream. Because of speculative event roll back, it is possible for some event coutners to decrease between reads on POWER7. This causes a problem with the way that counters are updated. Delta calues are calculated in a 64 bit value and the top 32 bits are masked. If the register value has decreased, this leaves us with a very large positive value added to the kernel counters. This patch protects against this by skipping the update if the delta would be negative. This can lead to a lack of precision in the coutner values, but from my testing the value is typcially fewer than 10 samples at a time. Signed-off-by: Eric B Munson <emunson@mgebm.net> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-04-21	powerpc: Fix oops if scan_dispatch_log is called too early	Anton Blanchard
	commit 84ffae55af79d7b8834fd0c08d0d1ebf2c77f91e upstream. We currently enable interrupts before the dispatch log for the boot cpu is setup. If a timer interrupt comes in early enough we oops in scan_dispatch_log: Unable to handle kernel paging request for data at address 0x00000010 ... .scan_dispatch_log+0xb0/0x170 .account_system_vtime+0xa0/0x220 .irq_enter+0x88/0xc0 .do_IRQ+0x48/0x230 The patch below adds a check to scan_dispatch_log to ensure the dispatch log has been allocated. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-04-21	proc: do proper range check on readdir offset	Linus Torvalds
	commit d8bdc59f215e62098bc5b4256fd9928bf27053a1 upstream. Rather than pass in some random truncated offset to the pid-related functions, check that the offset is in range up-front. This is just cleanup, the previous commit fixed the real problem. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-04-21	next_pidmap: fix overflow condition	Linus Torvalds
	commit c78193e9c7bcbf25b8237ad0dec82f805c4ea69b upstream. next_pidmap() just quietly accepted whatever 'last' pid that was passed in, which is not all that safe when one of the users is /proc. Admittedly the proc code should do some sanity checking on the range (and that will be the next commit), but that doesn't mean that the helper functions should just do that pidmap pointer arithmetic without checking the range of its arguments. So clamp 'last' to PID_MAX_LIMIT. The fact that we then do "last+1" doesn't really matter, the for-loop does check against the end of the pidmap array properly (it's only the actual pointer arithmetic overflow case we need to worry about, and going one bit beyond isn't going to overflow). [ Use PID_MAX_LIMIT rather than pid_max as per Eric Biederman ] Reported-by: Tavis Ormandy <taviso@cmpxchg8b.com> Analyzed-by: Robert Święcki <robert@swiecki.net> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-04-21	USB: option: Added support for Samsung GT-B3730/GT-B3710 LTE USB modem.	Marius B. Kotsbak
	commit 80f9df3e0093ad9f1eeefd2ff7fd27daaa518d25 upstream. Bind only modem AT command endpoint to option. Signed-off-by: Marius B. Kotsbak <marius@kotsbak.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-04-21	USB: ftdi_sio: add ids for Hameg HO720 and HO730	Paul Friedrich
	commit c53c2fab40cf16e13af66f40bfd27200cda98d2f upstream. usb serial: ftdi_sio: add two missing USB ID's for Hameg interfaces HO720 and HO730 Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-04-21	USB: ftdi_sio: add PID for OCT DK201 docking station	Johan Hovold
	commit 11a31d84129dc3133417d626643d714c9df5317e upstream. Add PID 0x0103 for serial port of the OCT DK201 docking station. Reported-by: Jan Hoogenraad <jan@hoogenraad.net> Signed-off-by: Johan Hovold <jhovold@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-04-21	USB: ftdi_sio: Added IDs for CTI USB Serial Devices	Christian Simon
	commit 5a9443f08c83c294c5c806a689c1184b27cb26b3 upstream. I added new ProdutIds for two devices from CTI GmbH Leipzig. Signed-off-by: Christian Simon <simon@swine.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-04-21	usb: musb: temporarily make it bool	Felipe Balbi
	commit 7a180e70cfc56e131bfe4796773df2acfc7d4180 upstream. Due to the recent changes to musb's glue layers, we can't compile musb-hdrc as a module - compilation will break due to undefined symbol musb_debug. In order to fix that, we need a big re-work of the debug support on the MUSB driver. Because that would mean a lot of new code coming into the -rc series, it's best to defer that to next merge window and for now just disable module support for MUSB. Once we get the refactor of the debugging support done, we can simply revert this patch and things will go back to normal again. Signed-off-by: Felipe Balbi <balbi@ti.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-04-21	brk: COMPAT_BRK: fix detection of randomized brk	Jiri Kosina
	commit 4471a675dfc7ca676c165079e91c712b09dc9ce4 upstream. 5520e89 ("brk: fix min_brk lower bound computation for COMPAT_BRK") tried to get the whole logic of brk randomization for legacy (libc5-based) applications finally right. It turns out that the way to detect whether brk has actually been randomized in the end or not introduced by that patch still doesn't work for those binaries, as reported by Geert: : /sbin/init from my old m68k ramdisk exists prematurely. : : Before the patch: : : \| brk(0x80005c8e) = 0x80006000 : : After the patch: : : \| brk(0x80005c8e) = 0x80005c8e : : Old libc5 considers brk() to have failed if the return value is not : identical to the requested value. I don't like it, but currently see no better option than a bit flag in task_struct to catch the CONFIG_COMPAT_BRK && randomize_va_space == 2 case. Signed-off-by: Jiri Kosina <jkosina@suse.cz> Tested-by: Geert Uytterhoeven <geert@linux-m68k.org> Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-04-21	vmscan: all_unreclaimable() use zone->all_unreclaimable as a name	KOSAKI Motohiro
	commit 929bea7c714220fc76ce3f75bef9056477c28e74 upstream. all_unreclaimable check in direct reclaim has been introduced at 2.6.19 by following commit. 2006 Sep 25; commit 408d8544; oom: use unreclaimable info And it went through strange history. firstly, following commit broke the logic unintentionally. 2008 Apr 29; commit a41f24ea; page allocator: smarter retry of costly-order allocations Two years later, I've found obvious meaningless code fragment and restored original intention by following commit. 2010 Jun 04; commit bb21c7ce; vmscan: fix do_try_to_free_pages() return value when priority==0 But, the logic didn't works when 32bit highmem system goes hibernation and Minchan slightly changed the algorithm and fixed it . 2010 Sep 22: commit d1908362: vmscan: check all_unreclaimable in direct reclaim path But, recently, Andrey Vagin found the new corner case. Look, struct zone { .. int all_unreclaimable; .. unsigned long pages_scanned; .. } zone->all_unreclaimable and zone->pages_scanned are neigher atomic variables nor protected by lock. Therefore zones can become a state of zone->page_scanned=0 and zone->all_unreclaimable=1. In this case, current all_unreclaimable() return false even though zone->all_unreclaimabe=1. This resulted in the kernel hanging up when executing a loop of the form 1. fork 2. mmap 3. touch memory 4. read memory 5. munmmap as described in http://www.gossamer-threads.com/lists/linux/kernel/1348725#1348725 Is this ignorable minor issue? No. Unfortunately, x86 has very small dma zone and it become zone->all_unreclamble=1 easily. and if it become all_unreclaimable=1, it never restore all_unreclaimable=0. Why? if all_unreclaimable=1, vmscan only try DEF_PRIORITY reclaim and a-few-lru-pages>>DEF_PRIORITY always makes 0. that mean no page scan at all! Eventually, oom-killer never works on such systems. That said, we can't use zone->pages_scanned for this purpose. This patch restore all_unreclaimable() use zone->all_unreclaimable as old. and in addition, to add oom_killer_disabled check to avoid reintroduce the issue of commit d1908362 ("vmscan: check all_unreclaimable in direct reclaim path"). Reported-by: Andrey Vagin <avagin@openvz.org> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Nick Piggin <npiggin@kernel.dk> Reviewed-by: Minchan Kim <minchan.kim@gmail.com> Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-04-21	sched: Fix erroneous all_pinned logic	Ken Chen
	commit b30aef17f71cf9e24b10c11cbb5e5f0ebe8a85ab upstream. The scheduler load balancer has specific code to deal with cases of unbalanced system due to lots of unmovable tasks (for example because of hard CPU affinity). In those situation, it excludes the busiest CPU that has pinned tasks for load balance consideration such that it can perform second 2nd load balance pass on the rest of the system. This all works as designed if there is only one cgroup in the system. However, when we have multiple cgroups, this logic has false positives and triggers multiple load balance passes despite there are actually no pinned tasks at all. The reason it has false positives is that the all pinned logic is deep in the lowest function of can_migrate_task() and is too low level: load_balance_fair() iterates each task group and calls balance_tasks() to migrate target load. Along the way, balance_tasks() will also set a all_pinned variable. Given that task-groups are iterated, this all_pinned variable is essentially the status of last group in the scanning process. Task group can have number of reasons that no load being migrated, none due to cpu affinity. However, this status bit is being propagated back up to the higher level load_balance(), which incorrectly think that no tasks were moved. It kick off the all pinned logic and start multiple passes attempt to move load onto puller CPU. To fix this, move the all_pinned aggregation up at the iterator level. This ensures that the status is aggregated over all task-groups, not just last one in the list. Signed-off-by: Ken Chen <kenchen@google.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/BANLkTi=ernzNawaR5tJZEsV_QVnfxqXmsQ@mail.gmail.com Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-04-21	RTC: add missing "return 0" in new alarm func for rtc-bfin.c	Mike Frysinger
	commit 8c122b96866580c99e44f3f07ac93a993d964ec3 upstream. The new bfin_rtc_alarm_irq_enable function forgot to add a "return 0" to the end leading to the build warning: drivers/rtc/rtc-bfin.c: In function 'bfin_rtc_alarm_irq_enable': drivers/rtc/rtc-bfin.c:253: warning: control reaches end of non-void function CC: Thomas Gleixner <tglx@linutronix.de> CC: Alessandro Zummo <a.zummo@towertech.it> Signed-off-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: John Stultz <john.stultz@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-04-21	i2c-algo-bit: Call pre/post_xfer for bit_test	Alex Deucher
	commit d3b3e15da14ded61c9654db05863b04a2435f4cc upstream. Apparently some distros set i2c-algo-bit.bit_test to 1 by default. In some cases this causes i2c_bit_add_bus to fail and prevents the i2c bus from being added. In the radeon case, we fail to add the ddc i2c buses which prevents the driver from being able to detect attached monitors. The i2c bus works fine even if bit_test fails. This is likely due to gpio switching that is required and handled in the pre/post_xfer hooks, so call the pre/post_xfer hooks in the bit test as well. Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=36221 Signed-off-by: Alex Deucher <alexdeucher@gmail.com> Signed-off-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-04-21	ARM: 6864/1: hw_breakpoint: clear DBGVCR out of reset	Will Deacon
	commit e89c0d7090c54d7b11b9b091e495a1ae345dd3ff upstream. The DBGVCR, used for configuring vector catch debug events, is UNKNOWN out of reset on ARMv7. When enabling monitor mode, this must be zeroed to avoid UNPREDICTABLE behaviour. This patch adds the zeroing code to the debug reset path. Reported-by: Stepan Moskovchenko <stepanm@codeaurora.org> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-04-21	vfs: Fix absolute RCU path walk failures due to uninitialized seq number	Tim Chen
	commit c1530019e311c91d14b24d8e74d233152d806e45 upstream. During RCU walk in path_lookupat and path_openat, the rcu lookup frequently failed if looking up an absolute path, because when root directory was looked up, seq number was not properly set in nameidata. We dropped out of RCU walk in nameidata_drop_rcu due to mismatch in directory entry's seq number. We reverted to slow path walk that need to take references. With the following patch, I saw a 50% increase in an exim mail server benchmark throughput on a 4-socket Nehalem-EX system. Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-04-21	x86, amd: Disable GartTlbWlkErr when BIOS forgets it	Joerg Roedel
	commit 5bbc097d890409d8eff4e3f1d26f11a9d6b7c07e upstream. This patch disables GartTlbWlk errors on AMD Fam10h CPUs if the BIOS forgets to do is (or is just too old). Letting these errors enabled can cause a sync-flood on the CPU causing a reboot. The AMD BKDG recommends disabling GART TLB Wlk Error completely. This patch is the fix for https://bugzilla.kernel.org/show_bug.cgi?id=33012 on my machine. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Link: http://lkml.kernel.org/r/20110415131152.GJ18463@8bytes.org Tested-by: Alexandre Demers <alexandre.f.demers@gmail.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-04-21	x86, AMD: Set ARAT feature on AMD processors	Boris Ostrovsky
	commit b87cf80af3ba4b4c008b4face3c68d604e1715c6 upstream. Support for Always Running APIC timer (ARAT) was introduced in commit db954b5898dd3ef3ef93f4144158ea8f97deb058. This feature allows us to avoid switching timers from LAPIC to something else (e.g. HPET) and go into timer broadcasts when entering deep C-states. AMD processors don't provide a CPUID bit for that feature but they also keep APIC timers running in deep C-states (except for cases when the processor is affected by erratum 400). Therefore we should set ARAT feature bit on AMD CPUs. Tested-by: Borislav Petkov <borislav.petkov@amd.com> Acked-by: Andreas Herrmann <andreas.herrmann3@amd.com> Acked-by: Mark Langsdorf <mark.langsdorf@amd.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@amd.com> LKML-Reference: <1300205624-4813-1-git-send-email-ostr@amd64.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-04-21	UBIFS: fix oops when R/O file-system is fsync'ed	Artem Bityutskiy
	commit 78530bf7f2559b317c04991b52217c1608d5a58d upstream. This patch fixes severe UBIFS bug: UBIFS oopses when we 'fsync()' an file on R/O-mounter file-system. We (the UBIFS authors) incorrectly thought that VFS would not propagate 'fsync()' down to the file-system if it is read-only, but this is not the case. It is easy to exploit this bug using the following simple perl script: use strict; use File::Sync qw(fsync sync); die "File path is not specified" if not defined $ARGV[0]; my $path = $ARGV[0]; open FILE, "<", "$path" or die "Cannot open $path: $!"; fsync(\*FILE) or die "cannot fsync $path: $!"; close FILE or die "Cannot close $path: $!"; Thanks to Reuben Dowle <Reuben.Dowle@navico.com> for reporting about this issue. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com> Reported-by: Reuben Dowle <Reuben.Dowle@navico.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-04-21	MAINTAINERS: update STABLE BRANCH info	Randy Dunlap
	commit d00ebeac5f24f290636f7a895dafc124b2930a08 upstream. Drop Chris Wright from STABLE maintainers. He hasn't done STABLE release work for quite some time. Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Acked-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-04-21	oom-kill: remove boost_dying_task_prio()	KOSAKI Motohiro
	commit 341aea2bc48bf652777fb015cc2b3dfa9a451817 upstream. This is an almost-revert of commit 93b43fa ("oom: give the dying task a higher priority"). That commit dramatically improved oom killer logic when a fork-bomb occurs. But I've found that it has nasty corner case. Now cpu cgroup has strange default RT runtime. It's 0! That said, if a process under cpu cgroup promote RT scheduling class, the process never run at all. If an admin inserts a !RT process into a cpu cgroup by setting rtruntime=0, usually it runs perfectly because a !RT task isn't affected by the rtruntime knob. But if it promotes an RT task via an explicit setscheduler() syscall or an OOM, the task can't run at all. In short, the oom killer doesn't work at all if admins are using cpu cgroup and don't touch the rtruntime knob. Eventually, kernel may hang up when oom kill occur. I and the original author Luis agreed to disable this logic. Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Acked-by: Luis Claudio R. Goncalves <lclaudio@uudg.org> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Reviewed-by: Minchan Kim <minchan.kim@gmail.com> Acked-by: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-04-21	ramfs: fix memleak on no-mmu arch	Bob Liu
	commit b836aec53e2bce71de1d5415313380688c851477 upstream. On no-mmu arch, there is a memleak during shmem test. The cause of this memleak is ramfs_nommu_expand_for_mapping() added page refcount to 2 which makes iput() can't free that pages. The simple test file is like this: int main(void) { int i; key_t k = ftok("/etc", 42); for ( i=0; i<100; ++i) { int id = shmget(k, 10000, 0644\|IPC_CREAT); if (id == -1) { printf("shmget error\n"); } if(shmctl(id, IPC_RMID, NULL ) == -1) { printf("shm rm error\n"); return -1; } } printf("run ok...\n"); return 0; } And the result: root:/> free total used free shared buffers Mem: 60320 17912 42408 0 0 -/+ buffers: 17912 42408 root:/> shmem run ok... root:/> free total used free shared buffers Mem: 60320 19096 41224 0 0 -/+ buffers: 19096 41224 root:/> shmem run ok... root:/> free total used free shared buffers Mem: 60320 20296 40024 0 0 -/+ buffers: 20296 40024 ... After this patch the test result is:(no memleak anymore) root:/> free total used free shared buffers Mem: 60320 16668 43652 0 0 -/+ buffers: 16668 43652 root:/> shmem run ok... root:/> free total used free shared buffers Mem: 60320 16668 43652 0 0 -/+ buffers: 16668 43652 Signed-off-by: Bob Liu <lliubbo@gmail.com> Acked-by: Hugh Dickins <hughd@google.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-04-21	mm/thp: use conventional format for boolean attributes	Ben Hutchings
	commit e27e6151b154ff6e5e8162efa291bc60196d29ea upstream. The conventional format for boolean attributes in sysfs is numeric ("0" or "1" followed by new-line). Any boolean attribute can then be read and written using a generic function. Using the strings "yes [no]", "[yes] no" (read), "yes" and "no" (write) will frustrate this. [akpm@linux-foundation.org: use kstrtoul()] [akpm@linux-foundation.org: test_bit() doesn't return 1/0, per Neil] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Johannes Weiner <jweiner@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Hugh Dickins <hughd@google.com> Tested-by: David Rientjes <rientjes@google.com> Cc: NeilBrown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-04-21	kstrto*: converting strings to integers done (hopefully) right	Alexey Dobriyan
	commit 33ee3b2e2eb9b4b6c64dcf9ed66e2ac3124e748c upstream. 1. simple_strto() do not contain overflow checks and crufty, libc way to indicate failure. 2. strict_strto() also do not have overflow checks but the name and comments pretend they do. 3. Both families have only "long long" and "long" variants, but users want strtou8() 4. Both "simple" and "strict" prefixes are wrong: Simple doesn't exactly say what's so simple, strict should not exist because conversion should be strict by default. The solution is to use "k" prefix and add convertors for more types. Enter kstrtoull() kstrtoll() kstrtoul() kstrtol() kstrtouint() kstrtoint() kstrtou64() kstrtos64() kstrtou32() kstrtos32() kstrtou16() kstrtos16() kstrtou8() kstrtos8() Include runtime testsuite (somewhat incomplete) as well. strict_strto() become deprecated, stubbed to kstrto() and eventually will be removed altogether. Use kstrto*() in code today! Note: on some archs _kstrtoul() and _kstrtol() are left in tree, even if they'll be unused at runtime. This is temporarily solution, because I don't want to hardcode list of archs where these functions aren't needed. Current solution with sizeof() and __alignof__ at least always works. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-04-21	futex: Set FLAGS_HAS_TIMEOUT during futex_wait restart setup	Darren Hart
	commit 0cd9c6494ee5c19aef085152bc37f3a4e774a9e1 upstream. The FLAGS_HAS_TIMEOUT flag was not getting set, causing the restart_block to restart futex_wait() without a timeout after a signal. Commit b41277dc7a18ee332d in 2.6.38 introduced the regression by accidentally removing the the FLAGS_HAS_TIMEOUT assignment from futex_wait() during the setup of the restart block. Restore the originaly behavior. Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=32922 Reported-by: Tim Smith <tsmith201104@yahoo.com> Reported-by: Torsten Hilbrich <torsten.hilbrich@secunet.com> Signed-off-by: Darren Hart <dvhart@linux.intel.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: John Kacur <jkacur@redhat.com> Link: http://lkml.kernel.org/r/%3Cdaac0eb3af607f72b9a4d3126b2ba8fb5ed3b883.1302820917.git.dvhart%40linux.intel.com%3E Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-04-21	sparc64: Fix build errors with gcc-4.6.0	David S. Miller
	[ Upstream commit c6fee0810df4e0f4cf9c4834d2569ca01c02cffc ] Most of the warnings emitted (we fail arch/sparc file builds with -Werror) were legitimate but harmless, however one case (n2_pcr_write) was a genuine bug. Based almost entirely upon a patch by Sam Ravnborg. Reported-by: Dennis Gilmore <dennis@ausil.us> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-04-21	sparc32: Pass task_struct to schedule_tail() in ret_from_fork	Tkhai Kirill
	[ Upstream commit 47c7c97a93a5b8f719093dbf83555090b3b8228b ] We have to pass task_struct of previous process to function schedule_tail(). Currently in ret_from_fork previous thread_info is passed: switch_to: mov %g6, %g3 /* previous thread_info in g6 / ret_from_fork: call schedule_tail mov %g3, %o0 / previous thread_info is passed / void schedule_tail(struct task_struct prev); Signed-off-by: Tkhai Kirill <tkhai@yandex.ru> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-04-21	sparc32: Fix might-be-used-uninitialized warning in do_sparc_fault().	David S. Miller
	[ Upstream commit c816be7b5f24585baa9eba1f2413935f771d6ad6 ] When we try to handle vmalloc faults, we can take a code path which uses "code" before we actually set it. Amusingly gcc-3.3 notices this yet gcc-4.x does not. Reported-by: Bob Breuer <breuerr@mc.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-04-21	sparc: Fix .size directive for do_int_load	Ben Hutchings
	[ Upstream commit 35043c428f1fcb92feb5792f5878a8852ee00771 ] gas used to accept (and ignore?) .size directives which referred to undefined symbols, as this does. In binutils 2.21 these are treated as errors. Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>