aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2008-06-16m68k: Add ext2_find_{first,next}_bit() for ext4Aneesh Kumar K.V
commit 69c5ddf58a03da3686691ad2f293bc79fd977c10 upstream Add ext2_find_{first,next}_bit(), which are needed for ext4. They're derived out of the ext2_find_next_zero_bit found in the same file. Compile tested with crosstools [Reworked to preserve all symmetry with ext2_find_{first,next}_zero_bit()] This fixes http://bugzilla.kernel.org/show_bug.cgi?id=10393 Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-06-16IB/umem: Avoid sign problems when demoting npages to integerRoland Dreier
commit 8079ffa0e18baaf2940e52e0c118eef420a473a4 upstream On a 64-bit architecture, if ib_umem_get() is called with a size value that is so big that npages is negative when cast to int, then the length of the page list passed to get_user_pages(), namely min_t(int, npages, PAGE_SIZE / sizeof (struct page *)) will be negative, and get_user_pages() will immediately return 0 (at least since 900cf086, "Be more robust about bad arguments in get_user_pages()"). This leads to an infinite loop in ib_umem_get(), since the code boils down to: while (npages) { ret = get_user_pages(...); npages -= ret; } Fix this by taking the minimum as unsigned longs, so that the value of npages is never truncated. The impact of this bug isn't too severe, since the value of npages is checked against RLIMIT_MEMLOCK, so a process would need to have an astronomical limit or have CAP_IPC_LOCK to be able to trigger this, and such a process could already cause lots of mischief. But it does let buggy userspace code cause a kernel lock-up; for example I hit this with code that passes a negative value into a memory registartion function where it is promoted to a huge u64 value. Signed-off-by: Roland Dreier <rolandd@cisco.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-06-16tcp: Fix inconsistency source (CA_Open only when !tcp_left_out(tp))Ilpo Järvinen
[ upstream commit: 8aca6cb1179ed9bef9351028c8d8af852903eae2 ] It is possible that this skip path causes TCP to end up into an invalid state where ca_state was left to CA_Open while some segments already came into sacked_out. If next valid ACK doesn't contain new SACK information TCP fails to enter into tcp_fastretrans_alert(). Thus at least high_seq is set incorrectly to a too high seqno because some new data segments could be sent in between (and also, limited transmit is not being correctly invoked there). Reordering in both directions can easily cause this situation to occur. I guess we would want to use tcp_moderate_cwnd(tp) there as well as it may be possible to use this to trigger oversized burst to network by sending an old ACK with huge amount of SACK info, but I'm a bit unsure about its effects (mainly to FlightSize), so to be on the safe side I just currently fixed it minimally to keep TCP's state consistent (obviously, such nasty ACKs have been possible this far). Though it seems that FlightSize is already underestimated by some amount, so probably on the long term we might want to trigger recovery there too, if appropriate, to make FlightSize calculation to resemble reality at the time when the losses where discovered (but such change scares me too much now and requires some more thinking anyway how to do that as it likely involves some code shuffling). This bug was found by Brian Vowell while running my TCP debug patch to find cause of another TCP issue (fackets_out miscount). Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-06-16forcedeth: msi interruptsAyaz Abdulla
commit 4db0ee176e256444695ee2d7b004552e82fec987 upstream Add a workaround for lost MSI interrupts. There is a race condition in the HW in which future interrupts could be missed. The workaround is to toggle the MSI irq mask. Added cleanup based on comments from Andrew Morton. Signed-off-by: Ayaz Abdulla <aabdulla@nvidia.com> Cc: Manfred Spraul <manfred@colorfullife.com> Cc: Jeff Garzik <jeff@garzik.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jeff Garzik <jgarzik@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-06-16hgafb: resource management fixKrzysztof Helt
commit 630c270183133ac25bef8c8d726ac448df9b169a upstream Date: Thu, 12 Jun 2008 15:21:29 -0700 Subject: hgafb: resource management fix Release ports which are requested during detection which are not freed if there is no hga card. Otherwise there is a crash during cat /proc/ioports command. Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-06-16cciss: add new hardware supportMike Miller
commit 24aac480e76c6f5d1391ac05c5e9c0eb9b0cd302 upstream Date: Thu, 12 Jun 2008 15:21:34 -0700 Subject: cciss: add new hardware support Add support for the next generation of HP Smart Array SAS/SATA controllers. Shipping date is late Fall 2008. Bump the driver version to 3.6.20 to reflect the new hardware support from patch 1 of this set. Signed-off-by: Mike Miller <mike.miller@hp.com> Cc: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-06-16mmc: wbsd: initialize tasklets before requesting interruptChuck Ebbert
commit cef33400d0349fb24b6f8b7dea79b66e3144fd8b upstream With CONFIG_DEBUG_SHIRQ set we will get an interrupt as soon as we allocate one. Tasklets may be scheduled in the interrupt handler but they will be initialized after the handler returns, causing a BUG() in kernel/softirq.c when they run. Should fix this Fedora bug report: https://bugzilla.redhat.com/show_bug.cgi?id=449817 Signed-off-by: Chuck Ebbert <cebbert@redhat.com> Acked-by: Pierre Ossman <drzeus@drzeus.cx> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-06-16tcp FRTO: work-around inorder receiversIlpo Järvinen
[ upstream commit: 79d44516b4b178ffb6e2159c75584cfcfc097914 ] If receiver consumes segments successfully only in-order, FRTO fallback to conventional recovery produces RTO loop because FRTO's forward transmissions will always get dropped and need to be resent, yet by default they're not marked as lost (which are the only segments we will retransmit in CA_Loss). Price to pay about this is occassionally unnecessarily retransmitting the forward transmission(s). SACK blocks help a bit to avoid this, so it's mainly a concern for NewReno case though SACK is not fully immune either. This change has a side-effect of fixing SACKFRTO problem where it didn't have snd_nxt of the RTO time available anymore when fallback become necessary (this problem would have only occured when RTO would occur for two or more segments and ECE arrives in step 3; no need to figure out how to fix that unless the TODO item of selective behavior is considered in future). Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Reported-by: Damon L. Chesser <damon@damtek.com> Tested-by: Damon L. Chesser <damon@damtek.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-16tcp FRTO: SACK variant is errorneously used with NewRenoIlpo Järvinen
[ upstream commit: 62ab22278308a40bcb7f4079e9719ab8b7fe11b5 ] Note: there's actually another bug in FRTO's SACK variant, which is the causing failure in NewReno case because of the error that's fixed here. I'll fix the SACK case separately (it's a separate bug really, though related, but in order to fix that I need to audit tp->snd_nxt usage a bit). There were two places where SACK variant of FRTO is getting incorrectly used even if SACK wasn't negotiated by the TCP flow. This leads to incorrect setting of frto_highmark with NewReno if a previous recovery was interrupted by another RTO. An eventual fallback to conventional recovery then incorrectly considers one or couple of segments as forward transmissions though they weren't, which then are not LOST marked during fallback making them "non-retransmittable" until the next RTO. In a bad case, those segments are really lost and are the only one left in the window. Thus TCP needs another RTO to continue. The next FRTO, however, could again repeat the same events making the progress of the TCP flow extremely slow. In order for these events to occur at all, FRTO must occur again in FRTOs step 3 while the key segments must be lost as well, which is not too likely in practice. It seems to most frequently with some small devices such as network printers that *seem* to accept TCP segments only in-order. In cases were key segments weren't lost, things get automatically resolved because those wrongly marked segments don't need to be retransmitted in order to continue. I found a reproducer after digging up relevant reports (few reports in total, none at netdev or lkml I know of), some cases seemed to indicate middlebox issues which seems now to be a false assumption some people had made. Bugzilla #10063 _might_ be related. Damon L. Chesser <damon@damtek.com> had a reproducable case and was kind enough to tcpdump it for me. With the tcpdump log it was quite trivial to figure out. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-16tcp FRTO: Fix fallback to conventional recoveryIlpo Järvinen
[ upstream commit: a1c1f281b84a751fdb5ff919da3b09df7297619f ] It seems that commit 009a2e3e4ec ("[TCP] FRTO: Improve interoperability with other undo_marker users") run into another land-mine which caused fallback to conventional recovery to break: 1. Cumulative ACK arrives after FRTO retransmission 2. tcp_try_to_open sees zero retrans_out, clears retrans_stamp which should be kept like in CA_Loss state it would be 3. undo_marker change allowed tcp_packet_delayed to return true because of the cleared retrans_stamp once FRTO is terminated causing LossUndo to occur, which means all loss markings FRTO made are reverted. This means that the conventional recovery basically recovered one loss per RTT, which is not that efficient. It was quite unobvious that the undo_marker change broken something like this, I had a quite long session to track it down because of the non-intuitiviness of the bug (luckily I had a trivial reproducer at hand and I was also able to learn to use kprobes in the process as well :-)). This together with the NewReno+FRTO fix and FRTO in-order workaround this fixes Damon's problems, this and the first mentioned are enough to fix Bugzilla #10063. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Reported-by: Damon L. Chesser <damon@damtek.com> Tested-by: Damon L. Chesser <damon@damtek.com> Tested-by: Sebastian Hyrwall <zibbe@cisko.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-16tcp: fix skb vs fack_count out-of-sync conditionIlpo Järvinen
[ upstream commit: a6604471db5e7a33474a7f16c64d6b118fae3e74 ] This bug is able to corrupt fackets_out in very rare cases. In order for this to cause corruption: 1) DSACK in the middle of previous SACK block must be generated. 2) In order to take that particular branch, part or all of the DSACKed segment must already be SACKed so that we have that in cache in the first place. 3) The new info must be top enough so that fackets_out will be updated on this iteration. ...then fack_count is updated while skb wasn't, then we walk again that particular segment thus updating fack_count twice for a single skb and finally that value is assigned to fackets_out by tcp_sacktag_one. It is safe to call tcp_sacktag_one just once for a segment (at DSACK), no need to call again for plain SACK. Potential problem of the miscount are limited to premature entry to recovery and to inflated reordering metric (which could even cancel each other out in the most the luckiest scenarios :-)). Both are quite insignificant in worst case too and there exists also code to reset them (fackets_out once sacked_out becomes zero and reordering metric on RTO). This has been reported by a number of people, because it occurred quite rarely, it has been very evasive. Andy Furniss was able to get it to occur couple of times so that a bit more info was collected about the problem using a debug patch, though it still required lot of checking around. Thanks also to others who have tried to help here. This is listed as Bugzilla #10346. The bug was introduced by me in commit 68f8353b48 ([TCP]: Rewrite SACK block processing & sack_recv_cache use), I probably thought back then that there's need to scan that entry twice or didn't dare to make it go through it just once there. Going through twice would have required restoring fack_count after the walk but as noted above, I chose to drop the additional walk step altogether here. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-16l2tp: Fix possible oops if transmitting or receiving when tunnel goes downJames Chapman
[ upstream commit: 24b95685ffcdb3dc28f64b9e8af6ea3e8360fbc5 ] Some problems have been experienced in the field which cause an oops in the pppol2tp driver if L2TP tunnels fail while passing data. The pppol2tp driver uses private data that is referenced via the sk->sk_user_data of its UDP and PPPoL2TP sockets. This patch makes sure that the driver uses sock_hold() when it holds a reference to the sk pointer. This affects its sendmsg(), recvmsg(), getname(), [gs]etsockopt() and ioctl() handlers. Tested by ISP where problem was seen. System has been up 10 days with no oops since running this patch. Without the patch, an oops would occur every 1-2 days. Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-16l2tp: Fix possible WARN_ON from socket code when UDP socket is closedJames Chapman
[ upstream commit: 199f7d24ae59894243687a234a909f44a8724506 ] If an L2TP daemon closes a tunnel socket while packets are queued in the tunnel's reorder queue, a kernel warning is logged because the socket is closed while skbs are still referencing it. The fix is to purge the queue in the socket's release handler. WARNING: at include/net/sock.h:351 udp_lib_unhash+0x41/0x68() Pid: 12998, comm: openl2tpd Not tainted 2.6.25 #8 [<c0423c58>] warn_on_slowpath+0x41/0x51 [<c05d33a7>] udp_lib_unhash+0x41/0x68 [<c059424d>] sk_common_release+0x23/0x90 [<c05d16be>] udp_lib_close+0x8/0xa [<c05d8684>] inet_release+0x42/0x48 [<c0592599>] sock_release+0x14/0x60 [<c059299f>] sock_close+0x29/0x30 [<c046ef52>] __fput+0xad/0x15b [<c046f1d9>] fput+0x17/0x19 [<c046c8c4>] filp_close+0x50/0x5a [<c046da06>] sys_close+0x69/0x9f [<c04048ce>] syscall_call+0x7/0xb Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-16tcp: Limit cwnd growth when deferring for GSOJohn Heffner
[ upstream commit: 246eb2af060fc32650f07203c02bdc0456ad76c7 ] This fixes inappropriately large cwnd growth on sender-limited flows when GSO is enabled, limiting cwnd growth to 64k. [ Backport to 2.6.25 by replacing sk->sk_gso_max_size with 65536 -DaveM ] Signed-off-by: John Heffner <johnwheffner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-16tcp: Allow send-limited cwnd to grow up to max_burst when gso disabledJohn Heffner
[ upstream commit: ce447eb91409225f8a488f6b7b2a1bdf7b2d884f ] This changes the logic in tcp_is_cwnd_limited() so that cwnd may grow up to tcp_max_burst() even when sk_can_gso() is false, or when sysctl_tcp_tso_win_divisor != 0. Signed-off-by: John Heffner <johnwheffner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-16tcp: TCP connection times out if ICMP frag needed is delayedSridhar Samudrala
[ upstream commit: 7d227cd235c809c36c847d6a597956ad9e9d2bae ] We are seeing an issue with TCP in handling an ICMP frag needed message that is received after net.ipv4.tcp_retries1 retransmits. The default value of retries1 is 3. So if the path mtu changes and ICMP frag needed is lost for the first 3 retransmits or if it gets delayed until 3 retransmits are done, TCP doesn't update MSS correctly and continues to retransmit the orginal message until it timesout after tcp_retries2 retransmits. I am seeing this issue even with the latest 2.6.25.4 kernel. In tcp_retransmit_timer(), when retransmits counter exceeds tcp_retries1 value, the dst cache entry of the socket is reset. At this time, if we receive an ICMP frag needed message, the dst entry gets updated with the new MTU, but the TCP sockets dst_cache entry remains NULL. So the next time when we try to retransmit after the ICMP frag needed is received, tcp_retransmit_skb() gets called. Here the cur_mss value is calculated at the start of the routine with a NULL sk_dst_cache. Instead we should call tcp_current_mss after the rebuild_header that caches the dst entry with the updated mtu. Also the rebuild_header should be called before tcp_fragment so that skb is fragmented if the mss goes down. Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-16vlan: Correctly handle device notifications for layered VLAN devicesPatrick McHardy
[ upstream commit: 81d85346b3fcd8b3167eac8b5fb415a210bd4345 ] Commit 30688a9 ([VLAN]: Handle vlan devices net namespace changing) changed the device notifier to special-case notifications for VLAN devices, effectively disabling state propagation to underlying VLAN devices. This is needed for layered VLANs though, so restore the original behaviour. Signed-off-by: Patrick McHardy <kaber@trash.net> Acked-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-16l2tp: avoid skb truesize bug if headroom is increasedJames Chapman
[ upstream commit: 090c48d3dd5ea90b37350334aaed9a93b0c1e0a1 ] A user reported seeing occasional bugs such as the following when using the L2TP driver. SKB BUG: Invalid truesize (272) len=72, sizeof(sk_buff)=208 When L2TP adds its header in the transmit path, it might need to increase the headroom of the skb. In some cases, the increased headroom trips a kernel bug when the skb is freed because the skb has grown beyond its truesize value. The fix is to increase the truesize by the amount of headroom added, after orphaning the skb. While here, fix a misleading comment. Thanks to Iouri Kharon <bc-info@styx.cabel.net> for the initial report and testing the fix. Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-16netlink: Fix nla_parse_nested_compat() to call nla_parse() directlyThomas Graf
[ upstream commit: b9a2f2e450b0f770bb4347ae8d48eb2dea701e24 ] The purpose of nla_parse_nested_compat() is to parse attributes which contain a struct followed by a stream of nested attributes. So far, it called nla_parse_nested() to parse the stream of nested attributes which was wrong, as nla_parse_nested() expects a container attribute as data which holds the attribute stream. It needs to call nla_parse() directly while pointing at the next possible alignment point after the struct in the beginning of the attribute. With this patch, I can no longer reproduce the reported leftover warnings. Signed-off-by: Thomas Graf <tgraf@suug.ch> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-16ipsec: Use the correct ip_local_out functionHerbert Xu
[ upstream commit: 1ac06e0306d0192a7a4d9ea1c9e06d355ce7e7d3 ] Because the IPsec output function xfrm_output_resume does its own dst_output call it should always call __ip_local_output instead of ip_local_output as the latter may invoke dst_output directly. Otherwise the return values from nf_hook and dst_output may clash as they both use the value 1 but for different purposes. When that clash occurs this can cause a packet to be used after it has been freed which usually leads to a crash. Because the offending value is only returned from dst_output with qdiscs such as HTB, this bug is normally not visible. Thanks to Marco Berizzi for his perseverance in tracking this down. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-16net_sched: cls_api: fix return value for non-existant classifiersPatrick McHardy
[ upstream commit: f2df824948d559ea818e03486a8583e42ea6ab37 ] cls_api should return ENOENT when the requested classifier doesn't exist. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-16net: Fix call to ->change_rx_flags(dev, IFF_MULTICAST) in dev_change_flags()David Woodhouse
[ upstream commit: 0e91796eb46e29edc791131c832a2232bcaed9dd ] Am I just being particularly dim today, or can the call to dev->change_rx_flags(dev, IFF_MULTICAST) in dev_change_flags() never happen? We've just set dev->flags = flags & IFF_MULTICAST, effectively. So the condition '(dev->flags ^ flags) & IFF_MULTICAST' is _never_ going to be true. Signed-off-by: David Woodhouse <dwmw2@infradead.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-16cassini: Only use chip checksum for ipv4 packets.David S. Miller
[ upstream commit: b1443e2f6501f06930a162ff1ff08382a98bf23e ] According to David Monro, at least with Natsemi Saturn chips the cassini driver has some trouble with ipv6 checksums. Until we have more information about what's going on here, only use the chip checksums for ipv4. This workaround was suggested and tested by David. Update version and release date. Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-16can: Fix copy_from_user() results interpretationSam Ravnborg
[ Upstream commit: 3f91bd420a955803421f2db17b2e04aacfbb2bb8 ] Both copy_to_ and _from_user return the number of bytes, that failed to reach their destination, not the 0/-EXXX values. Based on patch from Pavel Emelyanov <xemul@openvz.org> Signed-off-by: Sam Ravnborg <sam@ravnborg.org> Acked-by: Oliver Hartkopp <oliver.hartkopp@volkswagen.de> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-16bluetooth: rfcomm_dev_state_change deadlock fixDave Young
commit 537d59af73d894750cff14f90fe2b6d77fbab15b in mainline There's logic in __rfcomm_dlc_close: rfcomm_dlc_lock(d); d->state = BT_CLOSED; d->state_changed(d, err); rfcomm_dlc_unlock(d); In rfcomm_dev_state_change, it's possible that rfcomm_dev_put try to take the dlc lock, then we will deadlock. Here fixed it by unlock dlc before rfcomm_dev_get in rfcomm_dev_state_change. why not unlock just before rfcomm_dev_put? it's because there's another problem. rfcomm_dev_get/rfcomm_dev_del will take rfcomm_dev_lock, but in rfcomm_dev_add the lock order is : rfcomm_dev_lock --> dlc lock so I unlock dlc before the taken of rfcomm_dev_lock. Actually it's a regression caused by commit 1905f6c736cb618e07eca0c96e60e3c024023428 ("bluetooth : __rfcomm_dlc_close lock fix"), the dlc state_change could be two callbacks : rfcomm_sk_state_change and rfcomm_dev_state_change. I missed the rfcomm_sk_state_change that time. Thanks Arjan van de Ven <arjan@linux.intel.com> for the effort in commit 4c8411f8c115def968820a4df6658ccfd55d7f1a ("bluetooth: fix locking bug in the rfcomm socket cleanup handling") but he missed the rfcomm_dev_state_change lock issue. Signed-off-by: Dave Young <hidave.darkstar@gmail.com> Acked-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-06-16bluetooth: fix locking bug in the rfcomm socket cleanup handlingArjan van de Ven
[ Upstream commit: 7dccf1f4e1696c79bff064c3770867cc53cbc71c ] in net/bluetooth/rfcomm/sock.c, rfcomm_sk_state_change() does the following operation: if (parent && sock_flag(sk, SOCK_ZAPPED)) { /* We have to drop DLC lock here, otherwise * rfcomm_sock_destruct() will dead lock. */ rfcomm_dlc_unlock(d); rfcomm_sock_kill(sk); rfcomm_dlc_lock(d); } } which is fine, since rfcomm_sock_kill() will call sk_free() which will call rfcomm_sock_destruct() which takes the rfcomm_dlc_lock()... so far so good. HOWEVER, this assumes that the rfcomm_sk_state_change() function always gets called with the rfcomm_dlc_lock() taken. This is the case for all but one case, and in that case where we don't have the lock, we do a double unlock followed by an attempt to take the lock, which due to underflow isn't going anywhere fast. This patch fixes this by moving the stragling case inside the lock, like the other usages of the same call are doing in this code. This was found with the help of the www.kerneloops.org project, where this deadlock was observed 51 times at this point in time: http://www.kerneloops.org/search.php?search=rfcomm_sock_destruct Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Acked-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-16ax25: Fix NULL pointer dereference and lockup.Jarek Poplawski
[ Upstream commit: 7dccf1f4e1696c79bff064c3770867cc53cbc71c ] There is only one function in AX25 calling skb_append(), and it really looks suspicious: appends skb after previously enqueued one, but in the meantime this previous skb could be removed from the queue. This patch Fixes it the simple way, so this is not fully compatible with the current method, but testing hasn't shown any problems. Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-16af_key: Fix selector family initialization.Kazunori MIYAZAWA
[ upstream commit: 4da5105687e0993a3bbdcffd89b2b94d9377faab ] This propagates the xfrm_user fix made in commit bcf0dda8d2408fe1c1040cdec5a98e5fcad2ac72 ("[XFRM]: xfrm_user: fix selector family initialization") Based upon a bug report from, and tested by, Alan Swanson. Signed-off-by: Kazunori MIYAZAWA <kazunori@miyazawa.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-16sunhv: Fix locking in non-paged I/O case.David S. Miller
[ upstream commit: 3651751fff44ede58f65cbb1e39242139ead251b ] This causes the lock to be taken twice, thus resulting in a deadlock. Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-16fbdev: export symbol fb_mode_optionGeoff Levand
upstream commit: 659179b28f15ab1b1db5f8767090f5e728f115a1 Frame buffer and mode setting drivers can be built as modules, so fb_mode_option needs to be exported to support these. Prevents this error: ERROR: "fb_mode_option" [drivers/ps3/ps3av_mod.ko] undefined! Signed-off-by: Geoff Levand <geoffrey.levand@am.sony.com> Acked-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com> Cc: Krzysztof Helt <krzysztof.h1@poczta.fm> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-16ecryptfs: fix missed mutex_unlockCyrill Gorcunov
upstream commit: 71fd5179e8d1d4d503b517e0c5374f7c49540bfc Cc: Michael Halcrow <mhalcrow@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-16ecryptfs: clean up (un)lock_parentMiklos Szeredi
upstream commit: 8dc4e37362a5dc910d704d52ac6542bfd49ddc2f dget(dentry->d_parent) --> dget_parent(dentry) unlock_parent() is racy and unnecessary. Replace single caller with unlock_dir(). There are several other suspect uses of ->d_parent in ecryptfs... Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: Michael Halcrow <mhalcrow@us.ibm.com> Cc: Christoph Hellwig <hch@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-16eCryptfs: protect crypt_stat->flags in ecryptfs_open()Michael Halcrow
upstream commit: 2f9b12a31fcb738ea8c9eb0d4ddf906c6f1d696c Make sure crypt_stat->flags is protected with a lock in ecryptfs_open(). Signed-off-by: Michael Halcrow <mhalcrow@us.ibm.com> Cc: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-16ecryptfs: add missing lock around notify_changeMiklos Szeredi
upstream commit: 9c3580aa52195699065bc2d7242b1c7e3e6903fa Callers of notify_change() need to hold i_mutex. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: Michael Halcrow <mhalcrow@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-16sound: emu10k1 - fix system hang with Audigy2 ZS Notebook PCMCIA cardJaroslav Franek
upstream commit: 868e15dbd2940f9453b4399117686f408dc77299 When the Linux kernel is compiled with CONFIG_DEBUG_SHIRQ=y, the Soundblaster Audigy2 ZS Notebook PCMCIA card causes the system hang during boot (udev stage) or when the card is hot-plug. The CONFIG_DEBUG_SHIRQ flag is by default 'y' with all Fedora kernels since 2.6.23. The problem was reported as https://bugzilla.redhat.com/show_bug.cgi?id=326411 The issue was hunted down to the snd_emu10k1_create() routine: /* pseudo-code */ snd_emu10k1_create(...) { ... request_irq(... IRQF_SHARED ...) { register the irq handler #ifdef CONFIG_DEBUG_SHIRQ call the irq handler: snd_emu10k1_interrupt() { poll I/O port // <---- !! system hangs ... } #endif } ... snd_emu10k1_cardbus_init(...) { initialize I/O ports } ... } The early access to I/O port in the interrupt handler causes the freeze. Obviously it is necessary to init the I/O ports before accessing them. This patch moves the registration of the irq handler after the initialization of the I/O ports. Signed-off-by: Jaroslav Franek <jarin.franek@post.cz> Acked-by: James Courtier-Dutton <James@superbug.co.uk> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-16ALSA: hda - Fix resume of auto-config mode with Realtek codecsTakashi Iwai
upstream commit: 07bc76dfa19b10017b518dd9aa1b2719e8c863de The auto-config mode of Realtek ALC codecs has a bug since 2.6.25 that it cannot resume properly. The problem was the wrong assignment of init_hook that overrides the whole initialization. Relevant bug reports: http://bugzilla.kernel.org/show_bug.cgi?id=10662 https://bugzilla.novell.com/show_bug.cgi?id=385473 Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-16double-free of inode on alloc_file() failure exit in create_write_pipe()Al Viro
upstream commit: ed1524371716466e9c762808b02601d0d0276a92 Duh... Fortunately, the bug is quite recent (post-2.6.25) and, embarrassingly, mine ;-/ http://bugzilla.kernel.org/show_bug.cgi?id=10878 Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-16ssb: Fix context assertion in ssb_pcicore_dev_irqvecs_enableMichael Buesch
upstream commit: a3bafeedfff2ac5fa0a316bea4570e27900b6fcc This fixes a context assertion in ssb that makes b44 print out warnings on resume. This fixes the following kernel oops: http://www.kerneloops.org/oops.php?number=12732 http://www.kerneloops.org/oops.php?number=11410 Signed-off-by: Michael Buesch <mb@bu3sch.de> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-16Add 'rd' alias to new brd ramdisk driverNick Piggin
upstream commit: efedf51c866130945b5db755cb58670e60205d83 Alias brd to rd in the hope of helping legacy users. Suggested by Jan. Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-16ipwireless: Fix blocked sendingDavid Sterba
upstream commit: eb4e545d4ac82d9018487edb4419b33b9930c857 Packet sending is driven by two flags, tx_ready and tx_queued. It was possible, that there were queued data for sending and hardware was flagged as blocked but in fact it was not. The tx_queued was indicator but should be really a counter else first fragmented packet resets tx_queued flag, but there may be pending packets which do not get sent. New semantics: tx_ready - set, if hw is ready to send packet, no packet is being transferred right now set the flag right at the place where data are copied into hw memory and not earlier without checking if it was succesful tx_queued - count of enqueued packets, including fragments Tested-by: Michal Rokos <michal.rokos@gmail.com> Signed-off-by: David Sterba <dsterba@suse.cz> Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-16b43: Fix controller restart crashMichael Buesch
upstream commit: 3bf0a32e22fedc0b46443699db2d61ac2a883ac4 This fixes a kernel crash on rmmod, in the case where the controller was restarted before doing the rmmod. Signed-off-by: Michael Buesch <mb@bu3sch.de> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-09Linux 2.6.25.6v2.6.25.6Chris Wright
2008-06-09x86: fix bad pmd ffff810000207xxx(9090909090909090)Hugh Dickins
upstream commit: 2884f110d5409714f3a04eeb6d2ecd77da66b242 OGAWA Hirofumi and Fede have reported rare pmd_ERROR messages: mm/memory.c:127: bad pmd ffff810000207xxx(9090909090909090). Initialization's cleanup_highmap was leaving alignment filler behind in the pmd for MODULES_VADDR: when vmalloc's guard page would occupy a new page table, it's not allocated, and then module unload's vfree hits the bad 9090 pmd entry left over. Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Hugh notes: It's actually not a serious problem, but it does look as if it's a serious problem, so we should stamp it out. Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-09cpufreq: fix null object access on Transmeta CPUCHIKAMA masaki
upstream commit: 879000f94442860e72c934f9e568989bc7fb8ec4 If cpu specific cpufreq driver(i.e. longrun) has "setpolicy" function, governor object isn't set into cpufreq_policy object at "__cpufreq_set_policy" function in driver/cpufreq/cpufreq.c . This causes a null object access at "store_scaling_setspeed" and "show_scaling_setspeed" function in driver/cpufreq/cpufreq.c when reading or writing through /sys interface (ex. cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_setspeed) Addresses: http://bugzilla.kernel.org/show_bug.cgi?id=10654 https://bugzilla.redhat.com/show_bug.cgi?id=443354 Signed-off-by: CHIKAMA Masaki <masaki.chikama@gmail.com> Cc: Dave Jones <davej@codemonkey.org.uk> Cc: Chuck Ebbert <cebbert@redhat.com> Acked-by: Dominik Brodowski <linux@dominikbrodowski.net> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-09capabilities: remain source compatible with 32-bit raw legacy capability ↵Andrew G. Morgan
support. upstream commit: ca05a99a54db1db5bca72eccb5866d2a86f8517f Source code out there hard-codes a notion of what the _LINUX_CAPABILITY_VERSION #define means in terms of the semantics of the raw capability system calls capget() and capset(). Its unfortunate, but true. Since the confusing header file has been in a released kernel, there is software that is erroneously using 64-bit capabilities with the semantics of 32-bit compatibilities. These recently compiled programs may suffer corruption of their memory when sys_getcap() overwrites more memory than they are coded to expect, and the raising of added capabilities when using sys_capset(). As such, this patch does a number of things to clean up the situation for all. It 1. forces the _LINUX_CAPABILITY_VERSION define to always retain its legacy value. 2. adopts a new #define strategy for the kernel's internal implementation of the preferred magic. 3. deprecates v2 capability magic in favor of a new (v3) magic number. The functionality of v3 is entirely equivalent to v2, the only difference being that the v2 magic causes the kernel to log a "deprecated" warning so the admin can find applications that may be using v2 inappropriately. [User space code continues to be encouraged to use the libcap API which protects the application from details like this. libcap-2.10 is the first to support v3 capabilities.] Fixes issue reported in https://bugzilla.redhat.com/show_bug.cgi?id=447518. Thanks to Bojan Smojver for the report. [akpm@linux-foundation.org: s/depreciate/deprecate/g] [akpm@linux-foundation.org: be robust about put_user size] [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Andrew G. Morgan <morgan@kernel.org> Cc: Serge E. Hallyn <serue@us.ibm.com> Cc: Bojan Smojver <bojan@rexursive.com> Cc: stable@kernel.org Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-09md: fix prexor vs sync_request raceDan Williams
upstream commit: e0a115e5aa554b93150a8dc1c3fe15467708abb2 During the initial array synchronization process there is a window between when a prexor operation is scheduled to a specific stripe and when it completes for a sync_request to be scheduled to the same stripe. When this happens the prexor completes and the stripe is unconditionally marked "insync", effectively canceling the sync_request for the stripe. Prior to 2.6.23 this was not a problem because the prexor operation was done under sh->lock. The effect in older kernels being that the prexor would still erroneously mark the stripe "insync", but sync_request would be held off and re-mark the stripe as "!in_sync". Change the write completion logic to not mark the stripe "in_sync" if a prexor was performed. The effect of the change is to sometimes not set STRIPE_INSYNC. The worst this can do is cause the resync to stall waiting for STRIPE_INSYNC to be set. If this were happening, then STRIPE_SYNCING would be set and handle_issuing_new_read_requests would cause all available blocks to eventually be read, at which point prexor would never be used on that stripe any more and STRIPE_INSYNC would eventually be set. echo repair > /sys/block/mdN/md/sync_action will correct arrays that may have lost this race. Cc: <stable@kernel.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> [chrisw: backport to 2.6.25.5] Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-09md: fix uninitialized use of mddev->recovery_waitDan Williams
upstream commit: a6d8113a986c66aeb379a26b6e0062488b3e59e1 If an array was created with --assume-clean we will oops when trying to set ->resync_max. Fix this by initializing ->recovery_wait in mddev_find. Cc: <stable@kernel.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-09md: do not compute parity unless it is on a failed driveDan Williams
upstream commit: c337869d95011495fa181536786e74aa2d7ff031 If a block is computed (rather than read) then a check/repair operation may be lead to believe that the data on disk is correct, when infact it isn't. So only compute blocks for failed devices. This issue has been around since at least 2.6.12, but has become harder to hit in recent kernels since most reads bypass the cache. echo repair > /sys/block/mdN/md/sync_action will set the parity blocks to the correct state. Cc: <stable@kernel.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-09eCryptfs: remove unnecessary page decrypt callMichael Halcrow
upstream commit: d3e49afbb66109613c3474f2273f5830ac2dcb09 The page decrypt calls in ecryptfs_write() are both pointless and buggy. Pointless because ecryptfs_get_locked_page() has already brought the page up to date, and buggy because prior mmap writes will just be blown away by the decrypt call. This patch also removes the declaration of a now-nonexistent function ecryptfs_write_zeros(). Thanks to Eric Sandeen and David Kleikamp for helping to track this down. Eric said: fsx w/ mmap dies quickly ( < 100 ops) without this, and survives nicely (to millions of ops+) with it in place. Signed-off-by: Michael Halcrow <mhalcrow@us.ibm.com> Cc: Eric Sandeen <sandeen@redhat.com> Cc: Dave Kleikamp <shaggy@austin.ibm.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> [chrisw: backport to 2.6.25.5] Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-09brk: make sys_brk() honor COMPAT_BRK when computing lower boundJiri Kosina
upstream commit: a5b4592cf77b973c29e7c9695873a26052b58951 Fix a regression introduced by commit 4cc6028d4040f95cdb590a87db478b42b8be0508 Author: Jiri Kosina <jkosina@suse.cz> Date: Wed Feb 6 22:39:44 2008 +0100 brk: check the lower bound properly The check in sys_brk() on minimum value the brk might have must take CONFIG_COMPAT_BRK setting into account. When this option is turned on (i.e. we support ancient legacy binaries, e.g. libc5-linked stuff), the lower bound on brk value is mm->end_code, otherwise the brk start is allowed to be arbitrarily shifted. Signed-off-by: Jiri Kosina <jkosina@suse.cz> Tested-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>