aboutsummaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)Author
2012-02-03tcp: md5: using remote adress for md5 lookup in rst packetshawnlu
[ Upstream commit 8a622e71f58ec9f092fc99eacae0e6cf14f6e742 ] md5 key is added in socket through remote address. remote address should be used in finding md5 key when sending out reset packet. Signed-off-by: shawnlu <shawn.lu@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-02-03tcp: fix tcp_trim_head() to adjust segment count with skb MSSNeal Cardwell
[ Upstream commit 5b35e1e6e9ca651e6b291c96d1106043c9af314a ] This commit fixes tcp_trim_head() to recalculate the number of segments in the skb with the skb's existing MSS, so trimming the head causes the skb segment count to be monotonically non-increasing - it should stay the same or go down, but not increase. Previously tcp_trim_head() used the current MSS of the connection. But if there was a decrease in MSS between original transmission and ACK (e.g. due to PMTUD), this could cause tcp_trim_head() to counter-intuitively increase the segment count when trimming bytes off the head of an skb. This violated assumptions in tcp_tso_acked() that tcp_trim_head() only decreases the packet count, so that packets_acked in tcp_tso_acked() could underflow, leading tcp_clean_rtx_queue() to pass u32 pkts_acked values as large as 0xffffffff to ca_ops->pkts_acked(). As an aside, if tcp_trim_head() had really wanted the skb to reflect the current MSS, it should have called tcp_set_skb_tso_segs() unconditionally, since a decrease in MSS would mean that a single-packet skb should now be sliced into multiple segments. Signed-off-by: Neal Cardwell <ncardwell@google.com> Acked-by: Nandita Dukkipati <nanditad@google.com> Acked-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-02-03rds: Make rds_sock_lock BH rather than IRQ safe.David S. Miller
[ Upstream commit efc3dbc37412c027e363736b4f4c74ee5e8ecffc ] rds_sock_info() triggers locking warnings because we try to perform a local_bh_enable() (via sock_i_ino()) while hardware interrupts are disabled (via taking rds_sock_lock). There is no reason for rds_sock_lock to be a hardware IRQ disabling lock, none of these access paths run in hardware interrupt context. Therefore making it a BH disabling lock is safe and sufficient to fix this bug. Reported-by: Kumar Sanghvi <kumaras@chelsio.com> Reported-by: Josh Boyer <jwboyer@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-02-03net: reintroduce missing rcu_assign_pointer() callsEric Dumazet
[ Upstream commit cf778b00e96df6d64f8e21b8395d1f8a859ecdc7 ] commit a9b3cd7f32 (rcu: convert uses of rcu_assign_pointer(x, NULL) to RCU_INIT_POINTER) did a lot of incorrect changes, since it did a complete conversion of rcu_assign_pointer(x, y) to RCU_INIT_POINTER(x, y). We miss needed barriers, even on x86, when y is not NULL. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Stephen Hemminger <shemminger@vyatta.com> CC: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-02-03l2tp: l2tp_ip - fix possible oops on packet receiveJames Chapman
[ Upstream commit 68315801dbf3ab2001679fd2074c9dc5dcf87dfa ] When a packet is received on an L2TP IP socket (L2TPv3 IP link encapsulation), the l2tpip socket's backlog_rcv function calls xfrm4_policy_check(). This is not necessary, since it was called before the skb was added to the backlog. With CONFIG_NET_NS enabled, xfrm4_policy_check() will oops if skb->dev is null, so this trivial patch removes the call. This bug has always been present, but only when CONFIG_NET_NS is enabled does it cause problems. Most users are probably using UDP encapsulation for L2TP, hence the problem has only recently surfaced. EIP: 0060:[<c12bb62b>] EFLAGS: 00210246 CPU: 0 EIP is at l2tp_ip_recvmsg+0xd4/0x2a7 EAX: 00000001 EBX: d77b5180 ECX: 00000000 EDX: 00200246 ESI: 00000000 EDI: d63cbd30 EBP: d63cbd18 ESP: d63cbcf4 DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 Call Trace: [<c1218568>] sock_common_recvmsg+0x31/0x46 [<c1215c92>] __sock_recvmsg_nosec+0x45/0x4d [<c12163a1>] __sock_recvmsg+0x31/0x3b [<c1216828>] sock_recvmsg+0x96/0xab [<c10b2693>] ? might_fault+0x47/0x81 [<c10b2693>] ? might_fault+0x47/0x81 [<c1167fd0>] ? _copy_from_user+0x31/0x115 [<c121e8c8>] ? copy_from_user+0x8/0xa [<c121ebd6>] ? verify_iovec+0x3e/0x78 [<c1216604>] __sys_recvmsg+0x10a/0x1aa [<c1216792>] ? sock_recvmsg+0x0/0xab [<c105a99b>] ? __lock_acquire+0xbdf/0xbee [<c12d5a99>] ? do_page_fault+0x193/0x375 [<c10d1200>] ? fcheck_files+0x9b/0xca [<c10d1259>] ? fget_light+0x2a/0x9c [<c1216bbb>] sys_recvmsg+0x2b/0x43 [<c1218145>] sys_socketcall+0x16d/0x1a5 [<c11679f0>] ? trace_hardirqs_on_thunk+0xc/0x10 [<c100305f>] sysenter_do_call+0x12/0x38 Code: c6 05 8c ea a8 c1 01 e8 0c d4 d9 ff 85 f6 74 07 3e ff 86 80 00 00 00 b9 17 b6 2b c1 ba 01 00 00 00 b8 78 ed 48 c1 e8 23 f6 d9 ff <ff> 76 0c 68 28 e3 30 c1 68 2d 44 41 c1 e8 89 57 01 00 83 c4 0c Signed-off-by: James Chapman <jchapman@katalix.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-02-03af_unix: fix EPOLLET regression for stream socketsEric Dumazet
[ Upstream commit 6f01fd6e6f6809061b56e78f1e8d143099716d70 ] Commit 0884d7aa24 (AF_UNIX: Fix poll blocking problem when reading from a stream socket) added a regression for epoll() in Edge Triggered mode (EPOLLET) Appropriate fix is to use skb_peek()/skb_unlink() instead of skb_dequeue(), and only call skb_unlink() when skb is fully consumed. This remove the need to requeue a partial skb into sk_receive_queue head and the extra sk->sk_data_ready() calls that added the regression. This is safe because once skb is given to sk_receive_queue, it is not modified by a writer, and readers are serialized by u->readlock mutex. This also reduce number of spinlock acquisition for small reads or MSG_PEEK users so should improve overall performance. Reported-by: Nick Mathewson <nickm@freehaven.net> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Alexey Moiseytsev <himeraster@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-02-03net caif: Register properly as a pernet subsystem.Eric W. Biederman
[ Upstream commit 8a8ee9aff6c3077dd9c2c7a77478e8ed362b96c6 ] caif is a subsystem and as such it needs to register with register_pernet_subsys instead of register_pernet_device. Among other problems using register_pernet_device was resulting in net_generic being called before the caif_net structure was allocated. Which has been causing net_generic to fail with either BUG_ON's or by return NULL pointers. A more ugly problem that could be caused is packets in flight why the subsystem is shutting down. To remove confusion also remove the cruft cause by inappropriately trying to fix this bug. With the aid of the previous patch I have tested this patch and confirmed that using register_pernet_subsys makes the failure go away as it should. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Sjur Brændeland <sjur.brandeland@stericsson.com> Tested-by: Sasha Levin <levinsasha928@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-02-03netns: fix net_alloc_generic()Eric Dumazet
[ Upstream commit 073862ba5d249c20bd5c49fc6d904ff0e1f6a672 ] When a new net namespace is created, we should attach to it a "struct net_generic" with enough slots (even empty), or we can hit the following BUG_ON() : [ 200.752016] kernel BUG at include/net/netns/generic.h:40! ... [ 200.752016] [<ffffffff825c3cea>] ? get_cfcnfg+0x3a/0x180 [ 200.752016] [<ffffffff821cf0b0>] ? lockdep_rtnl_is_held+0x10/0x20 [ 200.752016] [<ffffffff825c41be>] caif_device_notify+0x2e/0x530 [ 200.752016] [<ffffffff810d61b7>] notifier_call_chain+0x67/0x110 [ 200.752016] [<ffffffff810d67c1>] raw_notifier_call_chain+0x11/0x20 [ 200.752016] [<ffffffff821bae82>] call_netdevice_notifiers+0x32/0x60 [ 200.752016] [<ffffffff821c2b26>] register_netdevice+0x196/0x300 [ 200.752016] [<ffffffff821c2ca9>] register_netdev+0x19/0x30 [ 200.752016] [<ffffffff81c1c67a>] loopback_net_init+0x4a/0xa0 [ 200.752016] [<ffffffff821b5e62>] ops_init+0x42/0x180 [ 200.752016] [<ffffffff821b600b>] setup_net+0x6b/0x100 [ 200.752016] [<ffffffff821b6466>] copy_net_ns+0x86/0x110 [ 200.752016] [<ffffffff810d5789>] create_new_namespaces+0xd9/0x190 net_alloc_generic() should take into account the maximum index into the ptr array, as a subsystem might use net_generic() anytime. This also reduces number of reallocations in net_assign_generic() Reported-by: Sasha Levin <levinsasha928@gmail.com> Tested-by: Sasha Levin <levinsasha928@gmail.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Sjur Brændeland <sjur.brandeland@stericsson.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-02-03mac80211: fix work removal on deauth requestJohannes Berg
commit bc4934bc61d0a11fd62c5187ff83645628f8be8b upstream. When deauth is requested while an auth or assoc work item is in progress, we currently delete it without regard for any state it might need to clean up. Fix it by cleaning up for those items. In the case Pontus found, the problem manifested itself as such: authenticate with 00:23:69:aa:dd:7b (try 1) authenticated failed to insert Dummy STA entry for the AP (error -17) deauthenticating from 00:23:69:aa:dd:7b by local choice (reason=2) It could also happen differently if the driver uses the tx_sync callback. We can't just call the ->done() method of the work items because that will lock up due to the locking in cfg80211. This fix isn't very clean, but that seems acceptable since I have patches pending to remove this code completely. Reported-by: Pontus Fuchs <pontus.fuchs@gmail.com> Tested-by: Pontus Fuchs <pontus.fuchs@gmail.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-01-25mac80211: revert on-channel work optimisationsJohannes Berg
commit e76aadc572288a158ae18ae1c10fe395c7bca066 upstream. Backport note: This patch it's a full revert of commit b23b025f "mac80211: Optimize scans on current operating channel.". On upstrem revert e76aadc5 we keep some bits from that commit, which are needed for upstream version of mac80211. The on-channel work optimisations have caused a number of issues, and the code is unfortunately very complex and almost impossible to follow. Instead of attempting to put in more workarounds let's just remove those optimisations, we can work on them again later, after we change the whole auth/assoc design. This should fix rate_control_send_low() warnings, see RH bug 731365. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2012-01-25svcrpc: avoid memory-corruption on pool shutdownJ. Bruce Fields
commit b4f36f88b3ee7cf26bf0be84e6c7fc15f84dcb71 upstream. Socket callbacks use svc_xprt_enqueue() to add an xprt to a pool->sp_sockets list. In normal operation a server thread will later come along and take the xprt off that list. On shutdown, after all the threads have exited, we instead manually walk the sv_tempsocks and sv_permsocks lists to find all the xprt's and delete them. So the sp_sockets lists don't really matter any more. As a result, we've mostly just ignored them and hoped they would go away. Which has gotten us into trouble; witness for example ebc63e531cc6 "svcrpc: fix list-corrupting race on nfsd shutdown", the result of Ben Greear noticing that a still-running svc_xprt_enqueue() could re-add an xprt to an sp_sockets list just before it was deleted. The fix was to remove it from the list at the end of svc_delete_xprt(). But that only made corruption less likely--I can see nothing that prevents a svc_xprt_enqueue() from adding another xprt to the list at the same moment that we're removing this xprt from the list. In fact, despite the earlier xpo_detach(), I don't even see what guarantees that svc_xprt_enqueue() couldn't still be running on this xprt. So, instead, note that svc_xprt_enqueue() essentially does: lock sp_lock if XPT_BUSY unset add to sp_sockets unlock sp_lock So, if we do: set XPT_BUSY on every xprt. Empty every sp_sockets list, under the sp_socks locks. Then we're left knowing that the sp_sockets lists are all empty and will stay that way, since any svc_xprt_enqueue() will check XPT_BUSY under the sp_lock and see it set. And *then* we can continue deleting the xprt's. (Thanks to Jeff Layton for being correctly suspicious of this code....) Cc: Ben Greear <greearb@candelatech.com> Cc: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2012-01-25svcrpc: destroy server sockets all at onceJ. Bruce Fields
commit 2fefb8a09e7ed251ae8996e0c69066e74c5aa560 upstream. There's no reason I can see that we need to call sv_shutdown between closing the two lists of sockets. Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2012-01-25svcrpc: fix double-free on shutdown of nfsd after changing pool modeJ. Bruce Fields
commit 61c8504c428edcebf23b97775a129c5b393a302b upstream. The pool_to and to_pool fields of the global svc_pool_map are freed on shutdown, but are initialized in nfsd startup only in the SVC_POOL_PERCPU and SVC_POOL_PERNODE cases. They *are* initialized to zero on kernel startup. So as long as you use only SVC_POOL_GLOBAL (the default), this will never be a problem. You're also OK if you only ever use SVC_POOL_PERCPU or SVC_POOL_PERNODE. However, the following sequence events leads to a double-free: 1. set SVC_POOL_PERCPU or SVC_POOL_PERNODE 2. start nfsd: both fields are initialized. 3. shutdown nfsd: both fields are freed. 4. set SVC_POOL_GLOBAL 5. start nfsd: the fields are left untouched. 6. shutdown nfsd: now we try to free them again. Step 4 is actually unnecessary, since (for some bizarre reason), nfsd automatically resets the pool mode to SVC_POOL_GLOBAL on shutdown. Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2012-01-25mac80211: fix rx->key NULL pointer dereference in promiscuous modeStanislaw Gruszka
commit 1140afa862842ac3e56678693050760edc4ecde9 upstream. Since: commit 816c04fe7ef01dd9649f5ccfe796474db8708be5 Author: Christian Lamparter <chunkeey@googlemail.com> Date: Sat Apr 30 15:24:30 2011 +0200 mac80211: consolidate MIC failure report handling is possible to that we dereference rx->key == NULL when driver set RX_FLAG_MMIC_STRIPPED and not RX_FLAG_IV_STRIPPED and we are in promiscuous mode. This happen with rt73usb and rt61pci at least. Before the commit we always check rx->key against NULL, so I assume fix should be done in mac80211 (also mic_fail path has similar check). References: https://bugzilla.redhat.com/show_bug.cgi?id=769766 http://rt2x00.serialmonkey.com/pipermail/users_rt2x00.serialmonkey.com/2012-January/004395.html Reported-by: Stuart D Gathman <stuart@gathman.org> Reported-by: Kai Wohlfahrt <kai.scorpio@gmail.com> Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2012-01-25NFSv4: include bitmap in nfsv4 get acl dataAndy Adamson
commit bf118a342f10dafe44b14451a1392c3254629a1f upstream. The NFSv4 bitmap size is unbounded: a server can return an arbitrary sized bitmap in an FATTR4_WORD0_ACL request. Replace using the nfs4_fattr_bitmap_maxsz as a guess to the maximum bitmask returned by a server with the inclusion of the bitmap (xdr length plus bitmasks) and the acl data xdr length to the (cached) acl page data. This is a general solution to commit e5012d1f "NFSv4.1: update nfs4_fattr_bitmap_maxsz" and fixes hitting a BUG_ON in xdr_shrink_bufhead when getting ACLs. Fix a bug in decode_getacl that returned -EINVAL on ACLs > page when getxattr was called with a NULL buffer, preventing ACL > PAGE_SIZE from being retrieved. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2012-01-12igmp: Avoid zero delay when receiving odd mixture of IGMP queriesBen Hutchings
commit a8c1f65c79cbbb2f7da782d4c9d15639a9b94b27 upstream. Commit 5b7c84066733c5dfb0e4016d939757b38de189e4 ('ipv4: correct IGMP behavior on v3 query during v2-compatibility mode') added yet another case for query parsing, which can result in max_delay = 0. Substitute a value of 1, as in the usual v3 case. Reported-by: Simon McVittie <smcv@debian.org> References: http://bugs.debian.org/654876 Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-01-04Merge branch 'master' of ↵John W. Linville
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem
2012-01-03Merge branch 'master' of ↵John W. Linville
git://git.kernel.org/pub/scm/linux/kernel/git/padovan/bluetooth
2012-01-03sch_qfq: fix overflow in qfq_update_start()Eric Dumazet
grp->slot_shift is between 22 and 41, so using 32bit wide variables is probably a typo. This could explain QFQ hangs Dave reported to me, after 2^23 packets ? (23 = 64 - 41) Reported-by: Dave Taht <dave.taht@gmail.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Stephen Hemminger <shemminger@vyatta.com> CC: Dave Taht <dave.taht@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-31netfilter: ctnetlink: fix timeout calculationXi Wang
The sanity check (timeout < 0) never works; the dividend is unsigned and so is the division, which should have been a signed division. long timeout = (ct->timeout.expires - jiffies) / HZ; if (timeout < 0) timeout = 0; This patch converts the time values to signed for the division. Signed-off-by: Xi Wang <xi.wang@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2011-12-31ipvs: try also real server with port 0 in backup serverJulian Anastasov
We should not forget to try for real server with port 0 in the backup server when processing the sync message. We should do it in all cases because the backup server can use different forwarding method. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2011-12-27packet: fix possible dev refcnt leak when bind failWei Yongjun
If bind is fail when bind is called after set PACKET_FANOUT sock option, the dev refcnt will leak. Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-24Merge branch 'nf' of git://1984.lsi.us.es/netDavid S. Miller
2011-12-24netem: dont call vfree() under spinlock and BH disabledEric Dumazet
commit 6373a9a286 (netem: use vmalloc for distribution table) added a regression, since vfree() is called while holding a spinlock and BH being disabled. Fix this by doing the pointers swap in critical section, and freeing after spinlock release. Also add __GFP_NOWARN to the kmalloc() try, since we fallback to vmalloc(). Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-24netfilter: ctnetlink: fix scheduling while atomic if helper is autoloadedPablo Neira Ayuso
This patch fixes one scheduling while atomic error: [ 385.565186] ctnetlink v0.93: registering with nfnetlink. [ 385.565349] BUG: scheduling while atomic: lt-expect_creat/16163/0x00000200 It can be triggered with utils/expect_create included in libnetfilter_conntrack if the FTP helper is not loaded. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2011-12-24netfilter: ctnetlink: fix return value of ctnetlink_get_expect()Pablo Neira Ayuso
This fixes one bogus error that is returned to user-space: libnetfilter_conntrack/utils# ./expect_get TEST: get expectation (-1)(Unknown error 18446744073709551504) This patch includes the correct handling for EAGAIN (nfnetlink uses this error value to restart the operation after module auto-loading). Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2011-12-23Revert "Bluetooth: Increase HCI reset timeout in hci_dev_do_close"Gustavo F. Padovan
This reverts commit e1b6eb3ccb0c2a34302a9fd87dd15d7b86337f23. This was causing a delay of 10 seconds in the resume process of a Thinkpad laptop. I'm afraid this could affect more devices once 3.2 is released. Reported-by: Tomáš Janoušek <tomi@nomi.cz> Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-12-23Merge branch 'nf' of git://1984.lsi.us.es/netDavid S. Miller
2011-12-23netfilter: xt_connbytes: handle negation correctlyFlorian Westphal
"! --connbytes 23:42" should match if the packet/byte count is not in range. As there is no explict "invert match" toggle in the match structure, userspace swaps the from and to arguments (i.e., as if "--connbytes 42:23" were given). However, "what <= 23 && what >= 42" will always be false. Change things so we use "||" in case "from" is larger than "to". This change may look like it breaks backwards compatibility when "to" is 0. However, older iptables binaries will refuse "connbytes 42:0", and current releases treat it to mean "! --connbytes 0:42", so we should be fine. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2011-12-23net: relax rcvbuf limitsEric Dumazet
skb->truesize might be big even for a small packet. Its even bigger after commit 87fb4b7b533 (net: more accurate skb truesize) and big MTU. We should allow queueing at least one packet per receiver, even with a low RCVBUF setting. Reported-by: Michal Simek <monstr@monstr.eu> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-22rps: fix insufficient bounds checking in store_rps_dev_flow_table_cnt()Xi Wang
Setting a large rps_flow_cnt like (1 << 30) on 32-bit platform will cause a kernel oops due to insufficient bounds checking. if (count > 1<<30) { /* Enforce a limit to prevent overflow */ return -EINVAL; } count = roundup_pow_of_two(count); table = vmalloc(RPS_DEV_FLOW_TABLE_SIZE(count)); Note that the macro RPS_DEV_FLOW_TABLE_SIZE(count) is defined as: ... + (count * sizeof(struct rps_dev_flow)) where sizeof(struct rps_dev_flow) is 8. (1 << 30) * 8 will overflow 32 bits. This patch replaces the magic number (1 << 30) with a symbolic bound. Suggested-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Xi Wang <xi.wang@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-22net: introduce DST_NOPEER dst flagEric Dumazet
Chris Boot reported crashes occurring in ipv6_select_ident(). [ 461.457562] RIP: 0010:[<ffffffff812dde61>] [<ffffffff812dde61>] ipv6_select_ident+0x31/0xa7 [ 461.578229] Call Trace: [ 461.580742] <IRQ> [ 461.582870] [<ffffffff812efa7f>] ? udp6_ufo_fragment+0x124/0x1a2 [ 461.589054] [<ffffffff812dbfe0>] ? ipv6_gso_segment+0xc0/0x155 [ 461.595140] [<ffffffff812700c6>] ? skb_gso_segment+0x208/0x28b [ 461.601198] [<ffffffffa03f236b>] ? ipv6_confirm+0x146/0x15e [nf_conntrack_ipv6] [ 461.608786] [<ffffffff81291c4d>] ? nf_iterate+0x41/0x77 [ 461.614227] [<ffffffff81271d64>] ? dev_hard_start_xmit+0x357/0x543 [ 461.620659] [<ffffffff81291cf6>] ? nf_hook_slow+0x73/0x111 [ 461.626440] [<ffffffffa0379745>] ? br_parse_ip_options+0x19a/0x19a [bridge] [ 461.633581] [<ffffffff812722ff>] ? dev_queue_xmit+0x3af/0x459 [ 461.639577] [<ffffffffa03747d2>] ? br_dev_queue_push_xmit+0x72/0x76 [bridge] [ 461.646887] [<ffffffffa03791e3>] ? br_nf_post_routing+0x17d/0x18f [bridge] [ 461.653997] [<ffffffff81291c4d>] ? nf_iterate+0x41/0x77 [ 461.659473] [<ffffffffa0374760>] ? br_flood+0xfa/0xfa [bridge] [ 461.665485] [<ffffffff81291cf6>] ? nf_hook_slow+0x73/0x111 [ 461.671234] [<ffffffffa0374760>] ? br_flood+0xfa/0xfa [bridge] [ 461.677299] [<ffffffffa0379215>] ? nf_bridge_update_protocol+0x20/0x20 [bridge] [ 461.684891] [<ffffffffa03bb0e5>] ? nf_ct_zone+0xa/0x17 [nf_conntrack] [ 461.691520] [<ffffffffa0374760>] ? br_flood+0xfa/0xfa [bridge] [ 461.697572] [<ffffffffa0374812>] ? NF_HOOK.constprop.8+0x3c/0x56 [bridge] [ 461.704616] [<ffffffffa0379031>] ? nf_bridge_push_encap_header+0x1c/0x26 [bridge] [ 461.712329] [<ffffffffa037929f>] ? br_nf_forward_finish+0x8a/0x95 [bridge] [ 461.719490] [<ffffffffa037900a>] ? nf_bridge_pull_encap_header+0x1c/0x27 [bridge] [ 461.727223] [<ffffffffa0379974>] ? br_nf_forward_ip+0x1c0/0x1d4 [bridge] [ 461.734292] [<ffffffff81291c4d>] ? nf_iterate+0x41/0x77 [ 461.739758] [<ffffffffa03748cc>] ? __br_deliver+0xa0/0xa0 [bridge] [ 461.746203] [<ffffffff81291cf6>] ? nf_hook_slow+0x73/0x111 [ 461.751950] [<ffffffffa03748cc>] ? __br_deliver+0xa0/0xa0 [bridge] [ 461.758378] [<ffffffffa037533a>] ? NF_HOOK.constprop.4+0x56/0x56 [bridge] This is caused by bridge netfilter special dst_entry (fake_rtable), a special shared entry, where attaching an inetpeer makes no sense. Problem is present since commit 87c48fa3b46 (ipv6: make fragment identifications less predictable) Introduce DST_NOPEER dst flag and make sure ipv6_select_ident() and __ip_select_ident() fallback to the 'no peer attached' handling. Reported-by: Chris Boot <bootc@bootc.net> Tested-by: Chris Boot <bootc@bootc.net> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-22mqprio: Avoid panic if no options are providedThomas Graf
Userspace may not provide TCA_OPTIONS, in fact tc currently does so not do so if no arguments are specified on the command line. Return EINVAL instead of panicing. Signed-off-by: Thomas Graf <tgraf@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-22bridge: provide a mtu() method for fake_dst_opsEric Dumazet
Commit 618f9bc74a039da76 (net: Move mtu handling down to the protocol depended handlers) forgot the bridge netfilter case, adding a NULL dereference in ip_fragment(). Reported-by: Chris Boot <bootc@bootc.net> CC: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-22ipv4: using prefetch requires including prefetch.hStephen Rothwell
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: David Miller <davem@davemloft.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-12-21Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: net: Add a flow_cache_flush_deferred function ipv4: reintroduce route cache garbage collector net: have ipconfig not wait if no dev is available sctp: Do not account for sizeof(struct sk_buff) in estimated rwnd asix: new device id davinci-cpdma: fix locking issue in cpdma_chan_stop sctp: fix incorrect overflow check on autoclose r8169: fix Config2 MSIEnable bit setting. llc: llc_cmsg_rcv was getting called after sk_eat_skb. net: bpf_jit: fix an off-one bug in x86_64 cond jump target iwlwifi: update SCD BC table for all SCD queues Revert "Bluetooth: Revert: Fix L2CAP connection establishment" Bluetooth: Clear RFCOMM session timer when disconnecting last channel Bluetooth: Prevent uninitialized data access in L2CAP configuration iwlwifi: allow to switch to HT40 if not associated iwlwifi: tx_sync only on PAN context mwifiex: avoid double list_del in command cancel path ath9k: fix max phy rate at rate control init nfc: signedness bug in __nci_request() iwlwifi: do not set the sequence control bit is not needed
2011-12-21net: Add a flow_cache_flush_deferred functionSteffen Klassert
flow_cach_flush() might sleep but can be called from atomic context via the xfrm garbage collector. So add a flow_cache_flush_deferred() function and use this if the xfrm garbage colector is invoked from within the packet path. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Acked-by: Timo Teräs <timo.teras@iki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-21ipv4: reintroduce route cache garbage collectorEric Dumazet
Commit 2c8cec5c10b (ipv4: Cache learned PMTU information in inetpeer) removed IP route cache garbage collector a bit too soon, as this gc was responsible for expired routes cleanup, releasing their neighbour reference. As pointed out by Robert Gladewitz, recent kernels can fill and exhaust their neighbour cache. Reintroduce the garbage collection, since we'll have to wait our neighbour lookups become refcount-less to not depend on this stuff. Reported-by: Robert Gladewitz <gladewitz@gmx.de> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-21Merge branch 'master' of ↵John W. Linville
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem
2011-12-20Merge branch 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds
* 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: NFS: Fix a regression in nfs_file_llseek() NFSv4: Do not accept delegated opens when a delegation recall is in effect NFSv4: Ensure correct locking when accessing the 'lock_states' list NFSv4.1: Ensure that we handle _all_ SEQUENCE status bits. NFSv4: Don't error if we handled it in nfs4_recovery_handle_error SUNRPC: Ensure we always bump the backlog queue in xprt_free_slot SUNRPC: Fix the execution time statistics in the face of RPC restarts
2011-12-20net: have ipconfig not wait if no dev is availableGerlando Falauto
previous commit 3fb72f1e6e6165c5f495e8dc11c5bbd14c73385c makes IP-Config wait for carrier on at least one network device. Before waiting (predefined value 120s), check that at least one device was successfully brought up. Otherwise (e.g. buggy bootloader which does not set the MAC address) there is no point in waiting for carrier. Cc: Micha Nelissen <micha@neli.hopto.org> Cc: Holger Brunck <holger.brunck@keymile.com> Signed-off-by: Gerlando Falauto <gerlando.falauto@keymile.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-20sctp: Do not account for sizeof(struct sk_buff) in estimated rwndThomas Graf
When checking whether a DATA chunk fits into the estimated rwnd a full sizeof(struct sk_buff) is added to the needed chunk size. This quickly exhausts the available rwnd space and leads to packets being sent which are much below the PMTU limit. This can lead to much worse performance. The reason for this behaviour was to avoid putting too much memory pressure on the receiver. The concept is not completely irational because a Linux receiver does in fact clone an skb for each DATA chunk delivered. However, Linux also reserves half the available socket buffer space for data structures therefore usage of it is already accounted for. When proposing to change this the last time it was noted that this behaviour was introduced to solve a performance issue caused by rwnd overusage in combination with small DATA chunks. Trying to reproduce this I found that with the sk_buff overhead removed, the performance would improve significantly unless socket buffer limits are increased. The following numbers have been gathered using a patched iperf supporting SCTP over a live 1 Gbit ethernet network. The -l option was used to limit DATA chunk sizes. The numbers listed are based on the average of 3 test runs each. Default values have been used for sk_(r|w)mem. Chunk Size Unpatched No Overhead ------------------------------------- 4 15.2 Kbit [!] 12.2 Mbit [!] 8 35.8 Kbit [!] 26.0 Mbit [!] 16 95.5 Kbit [!] 54.4 Mbit [!] 32 106.7 Mbit 102.3 Mbit 64 189.2 Mbit 188.3 Mbit 128 331.2 Mbit 334.8 Mbit 256 537.7 Mbit 536.0 Mbit 512 766.9 Mbit 766.6 Mbit 1024 810.1 Mbit 808.6 Mbit Signed-off-by: Thomas Graf <tgraf@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-19sctp: fix incorrect overflow check on autocloseXi Wang
Commit 8ffd3208 voids the previous patches f6778aab and 810c0719 for limiting the autoclose value. If userspace passes in -1 on 32-bit platform, the overflow check didn't work and autoclose would be set to 0xffffffff. This patch defines a max_autoclose (in seconds) for limiting the value and exposes it through sysctl, with the following intentions. 1) Avoid overflowing autoclose * HZ. 2) Keep the default autoclose bound consistent across 32- and 64-bit platforms (INT_MAX / HZ in this patch). 3) Keep the autoclose value consistent between setsockopt() and getsockopt() calls. Suggested-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: Xi Wang <xi.wang@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-19llc: llc_cmsg_rcv was getting called after sk_eat_skb.Alex Juncu
Received non stream protocol packets were calling llc_cmsg_rcv that used a skb after that skb was released by sk_eat_skb. This caused received STP packets to generate kernel panics. Signed-off-by: Alexandru Juncu <ajuncu@ixiacom.com> Signed-off-by: Kunjan Naik <knaik@ixiacom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-19Merge branch 'master' of ↵John W. Linville
git://git.kernel.org/pub/scm/linux/kernel/git/padovan/bluetooth
2011-12-19Merge branch 'for-davem' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless
2011-12-18Revert "Bluetooth: Revert: Fix L2CAP connection establishment"Gustavo F. Padovan
This reverts commit 4dff523a913197e3314c7b0d08734ab037709093. It was reported that this patch cause issues when trying to connect to legacy devices so reverting it. Reported-by: David Fries <david@fries.net> Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-12-18Bluetooth: Clear RFCOMM session timer when disconnecting last channelMat Martineau
When the last RFCOMM data channel is closed, a timer is normally set up to disconnect the control channel at a later time. If the control channel disconnect command is sent with the timer pending, the timer needs to be cancelled. If the timer is not cancelled in this situation, the reference counting logic for the RFCOMM session does not work correctly when the remote device closes the L2CAP connection. The session is freed at the wrong time, leading to a kernel panic. Signed-off-by: Mat Martineau <mathewm@codeaurora.org> Acked-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-12-18Bluetooth: Prevent uninitialized data access in L2CAP configurationMat Martineau
When configuring an ERTM or streaming mode connection, remote devices are expected to send an RFC option in a successful config response. A misbehaving remote device might not send an RFC option, and the L2CAP code should not access uninitialized data in this case. Signed-off-by: Mat Martineau <mathewm@codeaurora.org> Acked-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-12-16Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: ipv6: Check dest prefix length on original route not copied one in rt6_alloc_cow(). sch_gred: should not use GFP_KERNEL while holding a spinlock ipip, sit: copy parms.name after register_netdevice ipv6: Fix for adding multicast route for loopback device automatically. ssb: fix init regression with SoCs rtl8192{ce,cu,de,se}: avoid problems because of possible ERFOFF -> ERFSLEEP transition mac80211: fix another race in aggregation start fsl_pq_mdio: Clean up tbi address configuration ppp: fix pptp double release_sock in pptp_bind() net/fec: fix the use of pdev->id ath9k: fix check for antenna diversity support batman-adv: delete global entry in case of roaming batman-adv: in case of roaming mark the client with TT_CLIENT_ROAM Bluetooth: Correct version check in hci_setup btusb: fix a memory leak in btusb_send_frame() Bluetooth: bnep: Fix module reference Bluetooth: cmtp: Fix module reference Bluetooth: btmrvl: support Marvell Bluetooth device SD8797