aboutsummaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)Author
2010-10-28De-pessimize rds_page_copy_userLinus Torvalds
commit 799c10559d60f159ab2232203f222f18fa3c4a5f upstream. Don't try to "optimize" rds_page_copy_user() by using kmap_atomic() and the unsafe atomic user mode accessor functions. It's actually slower than the straightforward code on any reasonable modern CPU. Back when the code was written (although probably not by the time it was actually merged, though), 32-bit x86 may have been the dominant architecture. And there kmap_atomic() can be a lot faster than kmap() (unless you have very good locality, in which case the virtual address caching by kmap() can overcome all the downsides). But these days, x86-64 may not be more populous, but it's getting there (and if you care about performance, it's definitely already there - you'd have upgraded your CPU's already in the last few years). And on x86-64, the non-kmap_atomic() version is faster, simply because the code is simpler and doesn't have the "re-try page fault" case. People with old hardware are not likely to care about RDS anyway, and the optimization for the 32-bit case is simply buggy, since it doesn't verify the user addresses properly. Reported-by: Dan Rosenberg <drosenberg@vsecurity.com> Acked-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-10-28wext: fix potential private ioctl memory content leakJohannes Berg
commit df6d02300f7c2fbd0fbe626d819c8e5237d72c62 upstream. When a driver doesn't fill the entire buffer, old heap contents may remain, and if it also doesn't update the length properly, this old heap content will be copied back to userspace. It is very unlikely that this happens in any of the drivers using private ioctls since it would show up as junk being reported by iwpriv, but it seems better to be safe here, so use kzalloc. Reported-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-10-28mac80211: fix use-after-freeJohannes Berg
commit cd87a2d3a33d75a646f1aa1aa2ee5bf712d6f963 upstream. commit 8c0c709eea5cbab97fb464cd68b06f24acc58ee1 Author: Johannes Berg <johannes@sipsolutions.net> Date: Wed Nov 25 17:46:15 2009 +0100 mac80211: move cmntr flag out of rx flags moved the CMTR flag into the skb's status, and in doing so introduced a use-after-free -- when the skb has been handed to cooked monitors the status setting will touch now invalid memory. Additionally, moving it there has effectively discarded the optimisation -- since the bit is only ever set on freed SKBs, and those were a copy, it could never be checked. For the current release, fixing this properly is a bit too involved, so let's just remove the problematic code and leave userspace with one copy of each frame for each virtual interface. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-09-26sctp: Do not reset the packet during sctp_packet_config().Vlad Yasevich
commit 4bdab43323b459900578b200a4b8cf9713ac8fab upstream. sctp_packet_config() is called when getting the packet ready for appending of chunks. The function should not touch the current state, since it's possible to ping-pong between two transports when sending, and that can result packet corruption followed by skb overlfow crash. Reported-by: Thomas Dreibholz <dreibh@iem.uni-due.de> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-09-26net/llc: make opt unsigned in llc_ui_setsockopt()Dan Carpenter
commit 339db11b219f36cf7da61b390992d95bb6b7ba2e upstream. The members of struct llc_sock are unsigned so if we pass a negative value for "opt" it can cause a sign bug. Also it can cause an integer overflow when we multiply "opt * HZ". Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-09-26net: blackhole route should always be recalculatedJianzhao Wang
[ Upstream commit ae2688d59b5f861dc70a091d003773975d2ae7fb ] Blackhole routes are used when xfrm_lookup() returns -EREMOTE (error triggered by IKE for example), hence this kind of route is always temporary and so we should check if a better route exists for next packets. Bug has been introduced by commit d11a4dc18bf41719c9f0d7ed494d295dd2973b92. Signed-off-by: Jianzhao Wang <jianzhao.wang@6wind.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-09-26l2tp: test for ethernet header in l2tp_eth_dev_recv()Eric Dumazet
[ Upstream commit bfc960a8eec023a170a80697fe65157cd4f44f81 ] close https://bugzilla.kernel.org/show_bug.cgi?id=16529 Before calling dev_forward_skb(), we should make sure skb head contains at least an ethernet header, even if length included in upper layer said so. Use pskb_may_pull() to make sure this ethernet header is present in skb head. Reported-by: Thomas Heil <heil@terminal-consulting.de> Reported-by: Ian Campbell <Ian.Campbell@eu.citrix.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-09-26UNIX: Do not loop forever at unix_autobind().Tetsuo Handa
[ Upstream commit a9117426d0fcc05a194f728159a2d43df43c7add ] We assumed that unix_autobind() never fails if kzalloc() succeeded. But unix_autobind() allows only 1048576 names. If /proc/sys/fs/file-max is larger than 1048576 (e.g. systems with more than 10GB of RAM), a local user can consume all names using fork()/socket()/bind(). If all names are in use, those who call bind() with addr_len == sizeof(short) or connect()/sendmsg() with setsockopt(SO_PASSCRED) will continue while (1) yield(); loop at unix_autobind() till a name becomes available. This patch adds a loop counter in order to give up after 1048576 attempts. Calling yield() for once per 256 attempts may not be sufficient when many names are already in use, for __unix_find_socket_byname() can take long time under such circumstance. Therefore, this patch also adds cond_resched() call. Note that currently a local user can consume 2GB of kernel memory if the user is allowed to create and autobind 1048576 UNIX domain sockets. We should consider adding some restriction for autobind operation. Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-09-26udp: add rehash on connect()Eric Dumazet
commit 719f835853a92f6090258114a72ffe41f09155cd upstream commit 30fff923 introduced in linux-2.6.33 (udp: bind() optimisation) added a secondary hash on UDP, hashed on (local addr, local port). Problem is that following sequence : fd = socket(...) connect(fd, &remote, ...) not only selects remote end point (address and port), but also sets local address, while UDP stack stored in secondary hash table the socket while its local address was INADDR_ANY (or ipv6 equivalent) Sequence is : - autobind() : choose a random local port, insert socket in hash tables [while local address is INADDR_ANY] - connect() : set remote address and port, change local address to IP given by a route lookup. When an incoming UDP frame comes, if more than 10 sockets are found in primary hash table, we switch to secondary table, and fail to find socket because its local address changed. One solution to this problem is to rehash datagram socket if needed. We add a new rehash(struct socket *) method in "struct proto", and implement this method for UDP v4 & v6, using a common helper. This rehashing only takes care of secondary hash table, since primary hash (based on local port only) is not changed. Reported-by: Krzysztof Piotr Oledzki <ole@ans.pl> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Tested-by: Krzysztof Piotr Oledzki <ole@ans.pl> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-09-26tcp: select(writefds) don't hang up when a peer close connectionKOSAKI Motohiro
[ Upstream commit d84ba638e4ba3c40023ff997aa5e8d3ed002af36 ] This issue come from ruby language community. Below test program hang up when only run on Linux. % uname -mrsv Linux 2.6.26-2-486 #1 Sat Dec 26 08:37:39 UTC 2009 i686 % ruby -rsocket -ve ' BasicSocket.do_not_reverse_lookup = true serv = TCPServer.open("127.0.0.1", 0) s1 = TCPSocket.open("127.0.0.1", serv.addr[1]) s2 = serv.accept s2.close s1.write("a") rescue p $! s1.write("a") rescue p $! Thread.new { s1.write("a") }.join' ruby 1.9.3dev (2010-07-06 trunk 28554) [i686-linux] #<Errno::EPIPE: Broken pipe> [Hang Here] FreeBSD, Solaris, Mac doesn't. because Ruby's write() method call select() internally. and tcp_poll has a bug. SUS defined 'ready for writing' of select() as following. | A descriptor shall be considered ready for writing when a call to an output | function with O_NONBLOCK clear would not block, whether or not the function | would transfer data successfully. That said, EPIPE situation is clearly one of 'ready for writing'. We don't have read-side issue because tcp_poll() already has read side shutdown care. | if (sk->sk_shutdown & RCV_SHUTDOWN) | mask |= POLLIN | POLLRDNORM | POLLRDHUP; So, Let's insert same logic in write side. - reference url http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-core/31065 http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-core/31068 Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-09-26tcp: fix three tcp sysctls tuningEric Dumazet
[ Upstream commit c5ed63d66f24fd4f7089b5a6e087b0ce7202aa8e ] As discovered by Anton Blanchard, current code to autotune tcp_death_row.sysctl_max_tw_buckets, sysctl_tcp_max_orphans and sysctl_max_syn_backlog makes little sense. The bigger a page is, the less tcp_max_orphans is : 4096 on a 512GB machine in Anton's case. (tcp_hashinfo.bhash_size * sizeof(struct inet_bind_hashbucket)) is much bigger if spinlock debugging is on. Its wrong to select bigger limits in this case (where kernel structures are also bigger) bhash_size max is 65536, and we get this value even for small machines. A better ground is to use size of ehash table, this also makes code shorter and more obvious. Based on a patch from Anton, and another from David. Reported-and-tested-by: Anton Blanchard <anton@samba.org> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-09-26tcp: Combat per-cpu skew in orphan tests.David S. Miller
[ Upstream commit ad1af0fedba14f82b240a03fe20eb9b2fdbd0357 ] As reported by Anton Blanchard when we use percpu_counter_read_positive() to make our orphan socket limit checks, the check can be off by up to num_cpus_online() * batch (which is 32 by default) which on a 128 cpu machine can be as large as the default orphan limit itself. Fix this by doing the full expensive sum check if the optimized check triggers. Reported-by: Anton Blanchard <anton@samba.org> Signed-off-by: David S. Miller <davem@davemloft.net> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-09-26net: RPS needs to depend upon USE_GENERIC_SMP_HELPERSDavid S. Miller
[ Upstream commit 6dcbc12290abb452a5e42713faa6461b248e2f55 ] You cannot invoke __smp_call_function_single() unless the architecture sets this symbol. Reported-by: Daniel Hellstrom <daniel@gaisler.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-09-26rds: fix a leak of kernel memoryEric Dumazet
[ Upstream commit f037590fff3005ce8a1513858d7d44f50053cc8f ] struct rds_rdma_notify contains a 32 bits hole on 64bit arches, make sure it is zeroed before copying it to user. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-09-26irda: Correctly clean up self->ias_obj on irda_bind() failure.David S. Miller
[ Upstream commit 628e300cccaa628d8fb92aa28cb7530a3d5f2257 ] If irda_open_tsap() fails, the irda_bind() code tries to destroy the ->ias_obj object by hand, but does so wrongly. In particular, it fails to a) release the hashbin attached to the object and b) reset the self->ias_obj pointer to NULL. Fix both problems by using irias_delete_object() and explicitly setting self->ias_obj to NULL, just as irda_release() does. Reported-by: Tavis Ormandy <taviso@cmpxchg8b.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-09-26gro: Re-fix different skb headroomsJarek Poplawski
[ Upstream commit 64289c8e6851bca0e589e064c9a5c9fbd6ae5dd4 ] The patch: "gro: fix different skb headrooms" in its part: "2) allocate a minimal skb for head of frag_list" is buggy. The copied skb has p->data set at the ip header at the moment, and skb_gro_offset is the length of ip + tcp headers. So, after the change the length of mac header is skipped. Later skb_set_mac_header() sets it into the NET_SKB_PAD area (if it's long enough) and ip header is misaligned at NET_SKB_PAD + NET_IP_ALIGN offset. There is no reason to assume the original skb was wrongly allocated, so let's copy it as it was. bugzilla : https://bugzilla.kernel.org/show_bug.cgi?id=16626 fixes commit: 3d3be4333fdf6faa080947b331a6a19bce1a4f57 Reported-by: Plamen Petrov <pvp-lsts@fs.uni-ruse.bg> Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> CC: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Tested-by: Plamen Petrov <pvp-lsts@fs.uni-ruse.bg> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-09-26gro: fix different skb headroomsEric Dumazet
[ Upstream commit 3d3be4333fdf6faa080947b331a6a19bce1a4f57 ] Packets entering GRO might have different headrooms, even for a given flow (because of implementation details in drivers, like copybreak). We cant force drivers to deliver packets with a fixed headroom. 1) fix skb_segment() skb_segment() makes the false assumption headrooms of fragments are same than the head. When CHECKSUM_PARTIAL is used, this can give csum_start errors, and crash later in skb_copy_and_csum_dev() 2) allocate a minimal skb for head of frag_list skb_gro_receive() uses netdev_alloc_skb(headroom + skb_gro_offset(p)) to allocate a fresh skb. This adds NET_SKB_PAD to a padding already provided by netdevice, depending on various things, like copybreak. Use alloc_skb() to allocate an exact padding, to reduce cache line needs: NET_SKB_PAD + NET_IP_ALIGN bugzilla : https://bugzilla.kernel.org/show_bug.cgi?id=16626 Many thanks to Plamen Petrov, testing many debugging patches ! With help of Jarek Poplawski. Reported-by: Plamen Petrov <pvp-lsts@fs.uni-ruse.bg> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-09-26bridge: Clear INET control block of SKBs passed into ip_fragment().David S. Miller
[ Upstream commit 4ce6b9e1621c187a32a47a17bf6be93b1dc4a3df ] In a similar vain to commit 17762060c25590bfddd68cc1131f28ec720f405f ("bridge: Clear IPCB before possible entry into IP stack") Any time we call into the IP stack we have to make sure the state there is as expected by the ipv4 code. With help from Eric Dumazet and Herbert Xu. Reported-by: Brandan Das <brandan.das@stratus.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-09-20SUNRPC: Fix race corrupting rpc upcallTrond Myklebust
commit 5a67657a2e90c9e4a48518f95d4ba7777aa20fbb upstream. If rpc_queue_upcall() adds a new upcall to the rpci->pipe list just after rpc_pipe_release calls rpc_purge_list(), but before it calls gss_pipe_release (as rpci->ops->release_pipe(inode)), then the latter will free a message without deleting it from the rpci->pipe list. We will be left with a freed object on the rpc->pipe list. Most frequent symptoms are kernel crashes in rpc.gssd system calls on the pipe in question. Reported-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-09-20wireless extensions: fix kernel heap content leakJohannes Berg
commit 42da2f948d949efd0111309f5827bf0298bcc9a4 upstream. Wireless extensions have an unfortunate, undocumented requirement which requires drivers to always fill iwp->length when returning a successful status. When a driver doesn't do this, it leads to a kernel heap content leak when userspace offers a larger buffer than would have been necessary. Arguably, this is a driver bug, as it should, if it returns 0, fill iwp->length, even if it separately indicated that the buffer contents was not valid. However, we can also at least avoid the memory content leak if the driver doesn't do this by setting the iwp length to max_tokens, which then reflects how big the buffer is that the driver may fill, regardless of how big the userspace buffer is. To illustrate the point, this patch also fixes a corresponding cfg80211 bug (since this requirement isn't documented nor was ever pointed out by anyone during code review, I don't trust all drivers nor all cfg80211 handlers to implement it correctly). Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-09-20irda: off by oneDan Carpenter
commit cf9b94f88bdbe8a02015fc30d7c232b2d262d4ad upstream. This is an off by one. We would go past the end when we NUL terminate the "value" string at end of the function. The "value" buffer is allocated in irlan_client_parse_response() or irlan_provider_parse_command(). CC: stable@kernel.org Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-20mac80211: delete work timerJohannes Berg
commit 071249b1d501b1f31a6b1af3fbcbe03158a84e5c upstream. The new workqueue changes helped me find this bug that's been lingering since the changes to the work processing in mac80211 -- the work timer is never deleted properly. Do that to avoid having it fire after all data structures have been freed. It can't be re-armed because all it will do, if running, is schedule the work, but that gets flushed later and won't have anything to do since all work items are gone by now (by way of interface removal). Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-09-20netfilter: fix CONFIG_COMPAT supportFlorian Westphal
commit cca77b7c81876d819a5806f408b3c29b5b61a815 upstream. commit f3c5c1bfd430858d3a05436f82c51e53104feb6b (netfilter: xtables: make ip_tables reentrant) forgot to also compute the jumpstack size in the compat handlers. Result is that "iptables -I INPUT -j userchain" turns into -j DROP. Reported by Sebastian Roesner on #netfilter, closes http://bugzilla.netfilter.org/show_bug.cgi?id=669. Note: arptables change is compile-tested only. Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Tested-by: Mikael Pettersson <mikpe@it.uu.se> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-08-26netlink: fix compat recvmsgJohannes Berg
commit 68d6ac6d2740b6a55f3ae92a4e0be6d881904b32 upstream. Since commit 1dacc76d0014a034b8aca14237c127d7c19d7726 Author: Johannes Berg <johannes@sipsolutions.net> Date: Wed Jul 1 11:26:02 2009 +0000 net/compat/wext: send different messages to compat tasks we had a race condition when setting and then restoring frag_list. Eric attempted to fix it, but the fix created even worse problems. However, the original motivation I had when I added the code that turned out to be racy is no longer clear to me, since we only copy up to skb->len to userspace, which doesn't include the frag_list length. As a result, not doing any frag_list clearing and restoring avoids the race condition, while not introducing any other problems. Additionally, while preparing this patch I found that since none of the remaining netlink code is really aware of the frag_list, we need to use the original skb's information for packet information and credentials. This fixes, for example, the group information received by compat tasks. Cc: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-08-26can-raw: Fix skb_orphan_try handlingOliver Hartkopp
commit cff0d6e6edac7672b3f915bb4fb59f279243b7f9 upstream. Commit fc6055a5ba31e2c14e36e8939f9bf2b6d586a7f5 (net: Introduce skb_orphan_try()) allows an early orphan of the skb and takes care on tx timestamping, which needs the sk-reference in the skb on driver level. So does the can-raw socket, which has not been taken into account here. The patch below adds a 'prevent_sk_orphan' bit in the skb tx shared info, which fixes the problem discovered by Matthias Fuchs here: http://marc.info/?t=128030411900003&r=1&w=2 Even if it's not a primary tx timestamp topic it fits well into some skb shared tx context. Or should be find a different place for the information to protect the sk reference until it reaches the driver level? Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-08-26act_nat: fix wild pointerChangli Gao
[ Upstream commit 072d79a31a3b870b49886f4347e23f81b7eca3ac ] pskb_may_pull() may change skb pointers, so adjust icmph after pskb_may_pull(). Signed-off-by: Changli Gao <xiaosuo@gmail.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-08-26net: disable preemption before call smp_processor_id()Changli Gao
[ Upstream commit cece1945bffcf1a823cdfa36669beae118419351 ] Although netif_rx() isn't expected to be called in process context with preemption enabled, it'd better handle this case. And this is why get_cpu() is used in the non-RPS #ifdef branch. If tree RCU is selected, rcu_read_lock() won't disable preemption, so preempt_disable() should be called explictly. Signed-off-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-08-26pkt_sched: Fix sch_sfq vs tc_modify_qdisc oopsJarek Poplawski
[ Upstream commit 41065fba846e795b31b17e4dec01cb904d56c6cd ] sch_sfq as a classful qdisc needs the .leaf handler. Otherwise, there is an oops possible in tc_modify_qdisc()/check_loop(). Fixes commit 7d2681a6ff4f9ab5e48d02550b4c6338f1638998 Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-08-26pkt_sched: Fix sch_sfq vs tcf_bind_filter oopsJarek Poplawski
[ Upstream commit eb4a5527b1f0d581ac217c80ef3278ed5e38693c ] Since there was added ->tcf_chain() method without ->bind_tcf() to sch_sfq class options, there is oops when a filter is added with the classid parameter. Fixes commit 7d2681a6ff4f9ab5e48d02550b4c6338f1638998 netdev thread: null pointer at cls_api.c Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Reported-by: Franchoze Eric <franchoze@yandex.ru> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-08-26net: Fix a memmove bug in dev_gro_receive()Jarek Poplawski
[ Upstream commit e5093aec2e6b60c3df2420057ffab9ed4a6d2792 ] >Xin Xiaohui wrote: > I looked into the code dev_gro_receive(), found the code here: > if the frags[0] is pulled to 0, then the page will be released, > and memmove() frags left. > Is that right? I'm not sure if memmove do right or not, but > frags[0].size is never set after memove at least. what I think > a simple way is not to do anything if we found frags[0].size == 0. > The patch is as followed. ... This version of the patch fixes the bug directly in memmove. Reported-by: "Xin, Xiaohui" <xiaohui.xin@intel.com> Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-08-26net: Fix napi_gro_frags vs netpoll pathJarek Poplawski
[ Upstream commit ce9e76c8450fc248d3e1fc16ef05e6eb50c02fa5 ] The netpoll_rx_on() check in __napi_gro_receive() skips part of the "common" GRO_NORMAL path, especially "pull:" in dev_gro_receive(), where at least eth header should be copied for entirely paged skbs. Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-08-26can: add limit for nframes and clean up signed/unsigned variablesOliver Hartkopp
[ Upstream commit 5b75c4973ce779520b9d1e392483207d6f842cde ] This patch adds a limit for nframes as the number of frames in TX_SETUP and RX_SETUP are derived from a single byte multiplex value by default. Use-cases that would require to send/filter more than 256 CAN frames should be implemented in userspace for complexity reasons anyway. Additionally the assignments of unsigned values from userspace to signed values in kernelspace and vice versa are fixed by using unsigned values in kernelspace consistently. Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Reported-by: Ben Hawkes <hawkes@google.com> Acked-by: Urs Thuermann <urs.thuermann@volkswagen.de> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-08-26act_nat: the checksum of ICMP doesn't have pseudo headerChangli Gao
[ Upstream commit 3a3dfb062c2e086c202d34f09ce29634515ad256 ] after updating the value of the ICMP payload, inet_proto_csum_replace4() should be called with zero pseudohdr. Signed-off-by: Changli Gao <xiaosuo@gmail.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-08-26bridge: Fix skb leak when multicast parsing fails on TXHerbert Xu
[ Upstream commit 6d1d1d398cb7db7a12c5d652d50f85355345234f ] On the bridge TX path we're leaking an skb when br_multicast_rcv returns an error. Reported-by: David Lamparter <equinox@diac24.net> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-08-26tcp: cookie transactions setsockopt memory leakDmitry Popov
[ Upstream commit a3bdb549e30e7a263f7a589747c40e9c50110315 ] There is a bug in do_tcp_setsockopt(net/ipv4/tcp.c), TCP_COOKIE_TRANSACTIONS case. In some cases (when tp->cookie_values == NULL) new tcp_cookie_values structure can be allocated (at cvp), but not bound to tp->cookie_values. So a memory leak occurs. Signed-off-by: Dmitry Popov <dp@highloadlab.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-08-26bridge: add rcu_read_lock on transmitStephen Hemminger
[ Upstream commit eeaf61d8891f9c9ed12c1a667e72bf83f0857954 ] Long ago, when bridge was converted to RCU, rcu lock was equivalent to having preempt disabled. RCU has changed a lot since then and bridge code was still assuming the since transmit was called with bottom half disabled, it was RCU safe. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Tested-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-08-26cfg80211: fix locking in action frame TXJohannes Berg
commit fe100acddf438591ecf3582cb57241e560da70b7 upstream. Accesses to "wdev->current_bss" must be locked with the wdev lock, which action frame transmission is missing. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-08-10mac80211: avoid scheduling while atomic in mesh_rx_plink_frameJohn W. Linville
commit c937019761a758f2749b1f3a032b7a91fb044753 upstream. While mesh_rx_plink_frame holds sta->lock... mesh_rx_plink_frame -> mesh_plink_inc_estab_count -> ieee80211_bss_info_change_notify ...but ieee80211_bss_info_change_notify is allowed to sleep. A driver taking advantage of that allowance can cause a scheduling while atomic bug. Similar paths exist for mesh_plink_dec_estab_count, so work around those as well. http://bugzilla.kernel.org/show_bug.cgi?id=16099 Also, correct a minor kerneldoc comment error (mismatched function names). Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-08-10cfg80211: don't get expired BSSesJohannes Berg
commit ccb6c1360f8dd43303c659db718e7e0b24175db5 upstream. When kernel-internal users use cfg80211_get_bss() to get a reference to a BSS struct, they may end up getting one that would have been removed from the list if there had been any userspace access to the list. This leads to inconsistencies and problems. Fix it by making cfg80211_get_bss() ignore BSSes that cfg80211_bss_expire() would remove. Fixes http://bugzilla.intellinuxwireless.org/show_bug.cgi?id=2180 Reported-by: Jiajia Zheng <jiajia.zheng@intel.com> Tested-by: Jiajia Zheng <jiajia.zheng@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-08-10cfg80211: ignore spurious deauthJohannes Berg
commit 643f82e32f14faf0d0944c804203a6681b6b0a1e upstream. Ever since mac80211/drivers are no longer fully in charge of keeping track of the auth status, trying to make them do so will fail. Instead of warning and reporting the deauthentication to userspace, cfg80211 must simply ignore it so that spurious deauthentications, e.g. before starting authentication, aren't seen by userspace as actual deauthentications. Reported-by: Paul Stewart <pstew@google.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-08-109p: strlen() doesn't count the terminatorDan Carpenter
commit 5c4bfa17f3ec46becec4b23d12323f7605ebd696 upstream. This is an off by one bug because strlen() doesn't count the NULL terminator. We strcpy() addr into a fixed length array of size UNIX_PATH_MAX later on. The addr variable is the name of the device being mounted. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-08-10arp_notify: allow drivers to explicitly request a notification event.Ian Campbell
commit 06c4648d46d1b757d6b9591a86810be79818b60c upstream. Currently such notifications are only generated when the device comes up or the address changes. However one use case for these notifications is to enable faster network recovery after a virtual machine migration (by causing switches to relearn their MAC tables). A migration appears to the network stack as a temporary loss of carrier and therefore does not trigger either of the current conditions. Rather than adding carrier up as a trigger (which can cause issues when interfaces a flapping) simply add an interface which the driver can use to explicitly trigger the notification. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Cc: Stephen Hemminger <shemminger@linux-foundation.org> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-07-27Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: s2io: fixing DBG_PRINT() macro ath9k: fix dma direction for map/unmap in ath_rx_tasklet net: dev_forward_skb should call nf_reset net sched: fix race in mirred device removal tun: avoid BUG, dump packet on GSO errors bonding: set device in RLB ARP packet handler wimax/i2400m: Add PID & VID for Intel WiMAX 6250 ipv6: Don't add routes to ipv6 disabled interfaces. net: Fix skb_copy_expand() handling of ->csum_start net: Fix corruption of skb csum field in pskb_expand_head() of net/core/skbuff.c macvtap: Limit packet queue length ixgbe/igb: catch invalid VF settings bnx2x: Advance a module version bnx2x: Protect statistics ramrod and sequence number bnx2x: Protect a SM state change wireless: use netif_rx_ni in ieee80211_send_layer2_update
2010-07-26Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6
2010-07-25net: dev_forward_skb should call nf_resetBen Greear
With conn-track zones and probably with different network namespaces, the netfilter logic needs to be re-calculated on packet receive. If the netfilter logic is not reset, it will not be recalculated properly. This patch adds the nf_reset logic to dev_forward_skb. Signed-off-by: Ben Greear <greearb@candelatech.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-24net sched: fix race in mirred device removalstephen hemminger
This fixes hang when target device of mirred packet classifier action is removed. If a mirror or redirection action is configured to cause packets to go to another device, the classifier holds a ref count, but was assuming the adminstrator cleaned up all redirections before removing. The fix is to add a notifier and cleanup during unregister. The new list is implicitly protected by RTNL mutex because it is held during filter add/delete as well as notifier. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Acked-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-22ipv6: Don't add routes to ipv6 disabled interfaces.Brian Haley
If the interface has IPv6 disabled, don't add a multicast or link-local route since we won't be adding a link-local address. Reported-by: Mahesh Kelkar <maheshkelkar@gmail.com> Signed-off-by: Brian Haley <brian.haley@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-22net: Fix skb_copy_expand() handling of ->csum_startDavid S. Miller
It should only be adjusted if ip_summed == CHECKSUM_PARTIAL. Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-22net: Fix corruption of skb csum field in pskb_expand_head() of net/core/skbuff.cAndrea Shepard
Make pskb_expand_head() check ip_summed to make sure csum_start is really csum_start and not csum before adjusting it. This fixes a bug I encountered using a Sun Quad-Fast Ethernet card and VLANs. On my configuration, the sunhme driver produces skbs with differing amounts of headroom on receive depending on the packet size. See line 2030 of drivers/net/sunhme.c; packets smaller than RX_COPY_THRESHOLD have 52 bytes of headroom but packets larger than that cutoff have only 20 bytes. When these packets reach the VLAN driver, vlan_check_reorder_header() calls skb_cow(), which, if the packet has less than NET_SKB_PAD (== 32) bytes of headroom, uses pskb_expand_head() to make more. Then, pskb_expand_head() needs to adjust a lot of offsets into the skb, including csum_start. Since csum_start is a union with csum, if the packet has a valid csum value this will corrupt it, which was the effect I observed. The sunhme hardware computes receive checksums, so the skbs would be created by the driver with ip_summed == CHECKSUM_COMPLETE and a valid csum field, and then pskb_expand_head() would corrupt the csum field, leading to an "hw csum error" message later on, for example in icmp_rcv() for pings larger than the sunhme RX_COPY_THRESHOLD. On the basis of the comment at the beginning of include/linux/skbuff.h, I believe that the csum_start skb field is only meaningful if ip_csummed is CSUM_PARTIAL, so this patch makes pskb_expand_head() adjust it only in that case to avoid corrupting a valid csum value. Please see my more in-depth disucssion of tracking down this bug for more details if you like: http://puellavulnerata.livejournal.com/112186.html http://puellavulnerata.livejournal.com/112567.html http://puellavulnerata.livejournal.com/112891.html http://puellavulnerata.livejournal.com/113096.html http://puellavulnerata.livejournal.com/113591.html I am not subscribed to this list, so please CC me on replies. Signed-off-by: Andrea Shepard <andrea@persephoneslair.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-21mm: add context argument to shrinker callback to remaining shrinkersDave Chinner
Add the shrinkers missed in the first conversion of the API in commit 7f8275d0d660c146de6ee3017e1e2e594c49e820 ("mm: add context argument to shrinker callback"). Signed-off-by: Dave Chinner <dchinner@redhat.com>