aboutsummaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)Author
2012-12-06netfilter: Mark SYN/ACK packets as invalid from original directionJozsef Kadlecsik
commit 64f509ce71b08d037998e93dd51180c19b2f464c upstream. Clients should not send such packets. By accepting them, we open up a hole by wich ephemeral ports can be discovered in an off-path attack. See: "Reflection scan: an Off-Path Attack on TCP" by Jan Wrobel, http://arxiv.org/abs/1201.2074 Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2012-12-06wireless: allow 40 MHz on world roaming channels 12/13Johannes Berg
commit 43c771a1963ab461a2f194e3c97fded1d5fe262f upstream. When in world roaming mode, allow 40 MHz to be used on channels 12 and 13 so that an AP that is, e.g., using HT40+ on channel 9 (in the UK) can be used. Reported-by: Eddie Chapman <eddie@ehuk.net> Tested-by: Eddie Chapman <eddie@ehuk.net> Acked-by: Luis R. Rodriguez <mcgrof@qca.qualcomm.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2012-12-06mac80211: sync acccess to tx_filtered/ps_tx_buf queuesArik Nemtsov
commit 987c285c2ae2e4e32aca3a9b3252d28171c75711 upstream. These are accessed without a lock when ending STA PSM. If the sta_cleanup timer accesses these lists at the same time, we might crash. This may fix some mysterious crashes we had during ieee80211_sta_ps_deliver_wakeup. Signed-off-by: Arik Nemtsov <arik@wizery.com> Signed-off-by: Ido Yariv <ido@wizery.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2012-11-16ipv6: send unsolicited neighbour advertisements to all-nodesHannes Frederic Sowa
[ Upstream commit 60713a0ca7fd6651b951cc1b4dbd528d1fc0281b ] As documented in RFC4861 (Neighbor Discovery for IP version 6) 7.2.6., unsolicited neighbour advertisements should be sent to the all-nodes multicast address. Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2012-11-16af-packet: fix oops when socket is not presentEric Leblond
[ Upstream commit a3d744e995d2b936c500585ae39d99ee251c89b4 ] Due to a NULL dereference, the following patch is causing oops in normal trafic condition: commit c0de08d04215031d68fa13af36f347a6cfa252ca Author: Eric Leblond <eric@regit.org> Date:   Thu Aug 16 22:02:58 2012 +0000     af_packet: don't emit packet on orig fanout group This buggy patch was a feature fix and has reached most stable branches. When skb->sk is NULL and when packet fanout is used, there is a crash in match_fanout_group where skb->sk is accessed. This patch fixes the issue by returning false as soon as the socket is NULL: this correspond to the wanted behavior because the kernel as to resend the skb to all the listening socket in this case. Signed-off-by: Eric Leblond <eric@regit.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2012-11-16l2tp: fix oops in l2tp_eth_create() error pathTom Parkin
[ Upstream commit 789336360e0a2aeb9750c16ab704a02cbe035e9e ] When creating an L2TPv3 Ethernet session, if register_netdev() should fail for any reason (for example, automatic naming for "l2tpeth%d" interfaces hits the 32k-interface limit), the netdev is freed in the error path. However, the l2tp_eth_sess structure's dev pointer is left uncleared, and this results in l2tp_eth_delete() then attempting to unregister the same netdev later in the session teardown. This results in an oops. To avoid this, clear the session dev pointer in the error path. Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2012-11-16net: fix divide by zero in tcp algorithm illinoisJesper Dangaard Brouer
[ Upstream commit 8f363b77ee4fbf7c3bbcf5ec2c5ca482d396d664 ] Reading TCP stats when using TCP Illinois congestion control algorithm can cause a divide by zero kernel oops. The division by zero occur in tcp_illinois_info() at: do_div(t, ca->cnt_rtt); where ca->cnt_rtt can become zero (when rtt_reset is called) Steps to Reproduce: 1. Register tcp_illinois: # sysctl -w net.ipv4.tcp_congestion_control=illinois 2. Monitor internal TCP information via command "ss -i" # watch -d ss -i 3. Establish new TCP conn to machine Either it fails at the initial conn, or else it needs to wait for a loss or a reset. This is only related to reading stats. The function avg_delay() also performs the same divide, but is guarded with a (ca->cnt_rtt > 0) at its calling point in update_params(). Thus, simply fix tcp_illinois_info(). Function tcp_illinois_info() / get_info() is called without socket lock. Thus, eliminate any race condition on ca->cnt_rtt by using a local stack variable. Simply reuse info.tcpv_rttcnt, as its already set to ca->cnt_rtt. Function avg_delay() is not affected by this race condition, as its called with the socket lock. Cc: Petr Matousek <pmatouse@redhat.com> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Eric Dumazet <edumazet@google.com> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2012-11-16ipv6: Set default hoplimit as zero.Li RongQing
[ Upstream commit 14edd87dc67311556f1254a8f29cf4dd6cb5b7d1 ] Commit a02e4b7dae4551(Demark default hoplimit as zero) only changes the hoplimit checking condition and default value in ip6_dst_hoplimit, not zeros all hoplimit default value. Keep the zeroing ip6_template_metrics[RTAX_HOPLIMIT - 1] to force it as const, cause as a37e6e344910(net: force dst_default_metrics to const section) Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2012-11-16tcp: fix FIONREAD/SIOCINQEric Dumazet
[ Upstream commit a3374c42aa5f7237e87ff3b0622018636b0c847e ] tcp_ioctl() tries to take into account if tcp socket received a FIN to report correct number bytes in receive queue. But its flaky because if the application ate the last skb, we return 1 instead of 0. Correct way to detect that FIN was received is to test SOCK_DONE. Reported-by: Elliot Hughes <enh@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2012-11-16netlink: use kfree_rcu() in netlink_release()Eric Dumazet
[ Upstream commit 6d772ac5578f711d1ce7b03535d1c95bffb21dff ] On some suspend/resume operations involving wimax device, we have noticed some intermittent memory corruptions in netlink code. Stéphane Marchesin tracked this corruption in netlink_update_listeners() and suggested a patch. It appears netlink_release() should use kfree_rcu() instead of kfree() for the listeners structure as it may be used by other cpus using RCU protection. netlink_release() must set to NULL the listeners pointer when it is about to be freed. Also have to protect netlink_update_listeners() and netlink_has_listeners() if listeners is NULL. Add a nl_deref_protected() lockdep helper to properly document which locks protects us. Reported-by: Jonathan Kliegman <kliegs@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Stéphane Marchesin <marcheu@google.com> Cc: Sam Leffler <sleffler@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2012-11-16sctp: fix call to SCTP_CMD_PROCESS_SACK in sctp_cmd_interpreter()Zijie Pan
[ Upstream commit f6e80abeab928b7c47cc1fbf53df13b4398a2bec ] Bug introduced by commit edfee0339e681a784ebacec7e8c2dc97dc6d2839 (sctp: check src addr when processing SACK to update transport state) Signed-off-by: Zijie Pan <zijie.pan@6wind.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2012-11-16mac80211: make sure data is accessible in EAPOL checkJohannes Berg
commit 6dbda2d00d466225f9db1dc695ff852443f28832 upstream. The code to allow EAPOL frames even when the station isn't yet marked associated needs to check that the incoming frame is long enough and due to paged RX it also can't assume skb->data contains the right data, it must use skb_copy_bits(). Fix this to avoid using data that doesn't really exist. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2012-11-16mac80211: verify that skb data is presentJohannes Berg
commit 9b395bc3be1cebf0144a127c7e67d56dbdac0930 upstream. A number of places in the mesh code don't check that the frame data is present and in the skb header when trying to access. Add those checks and the necessary pskb_may_pull() calls. This prevents accessing data that doesn't actually exist. To do this, export ieee80211_get_mesh_hdrlen() to be able to use it in mac80211. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2012-11-16mac80211: check management frame header lengthJohannes Berg
commit 4a4f1a5808c8bb0b72a4f6e5904c53fb8c9cd966 upstream. Due to pskb_may_pull() checking the skb length, all non-management frames are checked on input whether their 802.11 header is fully present. Also add that check for management frames and remove a check that is now duplicate. This prevents accessing skb data beyond the frame end. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2012-11-16wireless: drop invalid mesh address extension framesJohannes Berg
commit 7dd111e8ee10cc6816669eabcad3334447673236 upstream. The mesh header can have address extension by a 4th or a 5th and 6th address, but never both. Drop such frames in 802.11 -> 802.3 conversion along with any frames that have the wrong extension. Reviewed-by: Javier Cardona <javier@cozybit.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2012-11-16mac80211: fix SSID copy on IBSS JOINAntonio Quartulli
commit badecb001a310408d3473b1fc2ed5aefd0bc92a9 upstream. The 'ssid' field of the cfg80211_ibss_params is a u8 pointer and its length is likely to be less than IEEE80211_MAX_SSID_LEN most of the time. This patch fixes the ssid copy in ieee80211_ibss_join() by using the SSID length to prevent it from reading beyond the string. Signed-off-by: Antonio Quartulli <ordex@autistici.org> [rewrapped commit message, small rewording] Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2012-11-16mac80211: don't inspect Sequence Control field on control framesJavier Cardona
commit f7fbf70ee9db6da6033ae50d100e017ac1f26555 upstream. Per IEEE Std. 802.11-2012, Sec 8.2.4.4.1, the sequence Control field is not present in control frames. We noticed this problem when processing Block Ack Requests. Signed-off-by: Javier Cardona <javier@cozybit.com> Signed-off-by: Javier Lopez <jlopex@cozybit.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2012-11-16mac80211: Only process mesh config header on frames that RA_MATCHJavier Cardona
commit 555cb715be8ef98b8ec362b23dfc254d432a35b1 upstream. Doing otherwise is wrong, and may wreak havoc on the mpp tables, specially if the frame is encrypted. Reported-by: Chaoxing Lin <Chaoxing.Lin@ultra-3eti.com> Signed-off-by: Javier Cardona <javier@cozybit.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> [bwh: Backported to 3.2: we have a large block conditional on IEEE80211_RX_RA_MATCH rather than a goto conditional on the opposite, so delete the condition] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2012-11-16cfg80211: fix antenna gain handlingFelix Fietkau
commit c4a9fafc77a5318f5ed26c509bbcddf03e18c201 upstream. No driver initializes chan->max_antenna_gain to something sensible, and the only place where it is being used right now is inside ath9k. This leads to ath9k potentially using less tx power than it can use, which can decrease performance/range in some rare cases. Rather than going through every single driver, this patch initializes chan->orig_mag in wiphy_register(), ignoring whatever value the driver left in there. If a driver for some reason wishes to limit it independent from regulatory rulesets, it can do so internally. Signed-off-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2012-11-16rtnetlink: fix rtnl_calcit() and rtnl_dump_ifinfo()Eric Dumazet
commit a4b64fbe482c7766f7925f03067fc637716bfa3f upstream. nlmsg_parse() might return an error, so test its return value before potential random memory accesses. Errors introduced in commit 115c9b81928 (rtnetlink: Fix problem with buffer allocation) Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Greg Rose <gregory.v.rose@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2012-11-16rtnetlink: Fix problem with buffer allocationGreg Rose
commit 115c9b81928360d769a76c632bae62d15206a94a upstream. Implement a new netlink attribute type IFLA_EXT_MASK. The mask is a 32 bit value that can be used to indicate to the kernel that certain extended ifinfo values are requested by the user application. At this time the only mask value defined is RTEXT_FILTER_VF to indicate that the user wants the ifinfo dump to send information about the VFs belonging to the interface. This patch fixes a bug in which certain applications do not have large enough buffers to accommodate the extra information returned by the kernel with large numbers of SR-IOV virtual functions. Those applications will not send the new netlink attribute with the interface info dump request netlink messages so they will not get unexpectedly large request buffers returned by the kernel. Modifies the rtnl_calcit function to traverse the list of net devices and compute the minimum buffer size that can hold the info dumps of all matching devices based upon the filter passed in via the new netlink attribute filter mask. If no filter mask is sent then the buffer allocation defaults to NLMSG_GOODSIZE. With this change it is possible to add yet to be defined netlink attributes to the dump request which should make it fairly extensible in the future. Signed-off-by: Greg Rose <gregory.v.rose@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> [bwh: Backported to 3.2: drop the change in do_setlink() that reverts commit f18da14565819ba43b8321237e2426a2914cc2ef, which we never applied] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2012-11-16Bluetooth: Avoid calling undefined smp_conn_security()Ben Hutchings
Commit ff03261adc8b4bdd8291f1783c079b53a892b429 ('Bluetooth: Fix sending a HCI Authorization Request over LE links', commit d8343f125710fb596f7a88cd756679f14f4e77b9 upstream) added an call to smp_conn_security() from hci_conn_security(). The former is only defined if CONFIG_BT_L2CAP=y. This is not required in mainline since commit f1e91e1640d808d332498a6b09b2bcd01462eff9 ('Bluetooth: Always compile SCO and L2CAP in Bluetooth Core') removed that option. Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2012-10-30netfilter: nf_conntrack: fix racy timer handling with reliable eventsPablo Neira Ayuso
commit 5b423f6a40a0327f9d40bc8b97ce9be266f74368 upstream. Existing code assumes that del_timer returns true for alive conntrack entries. However, this is not true if reliable events are enabled. In that case, del_timer may return true for entries that were just inserted in the dying list. Note that packets / ctnetlink may hold references to conntrack entries that were just inserted to such list. This patch fixes the issue by adding an independent timer for event delivery. This increases the size of the ecache extension. Still we can revisit this later and use variable size extensions to allocate this area on demand. Tested-by: Oliver Smith <olipro@8.c.9.b.0.7.4.0.1.0.0.2.ip6.arpa> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: David Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2012-10-30SUNRPC: Get rid of the xs_error_report socket callbackTrond Myklebust
commit f878b657ce8e7d3673afe48110ec208a29e38c4a upstream. Chris Perl reports that we're seeing races between the wakeup call in xs_error_report and the connect attempts. Basically, Chris has shown that in certain circumstances, the call to xs_error_report causes the rpc_task that is responsible for reconnecting to wake up early, thus triggering a disconnect and retry. Since the sk->sk_error_report() calls in the socket layer are always followed by a tcp_done() in the cases where we care about waking up the rpc_tasks, just let the state_change callbacks take responsibility for those wake ups. Reported-by: Chris Perl <chris.perl@gmail.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Tested-by: Chris Perl <chris.perl@gmail.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2012-10-30SUNRPC: Prevent races in xs_abort_connection()Trond Myklebust
commit 4bc1e68ed6a8b59be8a79eb719be515a55c7bc68 upstream. The call to xprt_disconnect_done() that is triggered by a successful connection reset will trigger another automatic wakeup of all tasks on the xprt->pending rpc_wait_queue. In particular it will cause an early wake up of the task that called xprt_connect(). All we really want to do here is clear all the socket-specific state flags, so we split that functionality out of xs_sock_mark_closed() into a helper that can be called by xs_abort_connection() Reported-by: Chris Perl <chris.perl@gmail.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Tested-by: Chris Perl <chris.perl@gmail.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2012-10-30Revert "SUNRPC: Ensure we close the socket on EPIPE errors too..."Trond Myklebust
commit b9d2bb2ee537424a7f855e1f93eed44eb9ee0854 upstream. This reverts commit 55420c24a0d4d1fce70ca713f84aa00b6b74a70e. Now that we clear the connected flag when entering TCP_CLOSE_WAIT, the deadlock described in this commit is no longer possible. Instead, the resulting call to xs_tcp_shutdown() can interfere with pending reconnection attempts. Reported-by: Chris Perl <chris.perl@gmail.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Tested-by: Chris Perl <chris.perl@gmail.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2012-10-30SUNRPC: Clear the connect flag when socket state is TCP_CLOSE_WAITTrond Myklebust
commit d0bea455dd48da1ecbd04fedf00eb89437455fdc upstream. This is needed to ensure that we call xprt_connect() upon the next call to call_connect(). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Tested-by: Chris Perl <chris.perl@gmail.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2012-10-30SUNRPC: Prevent kernel stack corruption on long values of flushSasha Levin
commit 212ba90696ab4884e2025b0b13726d67aadc2cd4 upstream. The buffer size in read_flush() is too small for the longest possible values for it. This can lead to a kernel stack corruption: [ 43.047329] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: ffffffff833e64b4 [ 43.047329] [ 43.049030] Pid: 6015, comm: trinity-child18 Tainted: G W 3.5.0-rc7-next-20120716-sasha #221 [ 43.050038] Call Trace: [ 43.050435] [<ffffffff836c60c2>] panic+0xcd/0x1f4 [ 43.050931] [<ffffffff833e64b4>] ? read_flush.isra.7+0xe4/0x100 [ 43.051602] [<ffffffff810e94e6>] __stack_chk_fail+0x16/0x20 [ 43.052206] [<ffffffff833e64b4>] read_flush.isra.7+0xe4/0x100 [ 43.052951] [<ffffffff833e6500>] ? read_flush_pipefs+0x30/0x30 [ 43.053594] [<ffffffff833e652c>] read_flush_procfs+0x2c/0x30 [ 43.053596] [<ffffffff812b9a8c>] proc_reg_read+0x9c/0xd0 [ 43.053596] [<ffffffff812b99f0>] ? proc_reg_write+0xd0/0xd0 [ 43.053596] [<ffffffff81250d5b>] do_loop_readv_writev+0x4b/0x90 [ 43.053596] [<ffffffff81250fd6>] do_readv_writev+0xf6/0x1d0 [ 43.053596] [<ffffffff812510ee>] vfs_readv+0x3e/0x60 [ 43.053596] [<ffffffff812511b8>] sys_readv+0x48/0xb0 [ 43.053596] [<ffffffff8378167d>] system_call_fastpath+0x1a/0x1f Signed-off-by: Sasha Levin <levinsasha928@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2012-10-30mac80211: check if key has TKIP type before updating IVStanislaw Gruszka
commit 4045f72bcf3c293c7c5932ef001742d8bb5ded76 upstream. This patch fix corruption which can manifest itself by following crash when switching on rfkill switch with rt2x00 driver: https://bugzilla.redhat.com/attachment.cgi?id=615362 Pointer key->u.ccmp.tfm of group key get corrupted in: ieee80211_rx_h_michael_mic_verify(): /* update IV in key information to be able to detect replays */ rx->key->u.tkip.rx[rx->security_idx].iv32 = rx->tkip_iv32; rx->key->u.tkip.rx[rx->security_idx].iv16 = rx->tkip_iv16; because rt2x00 always set RX_FLAG_MMIC_STRIPPED, even if key is not TKIP. We already check type of the key in different path in ieee80211_rx_h_michael_mic_verify() function, so adding additional check here is reasonable. Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2012-10-30Bluetooth: SMP: Fix setting unknown auth_req bitsJohan Hedberg
commit 065a13e2cc665f6547dc7e8a9d6b6565badf940a upstream. When sending a pairing request or response we should not just blindly copy the value that the remote device sent. Instead we should at least make sure to mask out any unknown bits. This is particularly critical from the upcoming LE Secure Connections feature perspective as incorrectly indicating support for it (by copying the remote value) would cause a failure to pair with devices that support it. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Acked-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2012-10-30pktgen: fix crash when generating IPv6 packetsAmerigo Wang
commit 5aa8b572007c4bca1e6d3dd4c4820f1ae49d6bb2 upstream. For IPv6, sizeof(struct ipv6hdr) = 40, thus the following expression will result negative: datalen = pkt_dev->cur_pkt_size - 14 - sizeof(struct ipv6hdr) - sizeof(struct udphdr) - pkt_dev->pkt_overhead; And, the check "if (datalen < sizeof(struct pktgen_hdr))" will be passed as "datalen" is promoted to unsigned, therefore will cause a crash later. This is a quick fix by checking if "datalen" is negative. The following patch will increase the default value of 'min_pkt_size' for IPv6. This bug should exist for a long time, so Cc -stable too. Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2012-10-30tcp: resets are misroutedAlexey Kuznetsov
[ Upstream commit 4c67525849e0b7f4bd4fab2487ec9e43ea52ef29 ] After commit e2446eaa ("tcp_v4_send_reset: binding oif to iif in no sock case").. tcp resets are always lost, when routing is asymmetric. Yes, backing out that patch will result in misrouting of resets for dead connections which used interface binding when were alive, but we actually cannot do anything here. What's died that's died and correct handling normal unbound connections is obviously a priority. Comment to comment: > This has few benefits: > 1. tcp_v6_send_reset already did that. It was done to route resets for IPv6 link local addresses. It was a mistake to do so for global addresses. The patch fixes this as well. Actually, the problem appears to be even more serious than guaranteed loss of resets. As reported by Sergey Soloviev <sol@eqv.ru>, those misrouted resets create a lot of arp traffic and huge amount of unresolved arp entires putting down to knees NAT firewalls which use asymmetric routing. Signed-off-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2012-10-30RDS: fix rds-ping spinlock recursionjeff.liu
[ Upstream commit 5175a5e76bbdf20a614fb47ce7a38f0f39e70226 ] This is the revised patch for fixing rds-ping spinlock recursion according to Venkat's suggestions. RDS ping/pong over TCP feature has been broken for years(2.6.39 to 3.6.0) since we have to set TCP cork and call kernel_sendmsg() between ping/pong which both need to lock "struct sock *sk". However, this lock has already been hold before rds_tcp_data_ready() callback is triggerred. As a result, we always facing spinlock resursion which would resulting in system panic. Given that RDS ping is only used to test the connectivity and not for serious performance measurements, we can queue the pong transmit to rds_wq as a delayed response. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> CC: Venkat Venkatsubra <venkat.x.venkatsubra@oracle.com> CC: David S. Miller <davem@davemloft.net> CC: James Morris <james.l.morris@oracle.com> Signed-off-by: Jie Liu <jeff.liu@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2012-10-30vlan: don't deliver frames for unknown vlans to protocolsFlorian Zumbiehl
[ Upstream commit 48cc32d38a52d0b68f91a171a8d00531edc6a46e ] 6a32e4f9dd9219261f8856f817e6655114cfec2f made the vlan code skip marking vlan-tagged frames for not locally configured vlans as PACKET_OTHERHOST if there was an rx_handler, as the rx_handler could cause the frame to be received on a different (virtual) vlan-capable interface where that vlan might be configured. As rx_handlers do not necessarily return RX_HANDLER_ANOTHER, this could cause frames for unknown vlans to be delivered to the protocol stack as if they had been received untagged. For example, if an ipv6 router advertisement that's tagged for a locally not configured vlan is received on an interface with macvlan interfaces attached, macvlan's rx_handler returns RX_HANDLER_PASS after delivering the frame to the macvlan interfaces, which caused it to be passed to the protocol stack, leading to ipv6 addresses for the announced prefix being configured even though those are completely unusable on the underlying interface. The fix moves marking as PACKET_OTHERHOST after the rx_handler so the rx_handler, if there is one, sees the frame unchanged, but afterwards, before the frame is delivered to the protocol stack, it gets marked whether there is an rx_handler or not. Signed-off-by: Florian Zumbiehl <florz@florz.de> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2012-10-30net: Fix skb_under_panic oops in neigh_resolve_outputramesh.nagappa@gmail.com
[ Upstream commit e1f165032c8bade3a6bdf546f8faf61fda4dd01c ] The retry loop in neigh_resolve_output() and neigh_connected_output() call dev_hard_header() with out reseting the skb to network_header. This causes the retry to fail with skb_under_panic. The fix is to reset the network_header within the retry loop. Signed-off-by: Ramesh Nagappa <ramesh.nagappa@ericsson.com> Reviewed-by: Shawn Lu <shawn.lu@ericsson.com> Reviewed-by: Robert Coulson <robert.coulson@ericsson.com> Reviewed-by: Billie Alsup <billie.alsup@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2012-10-30SUNRPC: Set alloc_slot for backchannel tcp opsBryan Schumaker
commit 84e28a307e376f271505af65a7b7e212dd6f61f4 upstream. f39c1bfb5a03e2d255451bff05be0d7255298fa4 (SUNRPC: Fix a UDP transport regression) introduced the "alloc_slot" function for xprt operations, but never created one for the backchannel operations. This patch fixes a null pointer dereference when mounting NFS over v4.1. Call Trace: [<ffffffffa0207957>] ? xprt_reserve+0x47/0x50 [sunrpc] [<ffffffffa02023a4>] call_reserve+0x34/0x60 [sunrpc] [<ffffffffa020e280>] __rpc_execute+0x90/0x400 [sunrpc] [<ffffffffa020e61a>] rpc_async_schedule+0x2a/0x40 [sunrpc] [<ffffffff81073589>] process_one_work+0x139/0x500 [<ffffffff81070e70>] ? alloc_worker+0x70/0x70 [<ffffffffa020e5f0>] ? __rpc_execute+0x400/0x400 [sunrpc] [<ffffffff81073d1e>] worker_thread+0x15e/0x460 [<ffffffff8145c839>] ? preempt_schedule+0x49/0x70 [<ffffffff81073bc0>] ? rescuer_thread+0x230/0x230 [<ffffffff81079603>] kthread+0x93/0xa0 [<ffffffff81465d04>] kernel_thread_helper+0x4/0x10 [<ffffffff81079570>] ? kthread_freezable_should_stop+0x70/0x70 [<ffffffff81465d00>] ? gs_change+0x13/0x13 Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> [bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2012-10-17netfilter: xt_limit: have r->cost != 0 case workJan Engelhardt
commit 82e6bfe2fbc4d48852114c4f979137cd5bf1d1a8 upstream. Commit v2.6.19-rc1~1272^2~41 tells us that r->cost != 0 can happen when a running state is saved to userspace and then reinstated from there. Make sure that private xt_limit area is initialized with correct values. Otherwise, random matchings due to use of uninitialized memory. Signed-off-by: Jan Engelhardt <jengelh@inai.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2012-10-17netfilter: limit, hashlimit: avoid duplicated inlineFlorian Westphal
commit 7a909ac70f6b0823d9f23a43f19598d4b57ac901 upstream. credit_cap can be set to credit, which avoids inlining user2credits twice. Also, remove inline keyword and let compiler decide. old: 684 192 0 876 36c net/netfilter/xt_limit.o 4927 344 32 5303 14b7 net/netfilter/xt_hashlimit.o now: 668 192 0 860 35c net/netfilter/xt_limit.o 4793 344 32 5169 1431 net/netfilter/xt_hashlimit.o Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2012-10-17netfilter: nf_ct_expect: fix possible access to uninitialized timerPablo Neira Ayuso
commit 2614f86490122bf51eb7c12ec73927f1900f4e7d upstream. In __nf_ct_expect_check, the function refresh_timer returns 1 if a matching expectation is found and its timer is successfully refreshed. This results in nf_ct_expect_related returning 0. Note that at this point: - the passed expectation is not inserted in the expectation table and its timer was not initialized, since we have refreshed one matching/existing expectation. - nf_ct_expect_alloc uses kmem_cache_alloc, so the expectation timer is in some undefined state just after the allocation, until it is appropriately initialized. This can be a problem for the SIP helper during the expectation addition: ... if (nf_ct_expect_related(rtp_exp) == 0) { if (nf_ct_expect_related(rtcp_exp) != 0) nf_ct_unexpect_related(rtp_exp); ... Note that nf_ct_expect_related(rtp_exp) may return 0 for the timer refresh case that is detailed above. Then, if nf_ct_unexpect_related(rtcp_exp) returns != 0, nf_ct_unexpect_related(rtp_exp) is called, which does: spin_lock_bh(&nf_conntrack_lock); if (del_timer(&exp->timeout)) { nf_ct_unlink_expect(exp); nf_ct_expect_put(exp); } spin_unlock_bh(&nf_conntrack_lock); Note that del_timer always returns false if the timer has been initialized. However, the timer was not initialized since setup_timer was not called, therefore, the expectation timer remains in some undefined state. If I'm not missing anything, this may lead to the removal an unexistent expectation. To fix this, the optimization that allows refreshing an expectation is removed. Now nf_conntrack_expect_related looks more consistent to me since it always add the expectation in case that it returns success. Thanks to Patrick McHardy for participating in the discussion of this patch. I think this may be the source of the problem described by: http://marc.info/?l=netfilter-devel&m=134073514719421&w=2 Reported-by: Rafal Fitt <rafalf@aplusc.com.pl> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2012-10-17netfilter: nf_nat_sip: fix via header translation with multiple parametersPatrick McHardy
commit f22eb25cf5b1157b29ef88c793b71972efc47143 upstream. Via-headers are parsed beginning at the first character after the Via-address. When the address is translated first and its length decreases, the offset to start parsing at is incorrect and header parameters might be missed. Update the offset after translating the Via-address to fix this. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2012-10-17netfilter: nf_nat_sip: fix incorrect handling of EBUSY for RTCP expectationPablo Neira Ayuso
commit 3f509c689a07a4aa989b426893d8491a7ffcc410 upstream. We're hitting bug while trying to reinsert an already existing expectation: kernel BUG at kernel/timer.c:895! invalid opcode: 0000 [#1] SMP [...] Call Trace: <IRQ> [<ffffffffa0069563>] nf_ct_expect_related_report+0x4a0/0x57a [nf_conntrack] [<ffffffff812d423a>] ? in4_pton+0x72/0x131 [<ffffffffa00ca69e>] ip_nat_sdp_media+0xeb/0x185 [nf_nat_sip] [<ffffffffa00b5b9b>] set_expected_rtp_rtcp+0x32d/0x39b [nf_conntrack_sip] [<ffffffffa00b5f15>] process_sdp+0x30c/0x3ec [nf_conntrack_sip] [<ffffffff8103f1eb>] ? irq_exit+0x9a/0x9c [<ffffffffa00ca738>] ? ip_nat_sdp_media+0x185/0x185 [nf_nat_sip] We have to remove the RTP expectation if the RTCP expectation hits EBUSY since we keep trying with other ports until we succeed. Reported-by: Rafal Fitt <rafalf@aplusc.com.pl> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2012-10-17netfilter: nf_ct_ipv4: packets with wrong ihl are invalidJozsef Kadlecsik
commit 07153c6ec074257ade76a461429b567cff2b3a1e upstream. It was reported that the Linux kernel sometimes logs: klogd: [2629147.402413] kernel BUG at net / netfilter / nf_conntrack_proto_tcp.c: 447! klogd: [1072212.887368] kernel BUG at net / netfilter / nf_conntrack_proto_tcp.c: 392 ipv4_get_l4proto() in nf_conntrack_l3proto_ipv4.c and tcp_error() in nf_conntrack_proto_tcp.c should catch malformed packets, so the errors at the indicated lines - TCP options parsing - should not happen. However, tcp_error() relies on the "dataoff" offset to the TCP header, calculated by ipv4_get_l4proto(). But ipv4_get_l4proto() does not check bogus ihl values in IPv4 packets, which then can slip through tcp_error() and get caught at the TCP options parsing routines. The patch fixes ipv4_get_l4proto() by invalidating packets with bogus ihl value. The patch closes netfilter bugzilla id 771. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2012-10-17SUNRPC: Ensure that the TCP socket is closed when in CLOSE_WAITTrond Myklebust
commit a519fc7a70d1a918574bb826cc6905b87b482eb9 upstream. Instead of doing a shutdown() call, we need to do an actual close(). Ditto if/when the server is sending us junk RPC headers. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Tested-by: Simon Kirby <sim@hostway.ca> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2012-10-10net: do not disable sg for packets requiring no checksumEd Cashin
[ Upstream commit c0d680e577ff171e7b37dbdb1b1bf5451e851f04 ] A change in a series of VLAN-related changes appears to have inadvertently disabled the use of the scatter gather feature of network cards for transmission of non-IP ethernet protocols like ATA over Ethernet (AoE). Below is a reference to the commit that introduces a "harmonize_features" function that turns off scatter gather when the NIC does not support hardware checksumming for the ethernet protocol of an sk buff. commit f01a5236bd4b140198fbcc550f085e8361fd73fa Author: Jesse Gross <jesse@nicira.com> Date: Sun Jan 9 06:23:31 2011 +0000 net offloading: Generalize netif_get_vlan_features(). The can_checksum_protocol function is not equipped to consider a protocol that does not require checksumming. Calling it for a protocol that requires no checksum is inappropriate. The patch below has harmonize_features call can_checksum_protocol when the protocol needs a checksum, so that the network layer is not forced to perform unnecessary skb linearization on the transmission of AoE packets. Unnecessary linearization results in decreased performance and increased memory pressure, as reported here: http://www.spinics.net/lists/linux-mm/msg15184.html The problem has probably not been widely experienced yet, because only recently has the kernel.org-distributed aoe driver acquired the ability to use payloads of over a page in size, with the patchset recently included in the mm tree: https://lkml.org/lkml/2012/8/28/140 The coraid.com-distributed aoe driver already could use payloads of greater than a page in size, but its users generally do not use the newest kernels. Signed-off-by: Ed Cashin <ecashin@coraid.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2012-10-10netrom: copy_datagram_iovec can failAlan Cox
[ Upstream commit 6cf5c951175abcec4da470c50565cc0afe6cd11d ] Check for an error from this and if so bail properly. Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2012-10-10l2tp: fix a typo in l2tp_eth_dev_recv()Eric Dumazet
[ Upstream commit c0cc88a7627c333de50b07b7c60b1d49d9d2e6cc ] While investigating l2tp bug, I hit a bug in eth_type_trans(), because not enough bytes were pulled in skb head. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2012-10-10ipv6: mip6: fix mip6_mh_filter()Eric Dumazet
[ Upstream commit 96af69ea2a83d292238bdba20e4508ee967cf8cb ] mip6_mh_filter() should not modify its input, or else its caller would need to recompute ipv6_hdr() if skb->head is reallocated. Use skb_header_pointer() instead of pskb_may_pull() Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2012-10-10ipv6: raw: fix icmpv6_filter()Eric Dumazet
[ Upstream commit 1b05c4b50edbddbdde715c4a7350629819f6655e ] icmpv6_filter() should not modify its input, or else its caller would need to recompute ipv6_hdr() if skb->head is reallocated. Use skb_header_pointer() instead of pskb_may_pull() and change the prototype to make clear both sk and skb are const. Also, if icmpv6 header cannot be found, do not deliver the packet, as we do in IPv4. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2012-10-10ipv4: raw: fix icmp_filter()Eric Dumazet
[ Upstream commit ab43ed8b7490cb387782423ecf74aeee7237e591 ] icmp_filter() should not modify its input, or else its caller would need to recompute ip_hdr() if skb->head is reallocated. Use skb_header_pointer() instead of pskb_may_pull() and change the prototype to make clear both sk and skb are const. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2012-10-10net: guard tcp_set_keepalive() to tcp socketsEric Dumazet
[ Upstream commit 3e10986d1d698140747fcfc2761ec9cb64c1d582 ] Its possible to use RAW sockets to get a crash in tcp_set_keepalive() / sk_reset_timer() Fix is to make sure socket is a SOCK_STREAM one. Reported-by: Dave Jones <davej@redhat.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>