<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/net/ipv4, branch v3.2.38</title>
<subtitle>Linux kernel source tree</subtitle>
<id>https://git.amat.us/linux/atom/net/ipv4?h=v3.2.38</id>
<link rel='self' href='https://git.amat.us/linux/atom/net/ipv4?h=v3.2.38'/>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/'/>
<updated>2013-01-16T01:13:27Z</updated>
<entry>
<title>tcp: RFC 5961 5.2 Blind Data Injection Attack Mitigation</title>
<updated>2013-01-16T01:13:27Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2012-10-21T19:57:11Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=e252bbd8c87b95e9cecdc01350fbb0b46a0f9bf1'/>
<id>urn:sha1:e252bbd8c87b95e9cecdc01350fbb0b46a0f9bf1</id>
<content type='text'>
[ Upstream commit 354e4aa391ed50a4d827ff6fc11e0667d0859b25 ]

RFC 5961 5.2 [Blind Data Injection Attack].[Mitigation]

  All TCP stacks MAY implement the following mitigation.  TCP stacks
  that implement this mitigation MUST add an additional input check to
  any incoming segment.  The ACK value is considered acceptable only if
  it is in the range of ((SND.UNA - MAX.SND.WND) &lt;= SEG.ACK &lt;=
  SND.NXT).  All incoming segments whose ACK value doesn't satisfy the
  above condition MUST be discarded and an ACK sent back.

Move tcp_send_challenge_ack() before tcp_ack() to avoid a forward
declaration.

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Neal Cardwell &lt;ncardwell@google.com&gt;
Cc: Yuchung Cheng &lt;ycheng@google.com&gt;
Cc: Jerry Chu &lt;hkchu@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
</entry>
<entry>
<title>tcp: tcp_replace_ts_recent() should not be called from tcp_validate_incoming()</title>
<updated>2013-01-16T01:13:26Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2012-11-13T05:37:18Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=9ae46af9cdaaac4938974c51ad7db2b8dc60ff83'/>
<id>urn:sha1:9ae46af9cdaaac4938974c51ad7db2b8dc60ff83</id>
<content type='text'>
[ Upstream commit bd090dfc634ddd711a5fbd0cadc6e0ab4977bcaf ]

We added support for RFC 5961 in latest kernels but TCP fails
to perform exhaustive check of ACK sequence.

We can update our view of peer tsval from a frame that is
later discarded by tcp_ack()

This makes timestamps enabled sessions vulnerable to injection of
a high tsval : peers start an ACK storm, since the victim
sends a dupack each time it receives an ACK from the other peer.

As tcp_validate_incoming() is called before tcp_ack(), we should
not peform tcp_replace_ts_recent() from it, and let callers do it
at the right time.

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Neal Cardwell &lt;ncardwell@google.com&gt;
Cc: Yuchung Cheng &lt;ycheng@google.com&gt;
Cc: Nandita Dukkipati &lt;nanditad@google.com&gt;
Cc: H.K. Jerry Chu &lt;hkchu@google.com&gt;
Cc: Romain Francoise &lt;romain@orebokech.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
</entry>
<entry>
<title>tcp: refine SYN handling in tcp_validate_incoming</title>
<updated>2013-01-16T01:13:26Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2012-07-17T12:29:30Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=d37f92d306c41ebd908bcdef373dea512b17cafb'/>
<id>urn:sha1:d37f92d306c41ebd908bcdef373dea512b17cafb</id>
<content type='text'>
[ Upstream commit e371589917011efe6ff8c7dfb4e9e81934ac5855 ]

Followup of commit 0c24604b68fc (tcp: implement RFC 5961 4.2)

As reported by Vijay Subramanian, we should send a challenge ACK
instead of a dup ack if a SYN flag is set on a packet received out of
window.

This permits the ratelimiting to work as intended, and to increase
correct SNMP counters.

Suggested-by: Vijay Subramanian &lt;subramanian.vijay@gmail.com&gt;
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Acked-by: Vijay Subramanian &lt;subramanian.vijay@gmail.com&gt;
Cc: Kiran Kumar Kella &lt;kkiran@broadcom.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
</entry>
<entry>
<title>tcp: implement RFC 5961 4.2</title>
<updated>2013-01-16T01:13:25Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2012-07-17T01:41:30Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=481079c4df95e11d3893b92fa4000f58e1cd713b'/>
<id>urn:sha1:481079c4df95e11d3893b92fa4000f58e1cd713b</id>
<content type='text'>
[ Upstream commit 0c24604b68fc7810d429d6c3657b6f148270e528 ]

Implement the RFC 5691 mitigation against Blind
Reset attack using SYN bit.

Section 4.2 of RFC 5961 advises to send a Challenge ACK and drop
incoming packet, instead of resetting the session.

Add a new SNMP counter to count number of challenge acks sent
in response to SYN packets.
(netstat -s | grep TCPSYNChallenge)

Remove obsolete TCPAbortOnSyn, since we no longer abort a TCP session
because of a SYN flag.

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Kiran Kumar Kella &lt;kkiran@broadcom.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
</entry>
<entry>
<title>tcp: implement RFC 5961 3.2</title>
<updated>2013-01-16T01:13:25Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2012-07-17T08:13:05Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=61f69dc4e40e41b0018f00fa4aeb23d3239556fb'/>
<id>urn:sha1:61f69dc4e40e41b0018f00fa4aeb23d3239556fb</id>
<content type='text'>
[ Upstream commit 282f23c6ee343126156dd41218b22ece96d747e3 ]

Implement the RFC 5691 mitigation against Blind
Reset attack using RST bit.

Idea is to validate incoming RST sequence,
to match RCV.NXT value, instead of previouly accepted
window : (RCV.NXT &lt;= SEG.SEQ &lt; RCV.NXT+RCV.WND)

If sequence is in window but not an exact match, send
a "challenge ACK", so that the other part can resend an
RST with the appropriate sequence.

Add a new sysctl, tcp_challenge_ack_limit, to limit
number of challenge ACK sent per second.

Add a new SNMP counter to count number of challenge acks sent.
(netstat -s | grep TCPChallengeACK)

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Kiran Kumar Kella &lt;kkiran@broadcom.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
</entry>
<entry>
<title>inet: Fix kmemleak in tcp_v4/6_syn_recv_sock and dccp_v4/6_request_recv_sock</title>
<updated>2013-01-16T01:13:24Z</updated>
<author>
<name>Christoph Paasch</name>
<email>christoph.paasch@uclouvain.be</email>
</author>
<published>2012-12-14T04:07:58Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=9c68c2b7558ca787ad75075eb3f4e106033ed2e7'/>
<id>urn:sha1:9c68c2b7558ca787ad75075eb3f4e106033ed2e7</id>
<content type='text'>
[ Upstream commit e337e24d6624e74a558aa69071e112a65f7b5758 ]

If in either of the above functions inet_csk_route_child_sock() or
__inet_inherit_port() fails, the newsk will not be freed:

unreferenced object 0xffff88022e8a92c0 (size 1592):
  comm "softirq", pid 0, jiffies 4294946244 (age 726.160s)
  hex dump (first 32 bytes):
    0a 01 01 01 0a 01 01 02 00 00 00 00 a7 cc 16 00  ................
    02 00 03 01 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [&lt;ffffffff8153d190&gt;] kmemleak_alloc+0x21/0x3e
    [&lt;ffffffff810ab3e7&gt;] kmem_cache_alloc+0xb5/0xc5
    [&lt;ffffffff8149b65b&gt;] sk_prot_alloc.isra.53+0x2b/0xcd
    [&lt;ffffffff8149b784&gt;] sk_clone_lock+0x16/0x21e
    [&lt;ffffffff814d711a&gt;] inet_csk_clone_lock+0x10/0x7b
    [&lt;ffffffff814ebbc3&gt;] tcp_create_openreq_child+0x21/0x481
    [&lt;ffffffff814e8fa5&gt;] tcp_v4_syn_recv_sock+0x3a/0x23b
    [&lt;ffffffff814ec5ba&gt;] tcp_check_req+0x29f/0x416
    [&lt;ffffffff814e8e10&gt;] tcp_v4_do_rcv+0x161/0x2bc
    [&lt;ffffffff814eb917&gt;] tcp_v4_rcv+0x6c9/0x701
    [&lt;ffffffff814cea9f&gt;] ip_local_deliver_finish+0x70/0xc4
    [&lt;ffffffff814cec20&gt;] ip_local_deliver+0x4e/0x7f
    [&lt;ffffffff814ce9f8&gt;] ip_rcv_finish+0x1fc/0x233
    [&lt;ffffffff814cee68&gt;] ip_rcv+0x217/0x267
    [&lt;ffffffff814a7bbe&gt;] __netif_receive_skb+0x49e/0x553
    [&lt;ffffffff814a7cc3&gt;] netif_receive_skb+0x50/0x82

This happens, because sk_clone_lock initializes sk_refcnt to 2, and thus
a single sock_put() is not enough to free the memory. Additionally, things
like xfrm, memcg, cookie_values,... may have been initialized.
We have to free them properly.

This is fixed by forcing a call to tcp_done(), ending up in
inet_csk_destroy_sock, doing the final sock_put(). tcp_done() is necessary,
because it ends up doing all the cleanup on xfrm, memcg, cookie_values,
xfrm,...

Before calling tcp_done, we have to set the socket to SOCK_DEAD, to
force it entering inet_csk_destroy_sock. To avoid the warning in
inet_csk_destroy_sock, inet_num has to be set to 0.
As inet_csk_destroy_sock does a dec on orphan_count, we first have to
increase it.

Calling tcp_done() allows us to remove the calls to
tcp_clear_xmit_timer() and tcp_cleanup_congestion_control().

A similar approach is taken for dccp by calling dccp_done().

This is in the kernel since 093d282321 (tproxy: fix hash locking issue
when using port redirection in __inet_inherit_port()), thus since
version &gt;= 2.6.37.

Signed-off-by: Christoph Paasch &lt;christoph.paasch@uclouvain.be&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
</entry>
<entry>
<title>ipv4: ip_check_defrag must not modify skb before unsharing</title>
<updated>2013-01-03T03:33:52Z</updated>
<author>
<name>Johannes Berg</name>
<email>johannes.berg@intel.com</email>
</author>
<published>2012-12-09T23:41:06Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=3bbb42f09dd86794f0ae82bdcbd511810d4ff2db'/>
<id>urn:sha1:3bbb42f09dd86794f0ae82bdcbd511810d4ff2db</id>
<content type='text'>
[ Upstream commit 1bf3751ec90cc3174e01f0d701e8449ce163d113 ]

ip_check_defrag() might be called from af_packet within the
RX path where shared SKBs are used, so it must not modify
the input SKB before it has unshared it for defragmentation.
Use skb_copy_bits() to get the IP header and only pull in
everything later.

The same is true for the other caller in macvlan as it is
called from dev-&gt;rx_handler which can also get a shared SKB.

Reported-by: Eric Leblond &lt;eric@regit.org&gt;
Signed-off-by: Johannes Berg &lt;johannes.berg@intel.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
</entry>
<entry>
<title>ipv4: avoid undefined behavior in do_ip_setsockopt()</title>
<updated>2012-12-06T11:20:11Z</updated>
<author>
<name>Xi Wang</name>
<email>xi.wang@gmail.com</email>
</author>
<published>2012-11-11T11:20:01Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=58fda05f80418cf457e3425d8b58b15262198398'/>
<id>urn:sha1:58fda05f80418cf457e3425d8b58b15262198398</id>
<content type='text'>
[ Upstream commit 0c9f79be295c99ac7e4b569ca493d75fdcc19e4e ]

(1&lt;&lt;optname) is undefined behavior in C with a negative optname or
optname larger than 31.  In those cases the result of the shift is
not necessarily zero (e.g., on x86).

This patch simplifies the code with a switch statement on optname.
It also allows the compiler to generate better code (e.g., using a
64-bit mask).

Signed-off-by: Xi Wang &lt;xi.wang@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
</entry>
<entry>
<title>netfilter: nf_nat: don't check for port change on ICMP tuples</title>
<updated>2012-12-06T11:20:10Z</updated>
<author>
<name>Ulrich Weber</name>
<email>ulrich.weber@sophos.com</email>
</author>
<published>2012-10-25T05:34:45Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=5c1972f1b4f784560003075e05dcd95c1db8116e'/>
<id>urn:sha1:5c1972f1b4f784560003075e05dcd95c1db8116e</id>
<content type='text'>
commit 38fe36a248ec3228f8e6507955d7ceb0432d2000 upstream.

ICMP tuples have id in src and type/code in dst.
So comparing src.u.all with dst.u.all will always fail here
and ip_xfrm_me_harder() is called for every ICMP packet,
even if there was no NAT.

Signed-off-by: Ulrich Weber &lt;ulrich.weber@sophos.com&gt;
[Pablo Neira Ayuso: Backported to.3.0]
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
</entry>
<entry>
<title>net: fix divide by zero in tcp algorithm illinois</title>
<updated>2012-11-16T16:47:16Z</updated>
<author>
<name>Jesper Dangaard Brouer</name>
<email>brouer@redhat.com</email>
</author>
<published>2012-10-31T02:45:32Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=df769f065d7bebf0ddc5f61605dbb1d8ea5ee2d8'/>
<id>urn:sha1:df769f065d7bebf0ddc5f61605dbb1d8ea5ee2d8</id>
<content type='text'>
[ Upstream commit 8f363b77ee4fbf7c3bbcf5ec2c5ca482d396d664 ]

Reading TCP stats when using TCP Illinois congestion control algorithm
can cause a divide by zero kernel oops.

The division by zero occur in tcp_illinois_info() at:
 do_div(t, ca-&gt;cnt_rtt);
where ca-&gt;cnt_rtt can become zero (when rtt_reset is called)

Steps to Reproduce:
 1. Register tcp_illinois:
     # sysctl -w net.ipv4.tcp_congestion_control=illinois
 2. Monitor internal TCP information via command "ss -i"
     # watch -d ss -i
 3. Establish new TCP conn to machine

Either it fails at the initial conn, or else it needs to wait
for a loss or a reset.

This is only related to reading stats.  The function avg_delay() also
performs the same divide, but is guarded with a (ca-&gt;cnt_rtt &gt; 0) at its
calling point in update_params().  Thus, simply fix tcp_illinois_info().

Function tcp_illinois_info() / get_info() is called without
socket lock.  Thus, eliminate any race condition on ca-&gt;cnt_rtt
by using a local stack variable.  Simply reuse info.tcpv_rttcnt,
as its already set to ca-&gt;cnt_rtt.
Function avg_delay() is not affected by this race condition, as
its called with the socket lock.

Cc: Petr Matousek &lt;pmatouse@redhat.com&gt;
Signed-off-by: Jesper Dangaard Brouer &lt;brouer@redhat.com&gt;
Acked-by: Eric Dumazet &lt;edumazet@google.com&gt;
Acked-by: Stephen Hemminger &lt;shemminger@vyatta.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
</entry>
</feed>
