<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/net/ipv4, branch v3.0.84</title>
<subtitle>Linux kernel source tree</subtitle>
<id>https://git.amat.us/linux/atom/net/ipv4?h=v3.0.84</id>
<link rel='self' href='https://git.amat.us/linux/atom/net/ipv4?h=v3.0.84'/>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/'/>
<updated>2013-06-27T17:34:32Z</updated>
<entry>
<title>ip_tunnel: fix kernel panic with icmp_dest_unreach</title>
<updated>2013-06-27T17:34:32Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2013-05-24T05:49:58Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=34e4c0aed353934a72809784999900fc7b5653ef'/>
<id>urn:sha1:34e4c0aed353934a72809784999900fc7b5653ef</id>
<content type='text'>
[ Upstream commit a622260254ee481747cceaaa8609985b29a31565 ]

Daniel Petre reported crashes in icmp_dst_unreach() with following call
graph:

Daniel found a similar problem mentioned in
 http://lkml.indiana.edu/hypermail/linux/kernel/1007.0/00961.html

And indeed this is the root cause : skb-&gt;cb[] contains data fooling IP
stack.

We must clear IPCB in ip_tunnel_xmit() sooner in case dst_link_failure()
is called. Or else skb-&gt;cb[] might contain garbage from GSO segmentation
layer.

A similar fix was tested on linux-3.9, but gre code was refactored in
linux-3.10. I'll send patches for stable kernels as well.

Many thanks to Daniel for providing reports, patches and testing !

Reported-by: Daniel Petre &lt;daniel.petre@rcs-rds.ro&gt;
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>tcp: xps: fix reordering issues</title>
<updated>2013-06-27T17:34:32Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2013-05-23T07:44:20Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=04e093d128963bfe46f50dbcff4c70147464fd0a'/>
<id>urn:sha1:04e093d128963bfe46f50dbcff4c70147464fd0a</id>
<content type='text'>
[ Upstream commit 547669d483e5783d722772af1483fa474da7caf9 ]

commit 3853b5841c01a ("xps: Improvements in TX queue selection")
introduced ooo_okay flag, but the condition to set it is slightly wrong.

In our traces, we have seen ACK packets being received out of order,
and RST packets sent in response.

We should test if we have any packets still in host queue.

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Tom Herbert &lt;therbert@google.com&gt;
Cc: Yuchung Cheng &lt;ycheng@google.com&gt;
Cc: Neal Cardwell &lt;ncardwell@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>tcp: fix tcp_md5_hash_skb_data()</title>
<updated>2013-06-27T17:34:31Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2013-05-13T21:25:52Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=41a187532a9458873a603aee48f017c0288d03fe'/>
<id>urn:sha1:41a187532a9458873a603aee48f017c0288d03fe</id>
<content type='text'>
[ Upstream commit 54d27fcb338bd9c42d1dfc5a39e18f6f9d373c2e ]

TCP md5 communications fail [1] for some devices, because sg/crypto code
assume page offsets are below PAGE_SIZE.

This was discovered using mlx4 driver [2], but I suspect loopback
might trigger the same bug now we use order-3 pages in tcp_sendmsg()

[1] Failure is giving following messages.

huh, entered softirq 3 NET_RX ffffffff806ad230 preempt_count 00000100,
exited with 00000101?

[2] mlx4 driver uses order-2 pages to allocate RX frags

Reported-by: Matt Schnall &lt;mischnal@google.com&gt;
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Bernhard Beck &lt;bbeck@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>net: drop dst before queueing fragments</title>
<updated>2013-05-01T15:56:40Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2013-04-16T12:55:41Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=8a53479a31bed3ef13f55c6752cb1a3962affcff'/>
<id>urn:sha1:8a53479a31bed3ef13f55c6752cb1a3962affcff</id>
<content type='text'>
[ Upstream commit 97599dc792b45b1669c3cdb9a4b365aad0232f65 ]

Commit 4a94445c9a5c (net: Use ip_route_input_noref() in input path)
added a bug in IP defragmentation handling, as non refcounted
dst could escape an RCU protected section.

Commit 64f3b9e203bd068 (net: ip_expire() must revalidate route) fixed
the case of timeouts, but not the general problem.

Tom Parkin noticed crashes in UDP stack and provided a patch,
but further analysis permitted us to pinpoint the root cause.

Before queueing a packet into a frag list, we must drop its dst,
as this dst has limited lifetime (RCU protected)

When/if a packet is finally reassembled, we use the dst of the very
last skb, still protected by RCU and valid, as the dst of the
reassembled packet.

Use same logic in IPv6, as there is no need to hold dst references.

Reported-by: Tom Parkin &lt;tparkin@katalix.com&gt;
Tested-by: Tom Parkin &lt;tparkin@katalix.com&gt;
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>tcp: call tcp_replace_ts_recent() from tcp_ack()</title>
<updated>2013-05-01T15:56:38Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2013-04-19T07:19:48Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=7c79dac82743cab718a07520617810eb5fb8eb56'/>
<id>urn:sha1:7c79dac82743cab718a07520617810eb5fb8eb56</id>
<content type='text'>
[ Upstream commit 12fb3dd9dc3c64ba7d64cec977cca9b5fb7b1d4e ]

commit bd090dfc634d (tcp: tcp_replace_ts_recent() should not be called
from tcp_validate_incoming()) introduced a TS ecr bug in slow path
processing.

1 A &gt; B P. 1:10001(10000) ack 1 &lt;nop,nop,TS val 1001 ecr 200&gt;
2 B &lt; A . 1:1(0) ack 1 win 257 &lt;sack 9001:10001,TS val 300 ecr 1001&gt;
3 A &gt; B . 1:1001(1000) ack 1 win 227 &lt;nop,nop,TS val 1002 ecr 200&gt;
4 A &gt; B . 1001:2001(1000) ack 1 win 227 &lt;nop,nop,TS val 1002 ecr 200&gt;

(ecr 200 should be ecr 300 in packets 3 &amp; 4)

Problem is tcp_ack() can trigger send of new packets (retransmits),
reflecting the prior TSval, instead of the TSval contained in the
currently processed incoming packet.

Fix this by calling tcp_replace_ts_recent() from tcp_ack() after the
checks, but before the actions.

Reported-by: Yuchung Cheng &lt;ycheng@google.com&gt;
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Neal Cardwell &lt;ncardwell@google.com&gt;
Acked-by: Neal Cardwell &lt;ncardwell@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>esp4: fix error return code in esp_output()</title>
<updated>2013-05-01T15:56:37Z</updated>
<author>
<name>Wei Yongjun</name>
<email>yongjun_wei@trendmicro.com.cn</email>
</author>
<published>2013-04-13T15:49:03Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=fe18256f3e2410e574cc48dbf033a706d000b0ff'/>
<id>urn:sha1:fe18256f3e2410e574cc48dbf033a706d000b0ff</id>
<content type='text'>
[ Upstream commit 06848c10f720cbc20e3b784c0df24930b7304b93 ]

Fix to return a negative error code from the error handling
case instead of 0, as returned elsewhere in this function.

Signed-off-by: Wei Yongjun &lt;yongjun_wei@trendmicro.com.cn&gt;
Acked-by: Steffen Klassert &lt;steffen.klassert@secunet.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>tcp: incoming connections might use wrong route under synflood</title>
<updated>2013-05-01T15:56:37Z</updated>
<author>
<name>Dmitry Popov</name>
<email>dp@highloadlab.com</email>
</author>
<published>2013-04-11T08:55:07Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=7b14772957275672b360e5ebd5604623561e0f30'/>
<id>urn:sha1:7b14772957275672b360e5ebd5604623561e0f30</id>
<content type='text'>
[ Upstream commit d66954a066158781ccf9c13c91d0316970fe57b6 ]

There is a bug in cookie_v4_check (net/ipv4/syncookies.c):
	flowi4_init_output(&amp;fl4, 0, sk-&gt;sk_mark, RT_CONN_FLAGS(sk),
			   RT_SCOPE_UNIVERSE, IPPROTO_TCP,
			   inet_sk_flowi_flags(sk),
			   (opt &amp;&amp; opt-&gt;srr) ? opt-&gt;faddr : ireq-&gt;rmt_addr,
			   ireq-&gt;loc_addr, th-&gt;source, th-&gt;dest);

Here we do not respect sk-&gt;sk_bound_dev_if, therefore wrong dst_entry may be
taken. This dst_entry is used by new socket (get_cookie_sock -&gt;
tcp_v4_syn_recv_sock), so its packets may take the wrong path.

Signed-off-by: Dmitry Popov &lt;dp@highloadlab.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>tcp: undo spurious timeout after SACK reneging</title>
<updated>2013-04-05T17:16:53Z</updated>
<author>
<name>Yuchung Cheng</name>
<email>ycheng@google.com</email>
</author>
<published>2013-03-24T10:42:25Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=01326198867ce1ab1bb26acb48de7b06eab571fe'/>
<id>urn:sha1:01326198867ce1ab1bb26acb48de7b06eab571fe</id>
<content type='text'>
[ Upstream commit 7ebe183c6d444ef5587d803b64a1f4734b18c564 ]

On SACK reneging the sender immediately retransmits and forces a
timeout but disables Eifel (undo). If the (buggy) receiver does not
drop any packet this can trigger a false slow-start retransmit storm
driven by the ACKs of the original packets. This can be detected with
undo and TCP timestamps.

Signed-off-by: Yuchung Cheng &lt;ycheng@google.com&gt;
Acked-by: Neal Cardwell &lt;ncardwell@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>tcp: preserve ACK clocking in TSO</title>
<updated>2013-04-05T17:16:52Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2013-03-21T17:36:09Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=06551316b51c5f140c29748011c37057cb7ba932'/>
<id>urn:sha1:06551316b51c5f140c29748011c37057cb7ba932</id>
<content type='text'>
[ Upstream commit f4541d60a449afd40448b06496dcd510f505928e ]

A long standing problem with TSO is the fact that tcp_tso_should_defer()
rearms the deferred timer, while it should not.

Current code leads to following bad bursty behavior :

20:11:24.484333 IP A &gt; B: . 297161:316921(19760) ack 1 win 119
20:11:24.484337 IP B &gt; A: . ack 263721 win 1117
20:11:24.485086 IP B &gt; A: . ack 265241 win 1117
20:11:24.485925 IP B &gt; A: . ack 266761 win 1117
20:11:24.486759 IP B &gt; A: . ack 268281 win 1117
20:11:24.487594 IP B &gt; A: . ack 269801 win 1117
20:11:24.488430 IP B &gt; A: . ack 271321 win 1117
20:11:24.489267 IP B &gt; A: . ack 272841 win 1117
20:11:24.490104 IP B &gt; A: . ack 274361 win 1117
20:11:24.490939 IP B &gt; A: . ack 275881 win 1117
20:11:24.491775 IP B &gt; A: . ack 277401 win 1117
20:11:24.491784 IP A &gt; B: . 316921:332881(15960) ack 1 win 119
20:11:24.492620 IP B &gt; A: . ack 278921 win 1117
20:11:24.493448 IP B &gt; A: . ack 280441 win 1117
20:11:24.494286 IP B &gt; A: . ack 281961 win 1117
20:11:24.495122 IP B &gt; A: . ack 283481 win 1117
20:11:24.495958 IP B &gt; A: . ack 285001 win 1117
20:11:24.496791 IP B &gt; A: . ack 286521 win 1117
20:11:24.497628 IP B &gt; A: . ack 288041 win 1117
20:11:24.498459 IP B &gt; A: . ack 289561 win 1117
20:11:24.499296 IP B &gt; A: . ack 291081 win 1117
20:11:24.500133 IP B &gt; A: . ack 292601 win 1117
20:11:24.500970 IP B &gt; A: . ack 294121 win 1117
20:11:24.501388 IP B &gt; A: . ack 295641 win 1117
20:11:24.501398 IP A &gt; B: . 332881:351881(19000) ack 1 win 119

While the expected behavior is more like :

20:19:49.259620 IP A &gt; B: . 197601:202161(4560) ack 1 win 119
20:19:49.260446 IP B &gt; A: . ack 154281 win 1212
20:19:49.261282 IP B &gt; A: . ack 155801 win 1212
20:19:49.262125 IP B &gt; A: . ack 157321 win 1212
20:19:49.262136 IP A &gt; B: . 202161:206721(4560) ack 1 win 119
20:19:49.262958 IP B &gt; A: . ack 158841 win 1212
20:19:49.263795 IP B &gt; A: . ack 160361 win 1212
20:19:49.264628 IP B &gt; A: . ack 161881 win 1212
20:19:49.264637 IP A &gt; B: . 206721:211281(4560) ack 1 win 119
20:19:49.265465 IP B &gt; A: . ack 163401 win 1212
20:19:49.265886 IP B &gt; A: . ack 164921 win 1212
20:19:49.266722 IP B &gt; A: . ack 166441 win 1212
20:19:49.266732 IP A &gt; B: . 211281:215841(4560) ack 1 win 119
20:19:49.267559 IP B &gt; A: . ack 167961 win 1212
20:19:49.268394 IP B &gt; A: . ack 169481 win 1212
20:19:49.269232 IP B &gt; A: . ack 171001 win 1212
20:19:49.269241 IP A &gt; B: . 215841:221161(5320) ack 1 win 119

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Yuchung Cheng &lt;ycheng@google.com&gt;
Cc: Van Jacobson &lt;vanj@google.com&gt;
Cc: Neal Cardwell &lt;ncardwell@google.com&gt;
Cc: Nandita Dukkipati &lt;nanditad@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>inet: limit length of fragment queue hash table bucket lists</title>
<updated>2013-03-28T19:05:59Z</updated>
<author>
<name>Hannes Frederic Sowa</name>
<email>hannes@stressinduktion.org</email>
</author>
<published>2013-03-15T11:32:30Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=7b7a1b8b3bd1742ca5ab259e741da0070e936db0'/>
<id>urn:sha1:7b7a1b8b3bd1742ca5ab259e741da0070e936db0</id>
<content type='text'>
[ Upstream commit 5a3da1fe9561828d0ca7eca664b16ec2b9bf0055 ]

This patch introduces a constant limit of the fragment queue hash
table bucket list lengths. Currently the limit 128 is choosen somewhat
arbitrary and just ensures that we can fill up the fragment cache with
empty packets up to the default ip_frag_high_thresh limits. It should
just protect from list iteration eating considerable amounts of cpu.

If we reach the maximum length in one hash bucket a warning is printed.
This is implemented on the caller side of inet_frag_find to distinguish
between the different users of inet_fragment.c.

I dropped the out of memory warning in the ipv4 fragment lookup path,
because we already get a warning by the slab allocator.

Cc: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Cc: Jesper Dangaard Brouer &lt;jbrouer@redhat.com&gt;
Signed-off-by: Hannes Frederic Sowa &lt;hannes@stressinduktion.org&gt;
Acked-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
</feed>
