<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/net/netfilter, branch v2.6.35</title>
<subtitle>Linux kernel source tree</subtitle>
<id>https://git.amat.us/linux/atom/net/netfilter?h=v2.6.35</id>
<link rel='self' href='https://git.amat.us/linux/atom/net/netfilter?h=v2.6.35'/>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/'/>
<updated>2010-06-09T14:10:57Z</updated>
<entry>
<title>ipvs: Add missing locking during connection table hashing and unhashing</title>
<updated>2010-06-09T14:10:57Z</updated>
<author>
<name>Sven Wegener</name>
<email>sven.wegener@stealer.net</email>
</author>
<published>2010-06-09T14:10:57Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=aea9d711f3d68c656ad31ab578ecfb0bb5cd7f97'/>
<id>urn:sha1:aea9d711f3d68c656ad31ab578ecfb0bb5cd7f97</id>
<content type='text'>
The code that hashes and unhashes connections from the connection table
is missing locking of the connection being modified, which opens up a
race condition and results in memory corruption when this race condition
is hit.

Here is what happens in pretty verbose form:

CPU 0					CPU 1
------------				------------
An active connection is terminated and
we schedule ip_vs_conn_expire() on this
CPU to expire this connection.

					IRQ assignment is changed to this CPU,
					but the expire timer stays scheduled on
					the other CPU.

					New connection from same ip:port comes
					in right before the timer expires, we
					find the inactive connection in our
					connection table and get a reference to
					it. We properly lock the connection in
					tcp_state_transition() and read the
					connection flags in set_tcp_state().

ip_vs_conn_expire() gets called, we
unhash the connection from our
connection table and remove the hashed
flag in ip_vs_conn_unhash(), without
proper locking!

					While still holding proper locks we
					write the connection flags in
					set_tcp_state() and this sets the hashed
					flag again.

ip_vs_conn_expire() fails to expire the
connection, because the other CPU has
incremented the reference count. We try
to re-insert the connection into our
connection table, but this fails in
ip_vs_conn_hash(), because the hashed
flag has been set by the other CPU. We
re-schedule execution of
ip_vs_conn_expire(). Now this connection
has the hashed flag set, but isn't
actually hashed in our connection table
and has a dangling list_head.

					We drop the reference we held on the
					connection and schedule the expire timer
					to time out the connection on this
					CPU. Further packets won't be able to
					find this connection in our connection
					table.

					ip_vs_conn_expire() gets called again,
					we think it's already hashed, but the
					list_head is dangling and while removing
					the connection from our connection table
					we write to the memory location that
					this list_head points to.

The result will probably be a kernel oops at some other point in time.

This race condition is pretty subtle, but it can be triggered remotely.
It needs the IRQ assignment change or another circumstance where packets
coming from the same ip:port for the same service are being processed on
different CPUs. And it involves hitting the exact time at which
ip_vs_conn_expire() gets called. It can be avoided by making sure that
all packets from one connection are always processed on the same CPU and
can be made harder to exploit by changing the connection timeouts to
some custom values.
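
The interleaving above can be replayed as a small single-threaded Python
model. This is only a sketch, not kernel code: the Conn class, the HASHED
and "inactive" markers and the helper names are hypothetical stand-ins
for the connection entry, its flags word and the hash/unhash helpers,
and the unlocked read-modify-write of the flags is made explicit with a
stale copy.

```python
HASHED = "hashed"  # stand-in for the hashed flag in the connection flags


class Conn:
    def __init__(self):
        self.flags = set()
        self.refcnt = 1


def conn_hash(table, conn):
    """Insert conn unless its flags claim it is already hashed."""
    if HASHED in conn.flags:
        return False
    table.append(conn)
    conn.flags.add(HASHED)
    return True


def conn_unhash(table, conn):
    """Remove conn from the table and clear the hashed flag."""
    if HASHED not in conn.flags:
        return False
    table.remove(conn)
    conn.flags.discard(HASHED)
    return True


def replay_race():
    table = []
    conn = Conn()
    conn_hash(table, conn)

    # CPU 1: finds the connection, takes a reference, reads the flags.
    conn.refcnt += 1
    stale_flags = set(conn.flags)

    # CPU 0: the expire timer fires and unhashes without the connection lock.
    conn_unhash(table, conn)

    # CPU 1: writes the flags back from its stale copy, restoring HASHED.
    conn.flags = stale_flags | {"inactive"}

    # CPU 0: the extra reference blocks expiry, so it tries to re-insert.
    rehashed = conn_hash(table, conn)
    return conn, table, rehashed
```

In this model replay_race() returns a connection whose flags still say
hashed although the table is empty and the re-insert was refused, which
is the inconsistent state described above. Taking the connection lock
around the flag update in the hash and unhash helpers removes the window
between the read and the write.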

Signed-off-by: Sven Wegener &lt;sven.wegener@stealer.net&gt;
Cc: stable@kernel.org
Acked-by: Simon Horman &lt;horms@verge.net.au&gt;
Signed-off-by: Patrick McHardy &lt;kaber@trash.net&gt;
</content>
</entry>
<entry>
<title>netfilter: xtables: stackptr should be percpu</title>
<updated>2010-05-31T14:41:35Z</updated>
<author>
<name>Eric Dumazet</name>
<email>eric.dumazet@gmail.com</email>
</author>
<published>2010-05-31T14:41:35Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=7489aec8eed4f2f1eb3b4d35763bd3ea30b32ef5'/>
<id>urn:sha1:7489aec8eed4f2f1eb3b4d35763bd3ea30b32ef5</id>
<content type='text'>
commit f3c5c1bfd4 (netfilter: xtables: make ip_tables reentrant)
introduced a performance regression, because the stackptr array is shared
by all cpus, causing cache line ping-pong (16 cpus share one 64-byte cache
line).

Fix this using alloc_percpu()
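
The 16-cpus-per-line figure can be checked with a little arithmetic.
The sketch below is an illustration in Python, not the kernel change
itself: it assumes 4-byte slots (which is what 16 slots in 64 bytes
implies), and it models per-cpu storage as one cache line per cpu,
whereas alloc_percpu() really places each slot in that cpu's own
per-cpu area. The constant and function names are made up for the
example.

```python
CACHE_LINE_BYTES = 64
SLOT_BYTES = 4  # assumed size of one stackptr slot (an unsigned int)


def cache_line_of(cpu, stride_bytes):
    """Cache line index of the slot owned by `cpu` for a given slot stride."""
    return (cpu * stride_bytes) // CACHE_LINE_BYTES


def lines_touched(ncpus, stride_bytes):
    """Set of distinct cache lines used by the first `ncpus` slots."""
    return {cache_line_of(cpu, stride_bytes) for cpu in range(ncpus)}


packed = lines_touched(16, SLOT_BYTES)          # one shared, densely packed array
separate = lines_touched(16, CACHE_LINE_BYTES)  # modelled as one line per cpu
```

With the packed layout all 16 slots land on a single cache line, so
every write by one cpu invalidates the line for the other 15. With
separate storage each cpu writes to a line of its own.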

Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Acked-By: Jan Engelhardt &lt;jengelh@medozas.de&gt;
Signed-off-by: Patrick McHardy &lt;kaber@trash.net&gt;
</content>
</entry>
<entry>
<title>netfilter: don't xt_jumpstack_alloc twice in xt_register_table</title>
<updated>2010-05-31T14:41:09Z</updated>
<author>
<name>Xiaotian Feng</name>
<email>dfeng@redhat.com</email>
</author>
<published>2010-05-31T14:41:09Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=c936e8bd1de2fa50c49e3df6fa5036bf07870b67'/>
<id>urn:sha1:c936e8bd1de2fa50c49e3df6fa5036bf07870b67</id>
<content type='text'>
In xt_register_table, xt_jumpstack_alloc is called first, and later
xt_replace_table is used. But xt_replace_table calls xt_jumpstack_alloc
again, so the memory allocated by the first xt_jumpstack_alloc is
leaked. We can simply remove the first call, because there aren't any
users of newinfo between xt_jumpstack_alloc and xt_replace_table.

Signed-off-by: Xiaotian Feng &lt;dfeng@redhat.com&gt;
Cc: Patrick McHardy &lt;kaber@trash.net&gt;
Cc: "David S. Miller" &lt;davem@davemloft.net&gt;
Cc: Jan Engelhardt &lt;jengelh@medozas.de&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Rusty Russell &lt;rusty@rustcorp.com.au&gt;
Cc: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Acked-By: Jan Engelhardt &lt;jengelh@medozas.de&gt;
Signed-off-by: Patrick McHardy &lt;kaber@trash.net&gt;
</content>
</entry>
<entry>
<title>xt_tee: use skb_dst_drop()</title>
<updated>2010-05-28T10:41:17Z</updated>
<author>
<name>Eric Dumazet</name>
<email>eric.dumazet@gmail.com</email>
</author>
<published>2010-05-28T10:41:17Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=50636af715ac1ceb1872bd29a4bdcc68975c3263'/>
<id>urn:sha1:50636af715ac1ceb1872bd29a4bdcc68975c3263</id>
<content type='text'>
After commit 7fee226a (net: add a noref bit on skb dst), it's wrong to
use dst_release(skb_dst(skb)), since we could decrement a refcount
while the skb dst was not refcounted.

We should use skb_dst_drop(skb) instead.

Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-next-2.6</title>
<updated>2010-05-21T06:12:18Z</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2010-05-21T06:12:18Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=41499bd6766314079417d1467c466d31b8612fec'/>
<id>urn:sha1:41499bd6766314079417d1467c466d31b8612fec</id>
<content type='text'>
</content>
</entry>
<entry>
<title>netfilter: nf_conntrack: fix a race in __nf_conntrack_confirm against nf_ct_get_next_corpse()</title>
<updated>2010-05-20T13:55:30Z</updated>
<author>
<name>Joerg Marx</name>
<email>joerg.marx@secunet.com</email>
</author>
<published>2010-05-20T13:55:30Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=fc350777c705a39a312728ac5e8a6f164a828f5d'/>
<id>urn:sha1:fc350777c705a39a312728ac5e8a6f164a828f5d</id>
<content type='text'>
This race was triggered by a 'conntrack -F' command running in parallel
to the insertion of a hash for a new connection. Losing this race led to
a dead conntrack entry effectively blocking traffic for a particular
connection until timeout or flushing the conntrack hashes again.
Now the check for an already dying connection is done inside the lock.
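
As a rough illustration of why the check has to sit inside the lock,
here is a minimal Python sketch. It is a model under assumptions, not
the conntrack code: Table, Ct, confirm() and flush() are hypothetical
names, and a threading.Lock stands in for the conntrack lock. The flush
marks both confirmed and not-yet-confirmed entries as dying while
holding the lock, so a confirm that tests the flag under the same lock
can never insert an entry that the flush has already condemned.

```python
import threading


class Table:
    def __init__(self):
        self.lock = threading.Lock()
        self.entries = []


class Ct:
    def __init__(self):
        self.dying = False


def confirm(table, ct):
    """Insert ct unless it is dying; the check happens under the lock."""
    with table.lock:
        if ct.dying:
            return False
        table.entries.append(ct)
        return True


def flush(table, pending):
    """Mark confirmed and pending entries dying, then empty the table."""
    with table.lock:
        for ct in table.entries + pending:
            ct.dying = True
        table.entries.clear()
```

If the dying test were done before taking the lock, a flush could run
between the test and the insert and leave a dead entry in the table,
which matches the symptom described above.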

Signed-off-by: Joerg Marx &lt;joerg.marx@secunet.com&gt;
Signed-off-by: Patrick McHardy &lt;kaber@trash.net&gt;
</content>
</entry>
<entry>
<title>net: add a noref bit on skb dst</title>
<updated>2010-05-18T00:18:50Z</updated>
<author>
<name>Eric Dumazet</name>
<email>eric.dumazet@gmail.com</email>
</author>
<published>2010-05-11T23:19:48Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=7fee226ad2397b635e2fd565a59ca3ae08a164cd'/>
<id>urn:sha1:7fee226ad2397b635e2fd565a59ca3ae08a164cd</id>
<content type='text'>
Use low order bit of skb-&gt;_skb_dst to tell dst is not refcounted.

Change _skb_dst to _skb_refdst to make sure all uses are caught.

skb_dst() returns the dst, regardless of noref bit set or not, but
with a lockdep check to make sure a noref dst is not given if current
user is not rcu protected.

New skb_dst_set_noref() helper to set a non-refcounted dst on a skb.
(with lockdep check)

skb_dst_drop() drops a reference only if skb dst was refcounted.

skb_dst_force() helper is used to force a refcount on dst, when skb
is queued and no longer RCU protected.

Use skb_dst_force() in __sk_add_backlog(), __dev_xmit_skb() if
!IFF_XMIT_DST_RELEASE or skb enqueued on qdisc queue, in
sock_queue_rcv_skb(), in __nf_queue().

Use skb_dst_force() in dev_requeue_skb().

Note: dst_use_noref() still dirties dst, we might transform it
later to do one dirtying per jiffies.
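
The low-order-bit scheme can be sketched in a few lines of Python. This
is a toy model, not the kernel implementation: integers stand in for
aligned pointers, a dict stands in for dereferencing them, and the
class and function names are invented for the example. The kernel
masks the bit off the pointer; the model clears it with an OR followed
by an XOR and tests it with a remainder.

```python
NOREF = 1  # low-order bit: the dst is attached without a reference


class Dst:
    def __init__(self, addr):
        self.addr = addr   # pretend pointer value, always even (aligned)
        self.refcnt = 1    # reference held by the owner, e.g. a route cache


class Skb:
    def __init__(self):
        self.refdst = 0    # models the combined pointer-plus-flag field


def set_noref(skb, dst):
    """Attach dst without taking a reference; the low bit records that."""
    skb.refdst = dst.addr | NOREF


def set_refcounted(skb, dst):
    """Attach dst and take a reference on behalf of the skb."""
    dst.refcnt += 1
    skb.refdst = dst.addr


def get_dst(skb, dsts):
    """Return the dst regardless of the noref bit, or None if unset."""
    return dsts.get((skb.refdst | NOREF) ^ NOREF)


def drop(skb, dsts):
    """Release the dst only if the skb actually holds a reference."""
    dst = get_dst(skb, dsts)
    if dst is not None and skb.refdst % 2 == 0:
        dst.refcnt -= 1
    skb.refdst = 0


def unconditional_release(skb, dsts):
    """Always decrement, as a plain release of the looked-up dst would."""
    dst = get_dst(skb, dsts)
    if dst is not None:
        dst.refcnt -= 1
    skb.refdst = 0
```

In this model, drop() on a noref skb leaves the owner's count at 1,
while unconditional_release() takes it to 0 and so gives away a
reference the skb never held. That is why callers need the conditional
drop helper once noref dsts exist.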

Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>netfilter: xt_TEE depends on NF_CONNTRACK</title>
<updated>2010-05-14T20:52:30Z</updated>
<author>
<name>Randy Dunlap</name>
<email>randy.dunlap@oracle.com</email>
</author>
<published>2010-05-14T20:52:30Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=83827f6a891e20de7468b1181f2ae8a3cc72587b'/>
<id>urn:sha1:83827f6a891e20de7468b1181f2ae8a3cc72587b</id>
<content type='text'>
Fix xt_TEE build for the case of NF_CONNTRACK=m and
NETFILTER_XT_TARGET_TEE=y:

xt_TEE.c:(.text+0x6df5c): undefined reference to `nf_conntrack_untracked'
4x

Built with all 4 m/y combinations.

Signed-off-by: Randy Dunlap &lt;randy.dunlap@oracle.com&gt;
Acked-by: Patrick McHardy &lt;kaber@trash.net&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>netfilter: nf_ct_sip: handle non-linear skbs</title>
<updated>2010-05-14T19:18:17Z</updated>
<author>
<name>Patrick McHardy</name>
<email>kaber@trash.net</email>
</author>
<published>2010-05-14T19:18:17Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=a1d7c1b4b8dfbc5ecadcff9284d64bb6ad4c0196'/>
<id>urn:sha1:a1d7c1b4b8dfbc5ecadcff9284d64bb6ad4c0196</id>
<content type='text'>
Handle non-linear skbs by linearizing them instead of silently failing.
Long term the helper should be fixed to either work with non-linear skbs
directly by using the string search API or work on a copy of the data.

Based on patch by Jason Gunthorpe &lt;jgunthorpe@obsidianresearch.com&gt;
Signed-off-by: Patrick McHardy &lt;kaber@trash.net&gt;
</content>
</entry>
<entry>
<title>Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-next-2.6</title>
<updated>2010-05-13T21:14:10Z</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2010-05-13T21:14:10Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=e7874c996b8591f59d78efa519031dab5b58723b'/>
<id>urn:sha1:e7874c996b8591f59d78efa519031dab5b58723b</id>
<content type='text'>
</content>
</entry>
</feed>
