<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/net/netfilter/ipvs, branch v3.12.10</title>
<subtitle>Linux kernel source tree</subtitle>
<id>https://git.amat.us/linux/atom/net/netfilter/ipvs?h=v3.12.10</id>
<link rel='self' href='https://git.amat.us/linux/atom/net/netfilter/ipvs?h=v3.12.10'/>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/'/>
<updated>2013-12-08T15:29:13Z</updated>
<entry>
<title>netfilter: push reasm skb through instead of original frag skbs</title>
<updated>2013-12-08T15:29:13Z</updated>
<author>
<name>Jiri Pirko</name>
<email>jiri@resnulli.us</email>
</author>
<published>2013-11-06T16:52:20Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=0a6905b2186e5ae1715545d067dde8ad830fc3f5'/>
<id>urn:sha1:0a6905b2186e5ae1715545d067dde8ad830fc3f5</id>
<content type='text'>
[ Upstream commit 6aafeef03b9d9ecf255f3a80ed85ee070260e1ae ]

Pushing original fragments through causes several problems. For example
for matching, frags may not be matched correctly. Take following
example:

&lt;example&gt;
On HOSTA do:
ip6tables -I INPUT -p icmpv6 -j DROP
ip6tables -I INPUT -p icmpv6 -m icmp6 --icmpv6-type 128 -j ACCEPT

and on HOSTB you do:
ping6 HOSTA -s2000    (MTU is 1500)

Incoming echo requests will be filtered out on HOSTA. This issue does
not occur with smaller packets than MTU (where fragmentation does not happen)
&lt;/example&gt;

As was discussed previously, the only correct solution seems to be to use
reassembled skb instead of separete frags. Doing this has positive side
effects in reducing sk_buff by one pointer (nfct_reasm) and also the reams
dances in ipvs and conntrack can be removed.

Future plan is to remove net/ipv6/netfilter/nf_conntrack_reasm.c
entirely and use code in net/ipv6/reassembly.c instead.

Signed-off-by: Jiri Pirko &lt;jiri@resnulli.us&gt;
Acked-by: Julian Anastasov &lt;ja@ssi.bg&gt;
Signed-off-by: Marcelo Ricardo Leitner &lt;mleitner@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf</title>
<updated>2013-10-01T16:39:35Z</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2013-10-01T16:39:35Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=e024bdc051ab99eafb5dd9bad87e79afc27f8a44'/>
<id>urn:sha1:e024bdc051ab99eafb5dd9bad87e79afc27f8a44</id>
<content type='text'>
Pablo Neira Ayuso says:

====================
The following patchset contains Netfilter/IPVS fixes for your net
tree, they are:

* Fix BUG_ON splat due to malformed TCP packets seen by synproxy, from
  Patrick McHardy.

* Fix possible weight overflow in lblc and lblcr schedulers due to
  32-bits arithmetics, from Simon Kirby.

* Fix possible memory access race in the lblc and lblcr schedulers,
  introduced when it was converted to use RCU, two patches from
  Julian Anastasov.

* Fix hard dependency on CPU 0 when reading per-cpu stats in the
  rate estimator, from Julian Anastasov.

* Fix race that may lead to object use after release, when invoking
  ipvsadm -C &amp;&amp; ipvsadm -R, introduced when adding RCU, from Julian
  Anastasov.
====================

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>ip: generate unique IP identificator if local fragmentation is allowed</title>
<updated>2013-09-19T18:11:15Z</updated>
<author>
<name>Ansis Atteka</name>
<email>aatteka@nicira.com</email>
</author>
<published>2013-09-18T22:29:53Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=703133de331a7a7df47f31fb9de51dc6f68a9de8'/>
<id>urn:sha1:703133de331a7a7df47f31fb9de51dc6f68a9de8</id>
<content type='text'>
If local fragmentation is allowed, then ip_select_ident() and
ip_select_ident_more() need to generate unique IDs to ensure
correct defragmentation on the peer.

For example, if IPsec (tunnel mode) has to encrypt large skbs
that have local_df bit set, then all IP fragments that belonged
to different ESP datagrams would have used the same identificator.
If one of these IP fragments would get lost or reordered, then
peer could possibly stitch together wrong IP fragments that did
not belong to the same datagram. This would lead to a packet loss
or data corruption.

Signed-off-by: Ansis Atteka &lt;aatteka@nicira.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>ipvs: stats should not depend on CPU 0</title>
<updated>2013-09-18T19:40:20Z</updated>
<author>
<name>Julian Anastasov</name>
<email>ja@ssi.bg</email>
</author>
<published>2013-09-12T08:21:10Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=d1ee4fea0b6946dd8bc61b46db35ea80af7af34b'/>
<id>urn:sha1:d1ee4fea0b6946dd8bc61b46db35ea80af7af34b</id>
<content type='text'>
When reading percpu stats we need to properly reset
the sum when CPU 0 is not present in the possible mask.

Signed-off-by: Julian Anastasov &lt;ja@ssi.bg&gt;
Signed-off-by: Simon Horman &lt;horms@verge.net.au&gt;
</content>
</entry>
<entry>
<title>ipvs: do not use dest after ip_vs_dest_put in LBLCR</title>
<updated>2013-09-18T19:39:39Z</updated>
<author>
<name>Julian Anastasov</name>
<email>ja@ssi.bg</email>
</author>
<published>2013-09-12T08:21:09Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=742617b176909e586a4cf9b142c996c25986fce8'/>
<id>urn:sha1:742617b176909e586a4cf9b142c996c25986fce8</id>
<content type='text'>
commit c5549571f975ab ("ipvs: convert lblcr scheduler to rcu")
allows RCU readers to use dest after calling ip_vs_dest_put().
In the corner case it can race with ip_vs_dest_trash_expire()
which can release the dest while it is being returned to the
RCU readers as scheduling result.

To fix the problem do not allow e-&gt;dest to be replaced and
defer the ip_vs_dest_put() call by using RCU callback. Now
e-&gt;dest does not need to be RCU pointer.

Signed-off-by: Julian Anastasov &lt;ja@ssi.bg&gt;
Signed-off-by: Simon Horman &lt;horms@verge.net.au&gt;
</content>
</entry>
<entry>
<title>ipvs: do not use dest after ip_vs_dest_put in LBLC</title>
<updated>2013-09-18T19:39:09Z</updated>
<author>
<name>Julian Anastasov</name>
<email>ja@ssi.bg</email>
</author>
<published>2013-09-12T08:21:08Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=2f3d771a35fee21a1f17364b46b3c8cc66dc6892'/>
<id>urn:sha1:2f3d771a35fee21a1f17364b46b3c8cc66dc6892</id>
<content type='text'>
commit c2a4ffb70eef39 ("ipvs: convert lblc scheduler to rcu")
allows RCU readers to use dest after calling ip_vs_dest_put().
In the corner case it can race with ip_vs_dest_trash_expire()
which can release the dest while it is being returned to the
RCU readers as scheduling result.

To fix the problem do not allow en-&gt;dest to be replaced and
defer the ip_vs_dest_put() call by using RCU callback. Now
en-&gt;dest does not need to be RCU pointer.

Signed-off-by: Julian Anastasov &lt;ja@ssi.bg&gt;
Signed-off-by: Simon Horman &lt;horms@verge.net.au&gt;
</content>
</entry>
<entry>
<title>ipvs: make the service replacement more robust</title>
<updated>2013-09-18T19:39:03Z</updated>
<author>
<name>Julian Anastasov</name>
<email>ja@ssi.bg</email>
</author>
<published>2013-09-12T08:21:07Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=bcbde4c0a7556cca72874c5e1efa4dccb5198a2b'/>
<id>urn:sha1:bcbde4c0a7556cca72874c5e1efa4dccb5198a2b</id>
<content type='text'>
commit 578bc3ef1e473a ("ipvs: reorganize dest trash") added
IP_VS_DEST_STATE_REMOVING flag and RCU callback named
ip_vs_dest_wait_readers() to keep dests and services after
removal for at least a RCU grace period. But we have the
following corner cases:

- we can not reuse the same dest if its service is removed
while IP_VS_DEST_STATE_REMOVING is still set because another dest
removal in the first grace period can not extend this period.
It can happen when ipvsadm -C &amp;&amp; ipvsadm -R is used.

- dest-&gt;svc can be replaced but ip_vs_in_stats() and
ip_vs_out_stats() have no explicit read memory barriers
when accessing dest-&gt;svc. It can happen that dest-&gt;svc
was just freed (replaced) while we use it to update
the stats.

We solve the problems as follows:

- IP_VS_DEST_STATE_REMOVING is removed and we ensure a fixed
idle period for the dest (IP_VS_DEST_TRASH_PERIOD). idle_start
will remember when for first time after deletion we noticed
dest-&gt;refcnt=0. Later, the connections can grab a reference
while in RCU grace period but if refcnt becomes 0 we can
safely free the dest and its svc.

- dest-&gt;svc becomes RCU pointer. As result, we add explicit
RCU locking in ip_vs_in_stats() and ip_vs_out_stats().

- __ip_vs_unbind_svc is renamed to __ip_vs_svc_put(), it
now can free the service immediately or after a RCU grace
period. dest-&gt;svc is not set to NULL anymore.

	As result, unlinked dests and their services are
freed always after IP_VS_DEST_TRASH_PERIOD period, unused
services are freed after a RCU grace period.

Signed-off-by: Julian Anastasov &lt;ja@ssi.bg&gt;
Signed-off-by: Simon Horman &lt;horms@verge.net.au&gt;
</content>
</entry>
<entry>
<title>ipvs: fix overflow on dest weight multiply</title>
<updated>2013-09-18T19:38:53Z</updated>
<author>
<name>Simon Kirby</name>
<email>sim@hostway.ca</email>
</author>
<published>2013-08-10T08:26:18Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=c16526a7b99c1c28e9670a8c8e3dbcf741bb32be'/>
<id>urn:sha1:c16526a7b99c1c28e9670a8c8e3dbcf741bb32be</id>
<content type='text'>
Schedulers such as lblc and lblcr require the weight to be as high as the
maximum number of active connections. In commit b552f7e3a9524abcbcdf
("ipvs: unify the formula to estimate the overhead of processing
connections"), the consideration of inactconns and activeconns was cleaned
up to always count activeconns as 256 times more important than inactconns.
In cases where 3000 or more connections are expected, a weight of 3000 *
256 * 3000 connections overflows the 32-bit signed result used to determine
if rescheduling is required.

On amd64, this merely changes the multiply and comparison instructions to
64-bit. On x86, a 64-bit result is already present from imull, so only
a few more comparison instructions are emitted.

Signed-off-by: Simon Kirby &lt;sim@hostway.ca&gt;
Acked-by: Julian Anastasov &lt;ja@ssi.bg&gt;
Signed-off-by: Simon Horman &lt;horms@verge.net.au&gt;
</content>
</entry>
<entry>
<title>Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next</title>
<updated>2013-08-20T20:30:54Z</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2013-08-20T20:30:54Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=89d5e23210f53ab53b7ff64843bce62a106d454f'/>
<id>urn:sha1:89d5e23210f53ab53b7ff64843bce62a106d454f</id>
<content type='text'>
Conflicts:
	net/netfilter/nf_conntrack_proto_tcp.c

The conflict had to do with overlapping changes dealing with
fixing the use of an "s32" to hold the value returned by
NAT_OFFSET().

Pablo Neira Ayuso says:

====================
The following batch contains Netfilter/IPVS updates for your net-next tree.
More specifically, they are:

* Trivial typo fix in xt_addrtype, from Phil Oester.

* Remove net_ratelimit in the conntrack logging for consistency with other
  logging subsystem, from Patrick McHardy.

* Remove unneeded includes from the recently added xt_connlabel support, from
  Florian Westphal.

* Allow to update conntracks via nfqueue, don't need NFQA_CFG_F_CONNTRACK for
  this, from Florian Westphal.

* Remove tproxy core, now that we have socket early demux, from Florian
  Westphal.

* A couple of patches to refactor conntrack event reporting to save a good
  bunch of lines, from Florian Westphal.

* Fix missing locking in NAT sequence adjustment, it did not manifested in
  any known bug so far, from Patrick McHardy.

* Change sequence number adjustment variable to 32 bits, to delay the
  possible early overflow in long standing connections, also from Patrick.

* Comestic cleanups for IPVS, from Dragos Foianu.

* Fix possible null dereference in IPVS in the SH scheduler, from Daniel
  Borkmann.

* Allow to attach conntrack expectations via nfqueue. Before this patch, you
  had to use ctnetlink instead, thus, we save the conntrack lookup.

* Export xt_rpfilter and xt_HMARK header files, from Nicolas Dichtel.
====================

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>ipvs: ip_vs_sh: ip_vs_sh_get_port: check skb_header_pointer for NULL</title>
<updated>2013-08-06T23:57:57Z</updated>
<author>
<name>Daniel Borkmann</name>
<email>dborkman@redhat.com</email>
</author>
<published>2013-08-06T09:20:23Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=54e35cc52346149a7bce8a2f622e215ed17bb56d'/>
<id>urn:sha1:54e35cc52346149a7bce8a2f622e215ed17bb56d</id>
<content type='text'>
skb_header_pointer could return NULL, so check for it as we do it
everywhere else in ipvs code. This fixes a coverity warning.

Signed-off-by: Daniel Borkmann &lt;dborkman@redhat.com&gt;
Acked-by: Julian Anastasov &lt;ja@ssi.bg&gt;
Signed-off-by: Simon Horman &lt;horms@verge.net.au&gt;
</content>
</entry>
</feed>
