<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/net/core, branch v3.12.24</title>
<subtitle>Linux kernel source tree</subtitle>
<id>https://git.amat.us/linux/atom/net/core?h=v3.12.24</id>
<link rel='self' href='https://git.amat.us/linux/atom/net/core?h=v3.12.24'/>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/'/>
<updated>2014-06-27T08:25:16Z</updated>
<entry>
<title>net: Do not enable tx-nocache-copy by default</title>
<updated>2014-06-27T08:25:16Z</updated>
<author>
<name>Benjamin Poirier</name>
<email>bpoirier@suse.de</email>
</author>
<published>2014-01-07T15:11:10Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=9f6e089cb55bdc5a90fe6ec755a20941da9a0b3b'/>
<id>urn:sha1:9f6e089cb55bdc5a90fe6ec755a20941da9a0b3b</id>
<content type='text'>
commit cdb3f4a31b64c3a1c6eef40bc01ebc9594c58a8c upstream.

There are many cases where this feature does not improve performance or even
reduces it.

For example, here are the results from tests that I've run using 3.12.6 on one
Intel Xeon W3565 and one i7 920 connected by ixgbe adapters. The results are
from the Xeon, but they're similar on the i7. All numbers report the
mean±stddev over 10 runs of 10s.

1) latency tests similar to what is described in "c6e1a0d net: Allow no-cache
copy from user on transmit"
There is no statistically significant difference between tx-nocache-copy
on/off.
nic irqs spread out (one queue per cpu)

200x netperf -r 1400,1
tx-nocache-copy off
        692000±1000 tps
        50/90/95/99% latency (us): 275±2/643.8±0.4/799±1/2474.4±0.3
tx-nocache-copy on
        693000±1000 tps
        50/90/95/99% latency (us): 274±1/644.1±0.7/800±2/2474.5±0.7

200x netperf -r 14000,14000
tx-nocache-copy off
        86450±80 tps
        50/90/95/99% latency (us): 334.37±0.02/838±1/2100±20/3990±40
tx-nocache-copy on
        86110±60 tps
        50/90/95/99% latency (us): 334.28±0.01/837±2/2110±20/3990±20

2) single stream throughput tests
tx-nocache-copy leads to higher service demand

                        throughput  cpu0        cpu1        demand
                        (Gb/s)      (Gcycle)    (Gcycle)    (cycle/B)

nic irqs and netperf on cpu0 (1x netperf -T0,0 -t omni -- -d send)

tx-nocache-copy off     9402±5      9.4±0.2                 0.80±0.01
tx-nocache-copy on      9403±3      9.85±0.04               0.838±0.004

nic irqs on cpu0, netperf on cpu1 (1x netperf -T1,1 -t omni -- -d send)

tx-nocache-copy off     9401±5      5.83±0.03   5.0±0.1     0.923±0.007
tx-nocache-copy on      9404±2      5.74±0.03   5.523±0.009 0.958±0.002

As a second example, here are some results from Eric Dumazet with latest
net-next.
tx-nocache-copy also leads to higher service demand

(cpu is Intel(R) Xeon(R) CPU X5660  @ 2.80GHz)

lpq83:~# ./ethtool -K eth0 tx-nocache-copy on
lpq83:~# perf stat ./netperf -H lpq84 -c
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to lpq84.prod.google.com () port 0 AF_INET
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % U      us/KB   us/KB

 87380  16384  16384    10.00      9407.44   2.50     -1.00    0.522   -1.000

 Performance counter stats for './netperf -H lpq84 -c':

       4282.648396 task-clock                #    0.423 CPUs utilized
             9,348 context-switches          #    0.002 M/sec
                88 CPU-migrations            #    0.021 K/sec
               355 page-faults               #    0.083 K/sec
    11,812,797,651 cycles                    #    2.758 GHz                     [82.79%]
     9,020,522,817 stalled-cycles-frontend   #   76.36% frontend cycles idle    [82.54%]
     4,579,889,681 stalled-cycles-backend    #   38.77% backend  cycles idle    [67.33%]
     6,053,172,792 instructions              #    0.51  insns per cycle
                                             #    1.49  stalled cycles per insn [83.64%]
       597,275,583 branches                  #  139.464 M/sec                   [83.70%]
         8,960,541 branch-misses             #    1.50% of all branches         [83.65%]

      10.128990264 seconds time elapsed

lpq83:~# ./ethtool -K eth0 tx-nocache-copy off
lpq83:~# perf stat ./netperf -H lpq84 -c
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to lpq84.prod.google.com () port 0 AF_INET
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % U      us/KB   us/KB

 87380  16384  16384    10.00      9412.45   2.15     -1.00    0.449   -1.000

 Performance counter stats for './netperf -H lpq84 -c':

       2847.375441 task-clock                #    0.281 CPUs utilized
            11,632 context-switches          #    0.004 M/sec
                49 CPU-migrations            #    0.017 K/sec
               354 page-faults               #    0.124 K/sec
     7,646,889,749 cycles                    #    2.686 GHz                     [83.34%]
     6,115,050,032 stalled-cycles-frontend   #   79.97% frontend cycles idle    [83.31%]
     1,726,460,071 stalled-cycles-backend    #   22.58% backend  cycles idle    [66.55%]
     2,079,702,453 instructions              #    0.27  insns per cycle
                                             #    2.94  stalled cycles per insn [83.22%]
       363,773,213 branches                  #  127.757 M/sec                   [83.29%]
         4,242,732 branch-misses             #    1.17% of all branches         [83.51%]

      10.128449949 seconds time elapsed

CC: Tom Herbert &lt;therbert@google.com&gt;
Signed-off-by: Benjamin Poirier &lt;bpoirier@suse.de&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
</entry>
<entry>
<title>rtnetlink: fix userspace API breakage for iproute2 &lt; v3.9.0</title>
<updated>2014-06-23T08:28:05Z</updated>
<author>
<name>Michal Schmidt</name>
<email>mschmidt@redhat.com</email>
</author>
<published>2014-05-28T12:15:19Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=76acb94289d2ca897bd820b9627f4ade15230663'/>
<id>urn:sha1:76acb94289d2ca897bd820b9627f4ade15230663</id>
<content type='text'>
[ Upstream commit e5eca6d41f53db48edd8cf88a3f59d2c30227f8e ]

When running RHEL6 userspace on a current upstream kernel, "ip link"
fails to show VF information.

The reason is a kernel&lt;-&gt;userspace API change introduced by commit
88c5b5ce5cb57 ("rtnetlink: Call nlmsg_parse() with correct header length"),
after which the kernel does not see iproute2's IFLA_EXT_MASK attribute
in the netlink request.

iproute2 adjusted for the API change in its commit 63338dca4513
("libnetlink: Use ifinfomsg instead of rtgenmsg in rtnl_wilddump_req_filter").

The problem has been noticed before:
http://marc.info/?l=linux-netdev&amp;m=136692296022182&amp;w=2
(Subject: Re: getting VF link info seems to be broken in 3.9-rc8)

We can do better than tell those with old userspace to upgrade. We can
recognize the old iproute2 in the kernel by checking the netlink message
length. Even when including the IFLA_EXT_MASK attribute, its netlink
message is shorter than struct ifinfomsg.

With this patch "ip link" shows VF information in both old and new
iproute2 versions.

Signed-off-by: Michal Schmidt &lt;mschmidt@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
</entry>
<entry>
<title>net: force a list_del() in unregister_netdevice_many()</title>
<updated>2014-06-23T08:28:03Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2014-06-06T13:44:03Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=cbaf35e5ba05b9ff2f43d798fe4082cca8151861'/>
<id>urn:sha1:cbaf35e5ba05b9ff2f43d798fe4082cca8151861</id>
<content type='text'>
[ Upstream commit 87757a917b0b3c0787e0563c679762152be81312 ]

unregister_netdevice_many() API is error prone and we had too
many bugs because of dangling LIST_HEAD on stacks.

See commit f87e6f47933e3e ("net: dont leave active on stack LIST_HEAD")

In fact, instead of making sure no caller leaves an active list_head,
just force a list_del() in the callee. No one seems to need to access
the list after unregister_netdevice_many()

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
</entry>
<entry>
<title>net: Use netlink_ns_capable to verify the permisions of netlink messages</title>
<updated>2014-06-23T08:27:57Z</updated>
<author>
<name>Eric W. Biederman</name>
<email>ebiederm@xmission.com</email>
</author>
<published>2014-04-23T21:29:27Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=50b8b6e75fa0c08cef1e1ed30a7ab91f05bcb779'/>
<id>urn:sha1:50b8b6e75fa0c08cef1e1ed30a7ab91f05bcb779</id>
<content type='text'>
[ Upstream commit 90f62cf30a78721641e08737bda787552428061e ]

It is possible by passing a netlink socket to a more privileged
executable and then to fool that executable into writing to the socket
data that happens to be valid netlink message to do something that
privileged executable did not intend to do.

To keep this from happening replace bare capable and ns_capable calls
with netlink_capable, netlink_net_calls and netlink_ns_capable calls.
Which act the same as the previous calls except they verify that the
opener of the socket had the desired permissions as well.

Reported-by: Andy Lutomirski &lt;luto@amacapital.net&gt;
Signed-off-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
</entry>
<entry>
<title>net: Add variants of capable for use on on sockets</title>
<updated>2014-06-23T08:27:56Z</updated>
<author>
<name>Eric W. Biederman</name>
<email>ebiederm@xmission.com</email>
</author>
<published>2014-04-23T21:26:56Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=8c91777bd4951c1f1ad59842fe4d208f94a97dfe'/>
<id>urn:sha1:8c91777bd4951c1f1ad59842fe4d208f94a97dfe</id>
<content type='text'>
[ Upstream commit a3b299da869d6e78cf42ae0b1b41797bcb8c5e4b ]

sk_net_capable - The common case, operations that are safe in a network namespace.
sk_capable - Operations that are not known to be safe in a network namespace
sk_ns_capable - The general case for special cases.

Signed-off-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
</entry>
<entry>
<title>net: Move the permission check in sock_diag_put_filterinfo to packet_diag_dump</title>
<updated>2014-06-23T08:27:55Z</updated>
<author>
<name>Eric W. Biederman</name>
<email>ebiederm@xmission.com</email>
</author>
<published>2014-04-23T21:26:25Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=4dcecf4ce16a20c210e91a62ae28275418e3ed2c'/>
<id>urn:sha1:4dcecf4ce16a20c210e91a62ae28275418e3ed2c</id>
<content type='text'>
[ Upstream commit a53b72c83a4216f2eb883ed45a0cbce014b8e62d ]

The permission check in sock_diag_put_filterinfo is wrong, and it is so removed
from it's sources it is not clear why it is wrong.  Move the computation
into packet_diag_dump and pass a bool of the result into sock_diag_filterinfo.

This does not yet correct the capability check but instead simply moves it to make
it clear what is going on.

Reported-by: Andy Lutomirski &lt;luto@amacapital.net&gt;
Signed-off-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
</entry>
<entry>
<title>skbuff: skb_segment: orphan frags before copying</title>
<updated>2014-06-23T08:27:51Z</updated>
<author>
<name>Michael S. Tsirkin</name>
<email>mst@redhat.com</email>
</author>
<published>2014-03-10T17:28:08Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=07d054ef6765b307277f02c11b51f0695d6b3d7c'/>
<id>urn:sha1:07d054ef6765b307277f02c11b51f0695d6b3d7c</id>
<content type='text'>
[ Upstream commit 1fd819ecb90cc9b822cd84d3056ddba315d3340f ]

skb_segment copies frags around, so we need
to copy them carefully to avoid accessing
user memory after reporting completion to userspace
through a callback.

skb_segment doesn't normally happen on datapath:
TSO needs to be disabled - so disabling zero copy
in this case does not look like a big deal.

Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Acked-by: Herbert Xu &lt;herbert@gondor.apana.org.au&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
</entry>
<entry>
<title>skbuff: skb_segment: s/fskb/list_skb/</title>
<updated>2014-06-23T08:27:48Z</updated>
<author>
<name>Michael S. Tsirkin</name>
<email>mst@redhat.com</email>
</author>
<published>2014-03-10T17:27:59Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=f2a855e3e126a716b71ddf369754f70573b26bce'/>
<id>urn:sha1:f2a855e3e126a716b71ddf369754f70573b26bce</id>
<content type='text'>
[ Upstream commit 1a4cedaf65491e66e1e55b8428c89209da729209 ]

fskb is unrelated to frag: it's coming from
frag_list. Rename it list_skb to avoid confusion.

Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
</entry>
<entry>
<title>skbuff: skb_segment: s/skb/head_skb/</title>
<updated>2014-06-23T08:27:38Z</updated>
<author>
<name>Michael S. Tsirkin</name>
<email>mst@redhat.com</email>
</author>
<published>2014-03-10T16:29:19Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=14568bf24e155542476e3cc14cd28bbda61d6f1f'/>
<id>urn:sha1:14568bf24e155542476e3cc14cd28bbda61d6f1f</id>
<content type='text'>
[ Upstream commit df5771ffefb13f8af5392bd54fd7e2b596a3a357 ]

rename local variable to make it easier to tell at a glance that we are
dealing with a head skb.

Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
</entry>
<entry>
<title>skbuff: skb_segment: s/skb_frag/frag/</title>
<updated>2014-06-23T08:27:35Z</updated>
<author>
<name>Michael S. Tsirkin</name>
<email>mst@redhat.com</email>
</author>
<published>2014-03-10T16:29:14Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=74781837f9e968490b4e2fe900f6c1880583dd3e'/>
<id>urn:sha1:74781837f9e968490b4e2fe900f6c1880583dd3e</id>
<content type='text'>
[ Upstream commit 4e1beba12d094c6c761ba5c49032b9b9e46380e8 ]

skb_frag can in fact point at either skb
or fskb so rename it generally "frag".

Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
</entry>
</feed>
