Age | Commit message (Collapse) | Author |
|
functions
[ Upstream commit 85fbaa75037d0b6b786ff18658ddf0b4014ce2a4 ]
Commit bceaa90240b6019ed73b49965eac7d167610be69 ("inet: prevent leakage
of uninitialized memory to user in recv syscalls") conditionally updated
addr_len if the msg_name is written to. The recv_error and rxpmtu
functions relied on the recvmsg functions to set up addr_len before.
As this does not happen any more we have to pass addr_len to those
functions as well and set it to the size of the corresponding sockaddr
length.
This broke traceroute and such.
Fixes: bceaa90240b6 ("inet: prevent leakage of uninitialized memory to user in recv syscalls")
Reported-by: Brad Spengler <spender@grsecurity.net>
Reported-by: Tom Labanowski
Cc: mpb <mpb.mail@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
sockaddr_storage)
[ Upstream commit 68c6beb373955da0886d8f4f5995b3922ceda4be ]
In that case it is probable that kernel code overwrote part of the
stack. So we should bail out loudly here.
The BUG_ON may be removed in future if we are sure all protocols are
conformant.
Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit f3d3342602f8bcbf37d7c46641cb9bca7618eb1c ]
This patch now always passes msg->msg_namelen as 0. recvmsg handlers must
set msg_namelen to the proper size <= sizeof(struct sockaddr_storage)
to return msg_name to the user.
This prevents numerous uninitialized memory leaks we had in the
recvmsg handlers and makes it harder for new code to accidentally leak
uninitialized memory.
Optimize for the case recvfrom is called with NULL as address. We don't
need to copy the address at all, so set it to NULL before invoking the
recvmsg handler. We can do so, because all the recvmsg handlers must
cope with the case a plain read() is called on them. read() also sets
msg_name to NULL.
Also document these changes in include/linux/net.h as suggested by David
Miller.
Changes since RFC:
Set msg->msg_name = NULL if user specified a NULL in msg_name but had a
non-null msg_namelen in verify_iovec/verify_compat_iovec. This doesn't
affect sendto as it would bail out earlier while trying to copy-in the
address. It also more naturally reflects the logic by the callers of
verify_iovec.
With this change in place I could remove "
if (!uaddr || msg_sys->msg_namelen == 0)
msg->msg_name = NULL
".
This change does not alter the user visible error logic as we ignore
msg_namelen as long as msg_name is NULL.
Also remove two unnecessary curly brackets in ___sys_recvmsg and change
comments to netdev style.
Cc: David Miller <davem@davemloft.net>
Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit bceaa90240b6019ed73b49965eac7d167610be69 ]
Only update *addr_len when we actually fill in sockaddr, otherwise we
can return uninitialized memory from the stack to the caller in the
recvfrom, recvmmsg and recvmsg syscalls. Drop the the (addr_len == NULL)
checks because we only get called with a valid addr_len pointer either
from sock_common_recvmsg or inet_recvmsg.
If a blocking read waits on a socket which is concurrently shut down we
now return zero and set msg_msgnamelen to 0.
Reported-by: mpb <mpb.mail@gmail.com>
Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit c9e9042994d37cbc1ee538c500e9da1bb9d1bcdf ]
ip4_datagram_connect() being called from process context,
it should use IP_INC_STATS() instead of IP_INC_STATS_BH()
otherwise we can deadlock on 32bit arches, or get corruptions of
SNMP counters.
Fixes: 584bdf8cbdf6 ("[IPV4]: Fix "ipOutNoRoutes" counter error for TCP and UDP")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit dccf76ca6b626c0c4a4e09bb221adee3270ab0ef ]
We had some reports of crashes using TCP fastopen, and Dave Jones
gave a nice stack trace pointing to the error.
Issue is that tcp_get_metrics() should not be called with a NULL dst
Fixes: 1fe4c481ba637 ("net-tcp: Fast Open client - cookie cache")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Dave Jones <davej@redhat.com>
Cc: Yuchung Cheng <ycheng@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Tested-by: Dave Jones <davej@fedoraproject.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 98e09386c0ef4dfd48af7ba60ff908f0d525cdee ]
After commit c9eeec26e32e ("tcp: TSQ can use a dynamic limit"), several
users reported throughput regressions, notably on mvneta and wifi
adapters.
802.11 AMPDU requires a fair amount of queueing to be effective.
This patch partially reverts the change done in tcp_write_xmit()
so that the minimal amount is sysctl_tcp_limit_output_bytes.
It also remove the use of this sysctl while building skb stored
in write queue, as TSO autosizing does the right thing anyway.
Users with well behaving NICS and correct qdisc (like sch_fq),
can then lower the default sysctl_tcp_limit_output_bytes value from
128KB to 8KB.
This new usage of sysctl_tcp_limit_output_bytes permits each driver
authors to check how their driver performs when/if the value is set
to a minimum of 4KB.
Normally, line rate for a single TCP flow should be possible,
but some drivers rely on timers to perform TX completion and
too long TX completion delays prevent reaching full throughput.
Fixes: c9eeec26e32e ("tcp: TSQ can use a dynamic limit")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Sujith Manoharan <sujith@msujith.org>
Reported-by: Arnaud Ebalard <arno@natisbad.org>
Tested-by: Sujith Manoharan <sujith@msujith.org>
Cc: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 1188f05497e7bd2f2614b99c54adfbe7413d5749 ]
If priority/traffic class field in IPv6 header is set (seen when
using ssh), the uncompression sets the TC and Flow fields incorrectly.
Example:
This is IPv6 header of a sent packet. Note the priority/TC (=1) in
the first byte.
00000000: 61 00 00 00 00 2c 06 40 fe 80 00 00 00 00 00 00
00000010: 02 02 72 ff fe c6 42 10 fe 80 00 00 00 00 00 00
00000020: 02 1e ab ff fe 4c 52 57
This gets compressed like this in the sending side
00000000: 72 31 04 06 02 1e ab ff fe 4c 52 57 ec c2 00 16
00000010: aa 2d fe 92 86 4e be c6 ....
In the receiving end, the packet gets uncompressed to this
IPv6 header
00000000: 60 06 06 02 00 2a 1e 40 fe 80 00 00 00 00 00 00
00000010: 02 02 72 ff fe c6 42 10 fe 80 00 00 00 00 00 00
00000020: ab ff fe 4c 52 57 ec c2
First four bytes are set incorrectly and we have also lost
two bytes from destination address.
The fix is to switch the case values in switch statement
when checking the TC field.
Signed-off-by: Jukka Rissanen <jukka.rissanen@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit f8c31c8f80dd882f7eb49276989a4078d33d67a7 ]
Fixes a suspicious rcu derference warning.
Cc: Florent Fourcot <florent.fourcot@enst-bretagne.fr>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit f104a567e673f382b09542a8dc3500aa689957b4 ]
As the rfc 4191 said, the Router Preference and Lifetime values in a
::/0 Route Information Option should override the preference and lifetime
values in the Router Advertisement header. But when the kernel deals with
a ::/0 Route Information Option, the rt6_get_route_info() always return
NULL, that means that overriding will not happen, because those default
routers were added without flag RTF_ROUTEINFO in rt6_add_dflt_router().
In order to deal with that condition, we should call rt6_get_dflt_router
when the prefix length is 0.
Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 13eb2ab2d33c57ebddc57437a7d341995fc9138c ]
When trying to delete a table >= 256 using iproute2 the local table
will be deleted.
The table id is specified as a netlink attribute when it needs more then
8 bits and iproute2 then sets the table field to RT_TABLE_UNSPEC (0).
Preconditions to matching the table id in the rule delete code
doesn't seem to take the "table id in netlink attribute" into condition
so the frh_get_table helper function never gets to do its job when
matching against current rule.
Use the helper function twice instead of peaking at the table value directly.
Originally reported at: http://bugs.debian.org/724783
Reported-by: Nicolas HICHER <nhicher@avencall.com>
Signed-off-by: Andreas Henriksson <andreas@fatal.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 0e033e04c2678dbbe74a46b23fffb7bb918c288e ]
Commit 1e2bd517c108816220f262d7954b697af03b5f9c ("udp6: Fix udp
fragmentation for tunnel traffic.") changed the calculation if
there is enough space to include a fragment header in the skb from a
skb->mac_header dervived one to skb_headroom. Because we already peeled
off the skb to transport_header this is wrong. Change this back to check
if we have enough room before the mac_header.
This fixes a panic Saran Neti reported. He used the tbf scheduler which
skb_gso_segments the skb. The offsets get negative and we panic in memcpy
because the skb was erroneously not expanded at the head.
Reported-by: Saran Neti <Saran.Neti@telus.com>
Cc: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 79845c662eeb95c9a180b9bd0d3ad848ee65b94c upstream.
Since rdev->sched_scan_req is dereferenced outside the
lock protecting it, this might be done at the wrong
time, causing crashes. Move the dereference to where
it should be - inside the RTNL locked section.
Reviewed-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit e0d6cb9cd3a3ac8a3b8e5b22b83c4f8619786f22 upstream.
This driver adds an attribute to the existing virtio device so a CHANGE
event is required in order udev rules to make use of it. The ADD event
happens before this driver is probed and unlike a more typical driver
like a block device there isn't a higher level device to watch for.
Signed-off-by: Michael Marineau <michael.marineau@coreos.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit a6b31d18b02ff9d7915c5898c9b5ca41a798cd73 upstream.
The following scenario can cause silent data corruption when doing
NFS writes. It has mainly been observed when doing database writes
using O_DIRECT.
1) The RPC client uses sendpage() to do zero-copy of the page data.
2) Due to networking issues, the reply from the server is delayed,
and so the RPC client times out.
3) The client issues a second sendpage of the page data as part of
an RPC call retransmission.
4) The reply to the first transmission arrives from the server
_before_ the client hardware has emptied the TCP socket send
buffer.
5) After processing the reply, the RPC state machine rules that
the call to be done, and triggers the completion callbacks.
6) The application notices the RPC call is done, and reuses the
pages to store something else (e.g. a new write).
7) The client NIC drains the TCP socket send buffer. Since the
page data has now changed, it reads a corrupted version of the
initial RPC call, and puts it on the wire.
This patch fixes the problem in the following manner:
The ordering guarantees of TCP ensure that when the server sends a
reply, then we know that the _first_ transmission has completed. Using
zero-copy in that situation is therefore safe.
If a time out occurs, we then send the retransmission using sendmsg()
(i.e. no zero-copy), We then know that the socket contains a full copy of
the data, and so it will retransmit a faithful reproduction even if the
RPC call completes, and the application reuses the O_DIRECT buffer in
the meantime.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit f1ff0c27fd9987c59d707cd1a6b6c1fc3ae0a250 upstream.
The NFS layer needs to know when a key has expired.
This change also returns -EKEYEXPIRED to the application, and the informative
"Key has expired" error message is displayed. The user then knows that
credential renewal is required.
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 6c519bad7b19a2c14a075b400edabaa630330123 upstream.
batman-adv saves its table of packet handlers as a global state, so handlers
must be set up only once (and setting them up a second time will fail).
The recently-added network coding support tries to set up its handler each time
a new softif is registered, which obviously fails when more that one softif is
used (and in consequence, the softif creation fails).
Fix this by splitting up batadv_nc_init into batadv_nc_init (which is called
only once) and batadv_nc_mesh_init (which is called for each softif); in
addition batadv_nc_free is renamed to batadv_nc_mesh_free to keep naming
consistent.
Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 6f092343855a71e03b8d209815d8c45bf3a27fcd ]
We don't validate iph->ihl which may lead a dead loop if we meet a IPIP
skb whose iph->ihl is zero. Fix this by failing immediately when iph->ihl
is evil (less than 5).
This issue were introduced by commit ec5efe7946280d1e84603389a1030ccec0a767ae
(rps: support IPIP encapsulation).
Signed-off-by: Jason Wang <jasowang@redhat.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Petr Matousek <pmatouse@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit e3bc10bd95d7fcc3f2ac690c6ff22833ea6781d6 ]
On receiving a packet too big icmp error we check if our current cached
dst_entry in the socket is still valid. This validation check did not
care about the expiration of the (cached) route.
The error path I traced down:
The socket receives a packet too big mtu notification. It still has a
valid dst_entry and thus issues the ip6_rt_pmtu_update on this dst_entry,
setting RTF_EXPIRE and updates the dst.expiration value (which could
fail because of not up-to-date expiration values, see previous patch).
In some seldom cases we race with a) the ip6_fib gc or b) another routing
lookup which would result in a recreation of the cached rt6_info from its
parent non-cached rt6_info. While copying the rt6_info we reinitialize the
metrics store by copying it over from the parent thus invalidating the
just installed pmtu update (both dsts use the same key to the inetpeer
storage). The dst_entry with the just invalidated metrics data would
just get its RTF_EXPIRES flag cleared and would continue to stay valid
for the socket.
We should have not issued the pmtu update on the already expired dst_entry
in the first placed. By checking the expiration on the dst entry and
doing a relookup in case it is out of date we close the race because
we would install a new rt6_info into the fib before we issue the pmtu
update, thus closing this race.
Not reliably updating the dst.expire value was fixed by the patch "ipv6:
reset dst.expires value when clearing expire flag".
Reported-by: Steinar H. Gunderson <sgunderson@bigfoot.com>
Reported-by: Valentijn Sessink <valentyn@blub.net>
Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Tested-by: Valentijn Sessink <valentyn@blub.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ No applicable upstream commit, the upstream implementation is
now completely different and doesn't have this bug. ]
In case of WCCPv2 GRE header has extra four bytes. Following
patch pull those extra four bytes so that skb offsets are set
correctly.
CC: Eric Dumazet <eric.dumazet@gmail.com>
Reported-by: Peter Schmitt <peter.schmitt82@yahoo.de>
Tested-by: Peter Schmitt <peter.schmitt82@yahoo.de>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit f478f33a93f9353dcd1fe55445343d76b1c3f84a upstream.
Fix kernel warning when using WEXT for configuring ad-hoc mode,
e.g. "iwconfig wlan0 essid test channel 1"
WARNING: at net/wireless/chan.c:373 cfg80211_chandef_usable+0x50/0x21c [cfg80211]()
The warning is caused by an uninitialized variable center_freq1.
Signed-off-by: Bruno Randolf <br1@einfach.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit d86aa4f8ca58898ec6a94c0635da20b948171ed7 upstream.
If a frame's timestamp is calculated, and the bitrate
calculation goes wrong and returns zero, the system
will attempt to divide by zero and crash. Catch this
case and print the rate information that the driver
reported when this happens.
Reported-by: Thomas Lindroth <thomas.lindroth@gmail.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 0c5b93290b2f3c7a376567c03ae8d385b0e99851 upstream.
When clients are idle for too long, hostapd sends nullfunc frames for
probing. When those are acked by the client, the idle time needs to be
updated.
To make this work (and to avoid unnecessary probing), update sta->last_rx
whenever an ACK was received for a tx packet. Only do this if the flag
IEEE80211_HW_REPORTS_TX_ACK_STATUS is set.
Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 03bb7f42765ce596604f03d179f3137d7df05bba upstream.
This allows calls for clients in AP_VLANs (e.g. for 4-addr) to succeed
Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 6329b8d917adc077caa60c2447385554130853a3 upstream.
If an Ad-Hoc node receives packets with the Cell ID or its own MAC
address as source address, it hits a WARN_ON in sta_info_insert_check()
With many packets, this can massively spam the logs. One way that this
can easily happen is through having Cisco APs in the area with rouge AP
detection and countermeasures enabled.
Such Cisco APs will regularly send fake beacons, disassoc and deauth
packets that trigger these warnings.
To fix this issue, drop such spoofed packets early in the rx path.
Reported-by: Thomas Huehn <thomas@net.t-labs.tu-berlin.de>
Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit a754055a1296fcbe6f32de3a5eaca6efb2fd1865 upstream.
__ieee80211_scan_completed is called from a worker. This
means that the following flow is possible.
* driver calls ieee80211_scan_completed
* mac80211 cancels the scan (that is already complete)
* __ieee80211_scan_completed runs
When scan_work will finally run, it will see that the scan
hasn't been aborted and might even trigger another scan on
another band. This leads to a situation where cfg80211's
scan is not done and no further scan can be issued.
Fix this by setting a new flag when a HW scan is being
cancelled so that no other scan will be triggered.
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit f5563318ff1bde15b10e736e97ffce13be08bc1a upstream.
When parsing an invalid radiotap header, the parser can overrun
the buffer that is passed in because it doesn't correctly check
1) the minimum radiotap header size
2) the space for extended bitmaps
The first issue doesn't affect any in-kernel user as they all
check the minimum size before calling the radiotap function.
The second issue could potentially affect the kernel if an skb
is passed in that consists only of the radiotap header with a
lot of extended bitmaps that extend past the SKB. In that case
a read-only buffer overrun by at most 4 bytes is possible.
Fix this by adding the appropriate checks to the parser.
Reported-by: Evan Huus <eapache@gmail.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit c2f17e827b419918c856131f592df9521e1a38e3 ]
Routes need to be probed asynchronous otherwise the call stack gets
exhausted when the kernel attemps to deliver another skb inline, like
e.g. xt_TEE does, and we probe at the same time.
We update neigh->updated still at once, otherwise we would send to
many probes.
Cc: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 56e42441ed54b092d6c7411138ce60d049e7c731 ]
Now when rt6_nexthop() can return nexthop address we can use it
for proper nexthop comparison of directly connected destinations.
For more information refer to commit bbb5823cf742a7
("netfilter: nf_conntrack: fix rt_gateway checks for H.323 helper").
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 550bab42f83308c9d6ab04a980cc4333cef1c8fa ]
Make sure rt6i_gateway contains nexthop information in
all routes returned from lookup or when routes are directly
attached to skb for generated ICMP packets.
The effect of this patch should be a faster version of
rt6_nexthop() and the consideration of local addresses as
nexthop.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ This is a simplified -stable version of a set of upstream commits. ]
This is a replacement patch only for stable which does fix the problems
handled by the following two commits in -net:
"ip_output: do skb ufo init for peeked non ufo skb as well" (e93b7d748be887cd7639b113ba7d7ef792a7efb9)
"ip6_output: do skb ufo init for peeked non ufo skb as well" (c547dbf55d5f8cf615ccc0e7265e98db27d3fb8b)
Three frames are written on a corked udp socket for which the output
netdevice has UFO enabled. If the first and third frame are smaller than
the mtu and the second one is bigger, we enqueue the second frame with
skb_append_datato_frags without initializing the gso fields. This leads
to the third frame appended regulary and thus constructing an invalid skb.
This fixes the problem by always using skb_append_datato_frags as soon
as the first frag got enqueued to the skb without marking the packet
as SKB_GSO_UDP.
The problem with only two frames for ipv6 was fixed by "ipv6: udp
packets following an UFO enqueued packet need also be handled by UFO"
(2811ebac2521ceac84f2bdae402455baa6a7fb47).
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 90c6bd34f884cd9cee21f1d152baf6c18bcac949 ]
In the case of credentials passing in unix stream sockets (dgram
sockets seem not affected), we get a rather sparse race after
commit 16e5726 ("af_unix: dont send SCM_CREDENTIALS by default").
We have a stream server on receiver side that requests credential
passing from senders (e.g. nc -U). Since we need to set SO_PASSCRED
on each spawned/accepted socket on server side to 1 first (as it's
not inherited), it can happen that in the time between accept() and
setsockopt() we get interrupted, the sender is being scheduled and
continues with passing data to our receiver. At that time SO_PASSCRED
is neither set on sender nor receiver side, hence in cmsg's
SCM_CREDENTIALS we get eventually pid:0, uid:65534, gid:65534
(== overflow{u,g}id) instead of what we actually would like to see.
On the sender side, here nc -U, the tests in maybe_add_creds()
invoked through unix_stream_sendmsg() would fail, as at that exact
time, as mentioned, the sender has neither SO_PASSCRED on his side
nor sees it on the server side, and we have a valid 'other' socket
in place. Thus, sender believes it would just look like a normal
connection, not needing/requesting SO_PASSCRED at that time.
As reverting 16e5726 would not be an option due to the significant
performance regression reported when having creds always passed,
one way/trade-off to prevent that would be to set SO_PASSCRED on
the listener socket and allow inheriting these flags to the spawned
socket on server side in accept(). It seems also logical to do so
if we'd tell the listener socket to pass those flags onwards, and
would fix the race.
Before, strace:
recvmsg(4, {msg_name(0)=NULL, msg_iov(1)=[{"blub\n", 4096}],
msg_controllen=32, {cmsg_len=28, cmsg_level=SOL_SOCKET,
cmsg_type=SCM_CREDENTIALS{pid=0, uid=65534, gid=65534}},
msg_flags=0}, 0) = 5
After, strace:
recvmsg(4, {msg_name(0)=NULL, msg_iov(1)=[{"blub\n", 4096}],
msg_controllen=32, {cmsg_len=28, cmsg_level=SOL_SOCKET,
cmsg_type=SCM_CREDENTIALS{pid=11580, uid=1000, gid=1000}},
msg_flags=0}, 0) = 5
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit d2dbbba77e95dff4b4f901fee236fef6d9552072 ]
IP/IPv6 fragmentation knows how to compute only TCP/UDP checksum.
This causes problems if SCTP packets has to be fragmented and
ipsummed has been set to PARTIAL due to checksum offload support.
This condition can happen when retransmitting after MTU discover,
or when INIT or other control chunks are larger then MTU.
Check for the rare fragmentation condition in SCTP and use software
checksum calculation in this case.
CC: Fan Du <fan.du@windriver.com>
Signed-off-by: Vlad Yasevich <vyasevich@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 27127a82561a2a3ed955ce207048e1b066a80a2a ]
igb/ixgbe have hardware sctp checksum support, when this feature is enabled
and also IPsec is armed to protect sctp traffic, ugly things happened as
xfrm_output checks CHECKSUM_PARTIAL to do checksum operation(sum every thing
up and pack the 16bits result in the checksum field). The result is fail
establishment of sctp communication.
Signed-off-by: Fan Du <fan.du@windriver.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Vlad Yasevich <vyasevich@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 4b6c7879d84ad06a2ac5b964808ed599187a188d ]
Commit be4f154d5ef0ca147ab6bcd38857a774133f5450
bridge: Clamp forward_delay when enabling STP
had a typo when attempting to clamp maximum forward delay.
It is possible to set bridge_forward_delay to be higher then
permitted maximum when STP is off. When turning STP on, the
higher then allowed delay has to be clamed down to max value.
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
CC: Herbert Xu <herbert@gondor.apana.org.au>
CC: Stephen Hemminger <shemminger@vyatta.com>
Reviewed-by: Veaceslav Falico <vfalico@redhat.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 6865d1e834be84ddd5808d93d5035b492346c64a ]
When filling the netlink message we miss to wipe the pad field,
therefore leak one byte of heap memory to userland. Fix this by
setting pad to 0.
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 455cc32bf128e114455d11ad919321ab89a2c312 ]
François Cachereul made a very nice bug report and suspected
the bh_lock_sock() / bh_unlok_sock() pair used in l2tp_xmit_skb() from
process context was not good.
This problem was added by commit 6af88da14ee284aaad6e4326da09a89191ab6165
("l2tp: Fix locking in l2tp_core.c").
l2tp_eth_dev_xmit() runs from BH context, so we must disable BH
from other l2tp_xmit_skb() users.
[ 452.060011] BUG: soft lockup - CPU#1 stuck for 23s! [accel-pppd:6662]
[ 452.061757] Modules linked in: l2tp_ppp l2tp_netlink l2tp_core pppoe pppox
ppp_generic slhc ipv6 ext3 mbcache jbd virtio_balloon xfs exportfs dm_mod
virtio_blk ata_generic virtio_net floppy ata_piix libata virtio_pci virtio_ring virtio [last unloaded: scsi_wait_scan]
[ 452.064012] CPU 1
[ 452.080015] BUG: soft lockup - CPU#2 stuck for 23s! [accel-pppd:6643]
[ 452.080015] CPU 2
[ 452.080015]
[ 452.080015] Pid: 6643, comm: accel-pppd Not tainted 3.2.46.mini #1 Bochs Bochs
[ 452.080015] RIP: 0010:[<ffffffff81059f6c>] [<ffffffff81059f6c>] do_raw_spin_lock+0x17/0x1f
[ 452.080015] RSP: 0018:ffff88007125fc18 EFLAGS: 00000293
[ 452.080015] RAX: 000000000000aba9 RBX: ffffffff811d0703 RCX: 0000000000000000
[ 452.080015] RDX: 00000000000000ab RSI: ffff8800711f6896 RDI: ffff8800745c8110
[ 452.080015] RBP: ffff88007125fc18 R08: 0000000000000020 R09: 0000000000000000
[ 452.080015] R10: 0000000000000000 R11: 0000000000000280 R12: 0000000000000286
[ 452.080015] R13: 0000000000000020 R14: 0000000000000240 R15: 0000000000000000
[ 452.080015] FS: 00007fdc0cc24700(0000) GS:ffff8800b6f00000(0000) knlGS:0000000000000000
[ 452.080015] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 452.080015] CR2: 00007fdb054899b8 CR3: 0000000074404000 CR4: 00000000000006a0
[ 452.080015] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 452.080015] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 452.080015] Process accel-pppd (pid: 6643, threadinfo ffff88007125e000, task ffff8800b27e6dd0)
[ 452.080015] Stack:
[ 452.080015] ffff88007125fc28 ffffffff81256559 ffff88007125fc98 ffffffffa01b2bd1
[ 452.080015] ffff88007125fc58 000000000000000c 00000000029490d0 0000009c71dbe25e
[ 452.080015] 000000000000005c 000000080000000e 0000000000000000 ffff880071170600
[ 452.080015] Call Trace:
[ 452.080015] [<ffffffff81256559>] _raw_spin_lock+0xe/0x10
[ 452.080015] [<ffffffffa01b2bd1>] l2tp_xmit_skb+0x189/0x4ac [l2tp_core]
[ 452.080015] [<ffffffffa01c2d36>] pppol2tp_sendmsg+0x15e/0x19c [l2tp_ppp]
[ 452.080015] [<ffffffff811c7872>] __sock_sendmsg_nosec+0x22/0x24
[ 452.080015] [<ffffffff811c83bd>] sock_sendmsg+0xa1/0xb6
[ 452.080015] [<ffffffff81254e88>] ? __schedule+0x5c1/0x616
[ 452.080015] [<ffffffff8103c7c6>] ? __dequeue_signal+0xb7/0x10c
[ 452.080015] [<ffffffff810bbd21>] ? fget_light+0x75/0x89
[ 452.080015] [<ffffffff811c8444>] ? sockfd_lookup_light+0x20/0x56
[ 452.080015] [<ffffffff811c9b34>] sys_sendto+0x10c/0x13b
[ 452.080015] [<ffffffff8125cac2>] system_call_fastpath+0x16/0x1b
[ 452.080015] Code: 81 48 89 e5 72 0c 31 c0 48 81 ff 45 66 25 81 0f 92 c0 5d c3 55 b8 00 01 00 00 48 89 e5 f0 66 0f c1 07 0f b6 d4 38 d0 74 06 f3 90 <8a> 07 eb f6 5d c3 90 90 55 48 89 e5 9c 58 0f 1f 44 00 00 5d c3
[ 452.080015] Call Trace:
[ 452.080015] [<ffffffff81256559>] _raw_spin_lock+0xe/0x10
[ 452.080015] [<ffffffffa01b2bd1>] l2tp_xmit_skb+0x189/0x4ac [l2tp_core]
[ 452.080015] [<ffffffffa01c2d36>] pppol2tp_sendmsg+0x15e/0x19c [l2tp_ppp]
[ 452.080015] [<ffffffff811c7872>] __sock_sendmsg_nosec+0x22/0x24
[ 452.080015] [<ffffffff811c83bd>] sock_sendmsg+0xa1/0xb6
[ 452.080015] [<ffffffff81254e88>] ? __schedule+0x5c1/0x616
[ 452.080015] [<ffffffff8103c7c6>] ? __dequeue_signal+0xb7/0x10c
[ 452.080015] [<ffffffff810bbd21>] ? fget_light+0x75/0x89
[ 452.080015] [<ffffffff811c8444>] ? sockfd_lookup_light+0x20/0x56
[ 452.080015] [<ffffffff811c9b34>] sys_sendto+0x10c/0x13b
[ 452.080015] [<ffffffff8125cac2>] system_call_fastpath+0x16/0x1b
[ 452.064012]
[ 452.064012] Pid: 6662, comm: accel-pppd Not tainted 3.2.46.mini #1 Bochs Bochs
[ 452.064012] RIP: 0010:[<ffffffff81059f6e>] [<ffffffff81059f6e>] do_raw_spin_lock+0x19/0x1f
[ 452.064012] RSP: 0018:ffff8800b6e83ba0 EFLAGS: 00000297
[ 452.064012] RAX: 000000000000aaa9 RBX: ffff8800b6e83b40 RCX: 0000000000000002
[ 452.064012] RDX: 00000000000000aa RSI: 000000000000000a RDI: ffff8800745c8110
[ 452.064012] RBP: ffff8800b6e83ba0 R08: 000000000000c802 R09: 000000000000001c
[ 452.064012] R10: ffff880071096c4e R11: 0000000000000006 R12: ffff8800b6e83b18
[ 452.064012] R13: ffffffff8125d51e R14: ffff8800b6e83ba0 R15: ffff880072a589c0
[ 452.064012] FS: 00007fdc0b81e700(0000) GS:ffff8800b6e80000(0000) knlGS:0000000000000000
[ 452.064012] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 452.064012] CR2: 0000000000625208 CR3: 0000000074404000 CR4: 00000000000006a0
[ 452.064012] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 452.064012] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 452.064012] Process accel-pppd (pid: 6662, threadinfo ffff88007129a000, task ffff8800744f7410)
[ 452.064012] Stack:
[ 452.064012] ffff8800b6e83bb0 ffffffff81256559 ffff8800b6e83bc0 ffffffff8121c64a
[ 452.064012] ffff8800b6e83bf0 ffffffff8121ec7a ffff880072a589c0 ffff880071096c62
[ 452.064012] 0000000000000011 ffffffff81430024 ffff8800b6e83c80 ffffffff8121f276
[ 452.064012] Call Trace:
[ 452.064012] <IRQ>
[ 452.064012] [<ffffffff81256559>] _raw_spin_lock+0xe/0x10
[ 452.064012] [<ffffffff8121c64a>] spin_lock+0x9/0xb
[ 452.064012] [<ffffffff8121ec7a>] udp_queue_rcv_skb+0x186/0x269
[ 452.064012] [<ffffffff8121f276>] __udp4_lib_rcv+0x297/0x4ae
[ 452.064012] [<ffffffff8121c178>] ? raw_rcv+0xe9/0xf0
[ 452.064012] [<ffffffff8121f4a7>] udp_rcv+0x1a/0x1c
[ 452.064012] [<ffffffff811fe385>] ip_local_deliver_finish+0x12b/0x1a5
[ 452.064012] [<ffffffff811fe54e>] ip_local_deliver+0x53/0x84
[ 452.064012] [<ffffffff811fe1d0>] ip_rcv_finish+0x2bc/0x2f3
[ 452.064012] [<ffffffff811fe78f>] ip_rcv+0x210/0x269
[ 452.064012] [<ffffffff8101911e>] ? kvm_clock_get_cycles+0x9/0xb
[ 452.064012] [<ffffffff811d88cd>] __netif_receive_skb+0x3a5/0x3f7
[ 452.064012] [<ffffffff811d8eba>] netif_receive_skb+0x57/0x5e
[ 452.064012] [<ffffffff811cf30f>] ? __netdev_alloc_skb+0x1f/0x3b
[ 452.064012] [<ffffffffa0049126>] virtnet_poll+0x4ba/0x5a4 [virtio_net]
[ 452.064012] [<ffffffff811d9417>] net_rx_action+0x73/0x184
[ 452.064012] [<ffffffffa01b2cc2>] ? l2tp_xmit_skb+0x27a/0x4ac [l2tp_core]
[ 452.064012] [<ffffffff810343b9>] __do_softirq+0xc3/0x1a8
[ 452.064012] [<ffffffff81013b56>] ? ack_APIC_irq+0x10/0x12
[ 452.064012] [<ffffffff81256559>] ? _raw_spin_lock+0xe/0x10
[ 452.064012] [<ffffffff8125e0ac>] call_softirq+0x1c/0x26
[ 452.064012] [<ffffffff81003587>] do_softirq+0x45/0x82
[ 452.064012] [<ffffffff81034667>] irq_exit+0x42/0x9c
[ 452.064012] [<ffffffff8125e146>] do_IRQ+0x8e/0xa5
[ 452.064012] [<ffffffff8125676e>] common_interrupt+0x6e/0x6e
[ 452.064012] <EOI>
[ 452.064012] [<ffffffff810b82a1>] ? kfree+0x8a/0xa3
[ 452.064012] [<ffffffffa01b2cc2>] ? l2tp_xmit_skb+0x27a/0x4ac [l2tp_core]
[ 452.064012] [<ffffffffa01b2c25>] ? l2tp_xmit_skb+0x1dd/0x4ac [l2tp_core]
[ 452.064012] [<ffffffffa01c2d36>] pppol2tp_sendmsg+0x15e/0x19c [l2tp_ppp]
[ 452.064012] [<ffffffff811c7872>] __sock_sendmsg_nosec+0x22/0x24
[ 452.064012] [<ffffffff811c83bd>] sock_sendmsg+0xa1/0xb6
[ 452.064012] [<ffffffff81254e88>] ? __schedule+0x5c1/0x616
[ 452.064012] [<ffffffff8103c7c6>] ? __dequeue_signal+0xb7/0x10c
[ 452.064012] [<ffffffff810bbd21>] ? fget_light+0x75/0x89
[ 452.064012] [<ffffffff811c8444>] ? sockfd_lookup_light+0x20/0x56
[ 452.064012] [<ffffffff811c9b34>] sys_sendto+0x10c/0x13b
[ 452.064012] [<ffffffff8125cac2>] system_call_fastpath+0x16/0x1b
[ 452.064012] Code: 89 e5 72 0c 31 c0 48 81 ff 45 66 25 81 0f 92 c0 5d c3 55 b8 00 01 00 00 48 89 e5 f0 66 0f c1 07 0f b6 d4 38 d0 74 06 f3 90 8a 07 <eb> f6 5d c3 90 90 55 48 89 e5 9c 58 0f 1f 44 00 00 5d c3 55 48
[ 452.064012] Call Trace:
[ 452.064012] <IRQ> [<ffffffff81256559>] _raw_spin_lock+0xe/0x10
[ 452.064012] [<ffffffff8121c64a>] spin_lock+0x9/0xb
[ 452.064012] [<ffffffff8121ec7a>] udp_queue_rcv_skb+0x186/0x269
[ 452.064012] [<ffffffff8121f276>] __udp4_lib_rcv+0x297/0x4ae
[ 452.064012] [<ffffffff8121c178>] ? raw_rcv+0xe9/0xf0
[ 452.064012] [<ffffffff8121f4a7>] udp_rcv+0x1a/0x1c
[ 452.064012] [<ffffffff811fe385>] ip_local_deliver_finish+0x12b/0x1a5
[ 452.064012] [<ffffffff811fe54e>] ip_local_deliver+0x53/0x84
[ 452.064012] [<ffffffff811fe1d0>] ip_rcv_finish+0x2bc/0x2f3
[ 452.064012] [<ffffffff811fe78f>] ip_rcv+0x210/0x269
[ 452.064012] [<ffffffff8101911e>] ? kvm_clock_get_cycles+0x9/0xb
[ 452.064012] [<ffffffff811d88cd>] __netif_receive_skb+0x3a5/0x3f7
[ 452.064012] [<ffffffff811d8eba>] netif_receive_skb+0x57/0x5e
[ 452.064012] [<ffffffff811cf30f>] ? __netdev_alloc_skb+0x1f/0x3b
[ 452.064012] [<ffffffffa0049126>] virtnet_poll+0x4ba/0x5a4 [virtio_net]
[ 452.064012] [<ffffffff811d9417>] net_rx_action+0x73/0x184
[ 452.064012] [<ffffffffa01b2cc2>] ? l2tp_xmit_skb+0x27a/0x4ac [l2tp_core]
[ 452.064012] [<ffffffff810343b9>] __do_softirq+0xc3/0x1a8
[ 452.064012] [<ffffffff81013b56>] ? ack_APIC_irq+0x10/0x12
[ 452.064012] [<ffffffff81256559>] ? _raw_spin_lock+0xe/0x10
[ 452.064012] [<ffffffff8125e0ac>] call_softirq+0x1c/0x26
[ 452.064012] [<ffffffff81003587>] do_softirq+0x45/0x82
[ 452.064012] [<ffffffff81034667>] irq_exit+0x42/0x9c
[ 452.064012] [<ffffffff8125e146>] do_IRQ+0x8e/0xa5
[ 452.064012] [<ffffffff8125676e>] common_interrupt+0x6e/0x6e
[ 452.064012] <EOI> [<ffffffff810b82a1>] ? kfree+0x8a/0xa3
[ 452.064012] [<ffffffffa01b2cc2>] ? l2tp_xmit_skb+0x27a/0x4ac [l2tp_core]
[ 452.064012] [<ffffffffa01b2c25>] ? l2tp_xmit_skb+0x1dd/0x4ac [l2tp_core]
[ 452.064012] [<ffffffffa01c2d36>] pppol2tp_sendmsg+0x15e/0x19c [l2tp_ppp]
[ 452.064012] [<ffffffff811c7872>] __sock_sendmsg_nosec+0x22/0x24
[ 452.064012] [<ffffffff811c83bd>] sock_sendmsg+0xa1/0xb6
[ 452.064012] [<ffffffff81254e88>] ? __schedule+0x5c1/0x616
[ 452.064012] [<ffffffff8103c7c6>] ? __dequeue_signal+0xb7/0x10c
[ 452.064012] [<ffffffff810bbd21>] ? fget_light+0x75/0x89
[ 452.064012] [<ffffffff811c8444>] ? sockfd_lookup_light+0x20/0x56
[ 452.064012] [<ffffffff811c9b34>] sys_sendto+0x10c/0x13b
[ 452.064012] [<ffffffff8125cac2>] system_call_fastpath+0x16/0x1b
Reported-by: François Cachereul <f.cachereul@alphalink.fr>
Tested-by: François Cachereul <f.cachereul@alphalink.fr>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 7263a5187f9e9de45fcb51349cf0e031142c19a1 ]
This patch fixes and improves the use of vti interfaces (while
lightly changing the way of configuring them).
Currently:
- it is necessary to identify and mark inbound IPsec
packets destined to each vti interface, via netfilter rules in
the mangle table at prerouting hook.
- the vti module cannot retrieve the right tunnel in input since
commit b9959fd3: vti tunnels all have an i_key, but the tunnel lookup
is done with flag TUNNEL_NO_KEY, so there no chance to retrieve them.
- the i_key is used by the outbound processing as a mark to lookup
for the right SP and SA bundle.
This patch uses the o_key to store the vti mark (instead of i_key) and
enables:
- to avoid the need for previously marking the inbound skbuffs via a
netfilter rule.
- to properly retrieve the right tunnel in input, only based on the IPsec
packet outer addresses.
- to properly perform an inbound policy check (using the tunnel o_key
as a mark).
- to properly perform an outbound SPD and SAD lookup (using the tunnel
o_key as a mark).
- to keep the current mark of the skbuff. The skbuff mark is neither
used nor changed by the vti interface. Only the vti interface o_key
is used.
SAs have a wildcard mark.
SPs have a mark equal to the vti interface o_key.
The vti interface must be created as follows (i_key = 0, o_key = mark):
ip link add vti1 mode vti local 1.1.1.1 remote 2.2.2.2 okey 1
The SPs attached to vti1 must be created as follows (mark = vti1 o_key):
ip xfrm policy add dir out mark 1 tmpl src 1.1.1.1 dst 2.2.2.2 \
proto esp mode tunnel
ip xfrm policy add dir in mark 1 tmpl src 2.2.2.2 dst 1.1.1.1 \
proto esp mode tunnel
The SAs are created with the default wildcard mark. There is no
distinction between global vs. vti SAs. Just their addresses will
possibly link them to a vti interface:
ip xfrm state add src 1.1.1.1 dst 2.2.2.2 proto esp spi 1000 mode tunnel \
enc "cbc(aes)" "azertyuiopqsdfgh"
ip xfrm state add src 2.2.2.2 dst 1.1.1.1 proto esp spi 2000 mode tunnel \
enc "cbc(aes)" "sqbdhgqsdjqjsdfh"
To avoid matching "global" (not vti) SPs in vti interfaces, global SPs
should no use the default wildcard mark, but explicitly match mark 0.
To avoid a double SPD lookup in input and output (in global and vti SPDs),
the NOPOLICY and NOXFRM options should be set on the vti interfaces:
echo 1 > /proc/sys/net/ipv4/conf/vti1/disable_policy
echo 1 > /proc/sys/net/ipv4/conf/vti1/disable_xfrm
The outgoing traffic is steered to vti1 by a route via the vti interface:
ip route add 192.168.0.0/16 dev vti1
The incoming IPsec traffic is steered to vti1 because its outer addresses
match the vti1 tunnel configuration.
Signed-off-by: Christophe Gouault <christophe.gouault@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit c33a39c575068c2ea9bffb22fd6de2df19c74b89 ]
This patch fixes the calculation of the nlmsg size, by adding the missing
nla_total_size().
Cc: Patrick McHardy <kaber@trash.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit cb03db9d0e964568407fb08ea46cc2b6b7f67587 ]
net_secret() is only used when CONFIG_IPV6 or CONFIG_INET are selected.
Building a defconfig with both of these symbols unselected (Using the ARM
at91sam9rl_defconfig, for example) leads to the following build warning:
$ make at91sam9rl_defconfig
#
# configuration written to .config
#
$ make net/core/secure_seq.o
scripts/kconfig/conf --silentoldconfig Kconfig
CHK include/config/kernel.release
CHK include/generated/uapi/linux/version.h
CHK include/generated/utsrelease.h
make[1]: `include/generated/mach-types.h' is up to date.
CALL scripts/checksyscalls.sh
CC net/core/secure_seq.o
net/core/secure_seq.c:17:13: warning: 'net_secret_init' defined but not used [-Wunused-function]
Fix this warning by protecting the definition of net_secret() with these
symbols.
Reported-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 0a7e22609067ff524fc7bbd45c6951dd08561667 ]
When sending out multicast messages, the source address in inet->mc_addr is
ignored and rewritten by an autoselected one. This is caused by a typo in
commit 813b3b5db831 ("ipv4: Use caller's on-stack flowi as-is in output
route lookups").
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-b |