Age | Commit message (Collapse) | Author |
|
commit 6329b8d917adc077caa60c2447385554130853a3 upstream.
If an Ad-Hoc node receives packets with the Cell ID or its own MAC
address as source address, it hits a WARN_ON in sta_info_insert_check()
With many packets, this can massively spam the logs. One way that this
can easily happen is through having Cisco APs in the area with rouge AP
detection and countermeasures enabled.
Such Cisco APs will regularly send fake beacons, disassoc and deauth
packets that trigger these warnings.
To fix this issue, drop such spoofed packets early in the rx path.
Reported-by: Thomas Huehn <thomas@net.t-labs.tu-berlin.de>
Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
[bwh: Backported to 3.2: use compare_ether_addr() instead of ether_addr_equal()]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
packet
commit 3a7b21eaf4fb3c971bdb47a98f570550ddfe4471 upstream.
Some Cisco phones create huge messages that are spread over multiple packets.
After calculating the offset of the SIP body, it is validated to be within
the packet and the packet is dropped otherwise. This breaks operation of
these phones. Since connection tracking is supposed to be passive, just let
those packets pass unmodified and untracked.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
[bwh: Backported to 3.2: there is no log message to delete]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ This is a simplified -stable version of a set of upstream commits. ]
This is a replacement patch only for stable which does fix the problems
handled by the following two commits in -net:
"ip_output: do skb ufo init for peeked non ufo skb as well" (e93b7d748be887cd7639b113ba7d7ef792a7efb9)
"ip6_output: do skb ufo init for peeked non ufo skb as well" (c547dbf55d5f8cf615ccc0e7265e98db27d3fb8b)
Three frames are written on a corked udp socket for which the output
netdevice has UFO enabled. If the first and third frame are smaller than
the mtu and the second one is bigger, we enqueue the second frame with
skb_append_datato_frags without initializing the gso fields. This leads
to the third frame appended regulary and thus constructing an invalid skb.
This fixes the problem by always using skb_append_datato_frags as soon
as the first frag got enqueued to the skb without marking the packet
as SKB_GSO_UDP.
The problem with only two frames for ipv6 was fixed by "ipv6: udp
packets following an UFO enqueued packet need also be handled by UFO"
(2811ebac2521ceac84f2bdae402455baa6a7fb47).
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit 90c6bd34f884cd9cee21f1d152baf6c18bcac949 ]
In the case of credentials passing in unix stream sockets (dgram
sockets seem not affected), we get a rather sparse race after
commit 16e5726 ("af_unix: dont send SCM_CREDENTIALS by default").
We have a stream server on receiver side that requests credential
passing from senders (e.g. nc -U). Since we need to set SO_PASSCRED
on each spawned/accepted socket on server side to 1 first (as it's
not inherited), it can happen that in the time between accept() and
setsockopt() we get interrupted, the sender is being scheduled and
continues with passing data to our receiver. At that time SO_PASSCRED
is neither set on sender nor receiver side, hence in cmsg's
SCM_CREDENTIALS we get eventually pid:0, uid:65534, gid:65534
(== overflow{u,g}id) instead of what we actually would like to see.
On the sender side, here nc -U, the tests in maybe_add_creds()
invoked through unix_stream_sendmsg() would fail, as at that exact
time, as mentioned, the sender has neither SO_PASSCRED on his side
nor sees it on the server side, and we have a valid 'other' socket
in place. Thus, sender believes it would just look like a normal
connection, not needing/requesting SO_PASSCRED at that time.
As reverting 16e5726 would not be an option due to the significant
performance regression reported when having creds always passed,
one way/trade-off to prevent that would be to set SO_PASSCRED on
the listener socket and allow inheriting these flags to the spawned
socket on server side in accept(). It seems also logical to do so
if we'd tell the listener socket to pass those flags onwards, and
would fix the race.
Before, strace:
recvmsg(4, {msg_name(0)=NULL, msg_iov(1)=[{"blub\n", 4096}],
msg_controllen=32, {cmsg_len=28, cmsg_level=SOL_SOCKET,
cmsg_type=SCM_CREDENTIALS{pid=0, uid=65534, gid=65534}},
msg_flags=0}, 0) = 5
After, strace:
recvmsg(4, {msg_name(0)=NULL, msg_iov(1)=[{"blub\n", 4096}],
msg_controllen=32, {cmsg_len=28, cmsg_level=SOL_SOCKET,
cmsg_type=SCM_CREDENTIALS{pid=11580, uid=1000, gid=1000}},
msg_flags=0}, 0) = 5
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit d2dbbba77e95dff4b4f901fee236fef6d9552072 ]
IP/IPv6 fragmentation knows how to compute only TCP/UDP checksum.
This causes problems if SCTP packets has to be fragmented and
ipsummed has been set to PARTIAL due to checksum offload support.
This condition can happen when retransmitting after MTU discover,
or when INIT or other control chunks are larger then MTU.
Check for the rare fragmentation condition in SCTP and use software
checksum calculation in this case.
CC: Fan Du <fan.du@windriver.com>
Signed-off-by: Vlad Yasevich <vyasevich@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit 27127a82561a2a3ed955ce207048e1b066a80a2a ]
igb/ixgbe have hardware sctp checksum support, when this feature is enabled
and also IPsec is armed to protect sctp traffic, ugly things happened as
xfrm_output checks CHECKSUM_PARTIAL to do checksum operation(sum every thing
up and pack the 16bits result in the checksum field). The result is fail
establishment of sctp communication.
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Fan Du <fan.du@windriver.com>
Signed-off-by: Vlad Yasevich <vyasevich@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit 455cc32bf128e114455d11ad919321ab89a2c312 ]
François Cachereul made a very nice bug report and suspected
the bh_lock_sock() / bh_unlok_sock() pair used in l2tp_xmit_skb() from
process context was not good.
This problem was added by commit 6af88da14ee284aaad6e4326da09a89191ab6165
("l2tp: Fix locking in l2tp_core.c").
l2tp_eth_dev_xmit() runs from BH context, so we must disable BH
from other l2tp_xmit_skb() users.
[ 452.060011] BUG: soft lockup - CPU#1 stuck for 23s! [accel-pppd:6662]
[ 452.061757] Modules linked in: l2tp_ppp l2tp_netlink l2tp_core pppoe pppox
ppp_generic slhc ipv6 ext3 mbcache jbd virtio_balloon xfs exportfs dm_mod
virtio_blk ata_generic virtio_net floppy ata_piix libata virtio_pci virtio_ring virtio [last unloaded: scsi_wait_scan]
[ 452.064012] CPU 1
[ 452.080015] BUG: soft lockup - CPU#2 stuck for 23s! [accel-pppd:6643]
[ 452.080015] CPU 2
[ 452.080015]
[ 452.080015] Pid: 6643, comm: accel-pppd Not tainted 3.2.46.mini #1 Bochs Bochs
[ 452.080015] RIP: 0010:[<ffffffff81059f6c>] [<ffffffff81059f6c>] do_raw_spin_lock+0x17/0x1f
[ 452.080015] RSP: 0018:ffff88007125fc18 EFLAGS: 00000293
[ 452.080015] RAX: 000000000000aba9 RBX: ffffffff811d0703 RCX: 0000000000000000
[ 452.080015] RDX: 00000000000000ab RSI: ffff8800711f6896 RDI: ffff8800745c8110
[ 452.080015] RBP: ffff88007125fc18 R08: 0000000000000020 R09: 0000000000000000
[ 452.080015] R10: 0000000000000000 R11: 0000000000000280 R12: 0000000000000286
[ 452.080015] R13: 0000000000000020 R14: 0000000000000240 R15: 0000000000000000
[ 452.080015] FS: 00007fdc0cc24700(0000) GS:ffff8800b6f00000(0000) knlGS:0000000000000000
[ 452.080015] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 452.080015] CR2: 00007fdb054899b8 CR3: 0000000074404000 CR4: 00000000000006a0
[ 452.080015] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 452.080015] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 452.080015] Process accel-pppd (pid: 6643, threadinfo ffff88007125e000, task ffff8800b27e6dd0)
[ 452.080015] Stack:
[ 452.080015] ffff88007125fc28 ffffffff81256559 ffff88007125fc98 ffffffffa01b2bd1
[ 452.080015] ffff88007125fc58 000000000000000c 00000000029490d0 0000009c71dbe25e
[ 452.080015] 000000000000005c 000000080000000e 0000000000000000 ffff880071170600
[ 452.080015] Call Trace:
[ 452.080015] [<ffffffff81256559>] _raw_spin_lock+0xe/0x10
[ 452.080015] [<ffffffffa01b2bd1>] l2tp_xmit_skb+0x189/0x4ac [l2tp_core]
[ 452.080015] [<ffffffffa01c2d36>] pppol2tp_sendmsg+0x15e/0x19c [l2tp_ppp]
[ 452.080015] [<ffffffff811c7872>] __sock_sendmsg_nosec+0x22/0x24
[ 452.080015] [<ffffffff811c83bd>] sock_sendmsg+0xa1/0xb6
[ 452.080015] [<ffffffff81254e88>] ? __schedule+0x5c1/0x616
[ 452.080015] [<ffffffff8103c7c6>] ? __dequeue_signal+0xb7/0x10c
[ 452.080015] [<ffffffff810bbd21>] ? fget_light+0x75/0x89
[ 452.080015] [<ffffffff811c8444>] ? sockfd_lookup_light+0x20/0x56
[ 452.080015] [<ffffffff811c9b34>] sys_sendto+0x10c/0x13b
[ 452.080015] [<ffffffff8125cac2>] system_call_fastpath+0x16/0x1b
[ 452.080015] Code: 81 48 89 e5 72 0c 31 c0 48 81 ff 45 66 25 81 0f 92 c0 5d c3 55 b8 00 01 00 00 48 89 e5 f0 66 0f c1 07 0f b6 d4 38 d0 74 06 f3 90 <8a> 07 eb f6 5d c3 90 90 55 48 89 e5 9c 58 0f 1f 44 00 00 5d c3
[ 452.080015] Call Trace:
[ 452.080015] [<ffffffff81256559>] _raw_spin_lock+0xe/0x10
[ 452.080015] [<ffffffffa01b2bd1>] l2tp_xmit_skb+0x189/0x4ac [l2tp_core]
[ 452.080015] [<ffffffffa01c2d36>] pppol2tp_sendmsg+0x15e/0x19c [l2tp_ppp]
[ 452.080015] [<ffffffff811c7872>] __sock_sendmsg_nosec+0x22/0x24
[ 452.080015] [<ffffffff811c83bd>] sock_sendmsg+0xa1/0xb6
[ 452.080015] [<ffffffff81254e88>] ? __schedule+0x5c1/0x616
[ 452.080015] [<ffffffff8103c7c6>] ? __dequeue_signal+0xb7/0x10c
[ 452.080015] [<ffffffff810bbd21>] ? fget_light+0x75/0x89
[ 452.080015] [<ffffffff811c8444>] ? sockfd_lookup_light+0x20/0x56
[ 452.080015] [<ffffffff811c9b34>] sys_sendto+0x10c/0x13b
[ 452.080015] [<ffffffff8125cac2>] system_call_fastpath+0x16/0x1b
[ 452.064012]
[ 452.064012] Pid: 6662, comm: accel-pppd Not tainted 3.2.46.mini #1 Bochs Bochs
[ 452.064012] RIP: 0010:[<ffffffff81059f6e>] [<ffffffff81059f6e>] do_raw_spin_lock+0x19/0x1f
[ 452.064012] RSP: 0018:ffff8800b6e83ba0 EFLAGS: 00000297
[ 452.064012] RAX: 000000000000aaa9 RBX: ffff8800b6e83b40 RCX: 0000000000000002
[ 452.064012] RDX: 00000000000000aa RSI: 000000000000000a RDI: ffff8800745c8110
[ 452.064012] RBP: ffff8800b6e83ba0 R08: 000000000000c802 R09: 000000000000001c
[ 452.064012] R10: ffff880071096c4e R11: 0000000000000006 R12: ffff8800b6e83b18
[ 452.064012] R13: ffffffff8125d51e R14: ffff8800b6e83ba0 R15: ffff880072a589c0
[ 452.064012] FS: 00007fdc0b81e700(0000) GS:ffff8800b6e80000(0000) knlGS:0000000000000000
[ 452.064012] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 452.064012] CR2: 0000000000625208 CR3: 0000000074404000 CR4: 00000000000006a0
[ 452.064012] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 452.064012] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 452.064012] Process accel-pppd (pid: 6662, threadinfo ffff88007129a000, task ffff8800744f7410)
[ 452.064012] Stack:
[ 452.064012] ffff8800b6e83bb0 ffffffff81256559 ffff8800b6e83bc0 ffffffff8121c64a
[ 452.064012] ffff8800b6e83bf0 ffffffff8121ec7a ffff880072a589c0 ffff880071096c62
[ 452.064012] 0000000000000011 ffffffff81430024 ffff8800b6e83c80 ffffffff8121f276
[ 452.064012] Call Trace:
[ 452.064012] <IRQ>
[ 452.064012] [<ffffffff81256559>] _raw_spin_lock+0xe/0x10
[ 452.064012] [<ffffffff8121c64a>] spin_lock+0x9/0xb
[ 452.064012] [<ffffffff8121ec7a>] udp_queue_rcv_skb+0x186/0x269
[ 452.064012] [<ffffffff8121f276>] __udp4_lib_rcv+0x297/0x4ae
[ 452.064012] [<ffffffff8121c178>] ? raw_rcv+0xe9/0xf0
[ 452.064012] [<ffffffff8121f4a7>] udp_rcv+0x1a/0x1c
[ 452.064012] [<ffffffff811fe385>] ip_local_deliver_finish+0x12b/0x1a5
[ 452.064012] [<ffffffff811fe54e>] ip_local_deliver+0x53/0x84
[ 452.064012] [<ffffffff811fe1d0>] ip_rcv_finish+0x2bc/0x2f3
[ 452.064012] [<ffffffff811fe78f>] ip_rcv+0x210/0x269
[ 452.064012] [<ffffffff8101911e>] ? kvm_clock_get_cycles+0x9/0xb
[ 452.064012] [<ffffffff811d88cd>] __netif_receive_skb+0x3a5/0x3f7
[ 452.064012] [<ffffffff811d8eba>] netif_receive_skb+0x57/0x5e
[ 452.064012] [<ffffffff811cf30f>] ? __netdev_alloc_skb+0x1f/0x3b
[ 452.064012] [<ffffffffa0049126>] virtnet_poll+0x4ba/0x5a4 [virtio_net]
[ 452.064012] [<ffffffff811d9417>] net_rx_action+0x73/0x184
[ 452.064012] [<ffffffffa01b2cc2>] ? l2tp_xmit_skb+0x27a/0x4ac [l2tp_core]
[ 452.064012] [<ffffffff810343b9>] __do_softirq+0xc3/0x1a8
[ 452.064012] [<ffffffff81013b56>] ? ack_APIC_irq+0x10/0x12
[ 452.064012] [<ffffffff81256559>] ? _raw_spin_lock+0xe/0x10
[ 452.064012] [<ffffffff8125e0ac>] call_softirq+0x1c/0x26
[ 452.064012] [<ffffffff81003587>] do_softirq+0x45/0x82
[ 452.064012] [<ffffffff81034667>] irq_exit+0x42/0x9c
[ 452.064012] [<ffffffff8125e146>] do_IRQ+0x8e/0xa5
[ 452.064012] [<ffffffff8125676e>] common_interrupt+0x6e/0x6e
[ 452.064012] <EOI>
[ 452.064012] [<ffffffff810b82a1>] ? kfree+0x8a/0xa3
[ 452.064012] [<ffffffffa01b2cc2>] ? l2tp_xmit_skb+0x27a/0x4ac [l2tp_core]
[ 452.064012] [<ffffffffa01b2c25>] ? l2tp_xmit_skb+0x1dd/0x4ac [l2tp_core]
[ 452.064012] [<ffffffffa01c2d36>] pppol2tp_sendmsg+0x15e/0x19c [l2tp_ppp]
[ 452.064012] [<ffffffff811c7872>] __sock_sendmsg_nosec+0x22/0x24
[ 452.064012] [<ffffffff811c83bd>] sock_sendmsg+0xa1/0xb6
[ 452.064012] [<ffffffff81254e88>] ? __schedule+0x5c1/0x616
[ 452.064012] [<ffffffff8103c7c6>] ? __dequeue_signal+0xb7/0x10c
[ 452.064012] [<ffffffff810bbd21>] ? fget_light+0x75/0x89
[ 452.064012] [<ffffffff811c8444>] ? sockfd_lookup_light+0x20/0x56
[ 452.064012] [<ffffffff811c9b34>] sys_sendto+0x10c/0x13b
[ 452.064012] [<ffffffff8125cac2>] system_call_fastpath+0x16/0x1b
[ 452.064012] Code: 89 e5 72 0c 31 c0 48 81 ff 45 66 25 81 0f 92 c0 5d c3 55 b8 00 01 00 00 48 89 e5 f0 66 0f c1 07 0f b6 d4 38 d0 74 06 f3 90 8a 07 <eb> f6 5d c3 90 90 55 48 89 e5 9c 58 0f 1f 44 00 00 5d c3 55 48
[ 452.064012] Call Trace:
[ 452.064012] <IRQ> [<ffffffff81256559>] _raw_spin_lock+0xe/0x10
[ 452.064012] [<ffffffff8121c64a>] spin_lock+0x9/0xb
[ 452.064012] [<ffffffff8121ec7a>] udp_queue_rcv_skb+0x186/0x269
[ 452.064012] [<ffffffff8121f276>] __udp4_lib_rcv+0x297/0x4ae
[ 452.064012] [<ffffffff8121c178>] ? raw_rcv+0xe9/0xf0
[ 452.064012] [<ffffffff8121f4a7>] udp_rcv+0x1a/0x1c
[ 452.064012] [<ffffffff811fe385>] ip_local_deliver_finish+0x12b/0x1a5
[ 452.064012] [<ffffffff811fe54e>] ip_local_deliver+0x53/0x84
[ 452.064012] [<ffffffff811fe1d0>] ip_rcv_finish+0x2bc/0x2f3
[ 452.064012] [<ffffffff811fe78f>] ip_rcv+0x210/0x269
[ 452.064012] [<ffffffff8101911e>] ? kvm_clock_get_cycles+0x9/0xb
[ 452.064012] [<ffffffff811d88cd>] __netif_receive_skb+0x3a5/0x3f7
[ 452.064012] [<ffffffff811d8eba>] netif_receive_skb+0x57/0x5e
[ 452.064012] [<ffffffff811cf30f>] ? __netdev_alloc_skb+0x1f/0x3b
[ 452.064012] [<ffffffffa0049126>] virtnet_poll+0x4ba/0x5a4 [virtio_net]
[ 452.064012] [<ffffffff811d9417>] net_rx_action+0x73/0x184
[ 452.064012] [<ffffffffa01b2cc2>] ? l2tp_xmit_skb+0x27a/0x4ac [l2tp_core]
[ 452.064012] [<ffffffff810343b9>] __do_softirq+0xc3/0x1a8
[ 452.064012] [<ffffffff81013b56>] ? ack_APIC_irq+0x10/0x12
[ 452.064012] [<ffffffff81256559>] ? _raw_spin_lock+0xe/0x10
[ 452.064012] [<ffffffff8125e0ac>] call_softirq+0x1c/0x26
[ 452.064012] [<ffffffff81003587>] do_softirq+0x45/0x82
[ 452.064012] [<ffffffff81034667>] irq_exit+0x42/0x9c
[ 452.064012] [<ffffffff8125e146>] do_IRQ+0x8e/0xa5
[ 452.064012] [<ffffffff8125676e>] common_interrupt+0x6e/0x6e
[ 452.064012] <EOI> [<ffffffff810b82a1>] ? kfree+0x8a/0xa3
[ 452.064012] [<ffffffffa01b2cc2>] ? l2tp_xmit_skb+0x27a/0x4ac [l2tp_core]
[ 452.064012] [<ffffffffa01b2c25>] ? l2tp_xmit_skb+0x1dd/0x4ac [l2tp_core]
[ 452.064012] [<ffffffffa01c2d36>] pppol2tp_sendmsg+0x15e/0x19c [l2tp_ppp]
[ 452.064012] [<ffffffff811c7872>] __sock_sendmsg_nosec+0x22/0x24
[ 452.064012] [<ffffffff811c83bd>] sock_sendmsg+0xa1/0xb6
[ 452.064012] [<ffffffff81254e88>] ? __schedule+0x5c1/0x616
[ 452.064012] [<ffffffff8103c7c6>] ? __dequeue_signal+0xb7/0x10c
[ 452.064012] [<ffffffff810bbd21>] ? fget_light+0x75/0x89
[ 452.064012] [<ffffffff811c8444>] ? sockfd_lookup_light+0x20/0x56
[ 452.064012] [<ffffffff811c9b34>] sys_sendto+0x10c/0x13b
[ 452.064012] [<ffffffff8125cac2>] system_call_fastpath+0x16/0x1b
Reported-by: François Cachereul <f.cachereul@alphalink.fr>
Tested-by: François Cachereul <f.cachereul@alphalink.fr>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit c33a39c575068c2ea9bffb22fd6de2df19c74b89 ]
This patch fixes the calculation of the nlmsg size, by adding the missing
nla_total_size().
Cc: Patrick McHardy <kaber@trash.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
This patch is based on 3.2.y branch, the one used by reporter. Please let me
know if it should be different. Thanks.
The patch which introduced the regression was applied on stables:
3.0.64 3.4.31 3.7.8 3.2.39
The patch which introduced the regression was for stable trees only.
---8<---
Commit 0d6a77079c475033cb622c07c5a880b392ef664e "ipv6: do not create
neighbor entries for local delivery" introduced a regression on
which routes to local delivery would not work anymore. Like this:
$ ip -6 route add local 2001::/64 dev lo
$ ping6 -c1 2001::9
PING 2001::9(2001::9) 56 data bytes
ping: sendmsg: Invalid argument
As this is a local delivery, that commit would not allow the creation of a
neighbor entry and thus the packet cannot be sent.
But as TPROXY scenario actually needs to avoid the neighbor entry creation only
for input flow, this patch now limits previous patch to input flow, keeping
output as before that patch.
Reported-by: Debabrata Banerjee <dbavatar@gmail.com>
Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
CC: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit 0a7e22609067ff524fc7bbd45c6951dd08561667 ]
When sending out multicast messages, the source address in inet->mc_addr is
ignored and rewritten by an autoselected one. This is caused by a typo in
commit 813b3b5db831 ("ipv4: Use caller's on-stack flowi as-is in output
route lookups").
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit 1661bf364ae9c506bc8795fef70d1532931be1e8 ]
We need to cap ->msg_namelen or it leads to a buffer overflow when we
to the memcpy() in __audit_sockaddr(). It requires CAP_AUDIT_CONTROL to
exploit this bug.
The call tree is:
___sys_recvmsg()
move_addr_to_user()
audit_sockaddr()
__audit_sockaddr()
Reported-by: Jüri Aedla <juri.aedla@gmail.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit 80ad1d61e72d626e30ebe8529a0455e660ca4693 ]
commit 3ab5aee7fe84 ("net: Convert TCP & DCCP hash tables to use RCU /
hlist_nulls") incorrectly used sock_put() on TIMEWAIT sockets.
We should instead use inet_twsk_put()
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit 5e8a402f831dbe7ee831340a91439e46f0d38acd ]
Yuchung found following problem :
There are bugs in the SACK processing code, merging part in
tcp_shift_skb_data(), that incorrectly resets or ignores the sacked
skbs FIN flag. When a receiver first SACK the FIN sequence, and later
throw away ofo queue (e.g., sack-reneging), the sender will stop
retransmitting the FIN flag, and hangs forever.
Following packetdrill test can be used to reproduce the bug.
$ cat sack-merge-bug.pkt
`sysctl -q net.ipv4.tcp_fack=0`
// Establish a connection and send 10 MSS.
0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
+.000 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
+.000 bind(3, ..., ...) = 0
+.000 listen(3, 1) = 0
+.050 < S 0:0(0) win 32792 <mss 1000,sackOK,nop,nop,nop,wscale 7>
+.000 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 6>
+.001 < . 1:1(0) ack 1 win 1024
+.000 accept(3, ..., ...) = 4
+.100 write(4, ..., 12000) = 12000
+.000 shutdown(4, SHUT_WR) = 0
+.000 > . 1:10001(10000) ack 1
+.050 < . 1:1(0) ack 2001 win 257
+.000 > FP. 10001:12001(2000) ack 1
+.050 < . 1:1(0) ack 2001 win 257 <sack 10001:11001,nop,nop>
+.050 < . 1:1(0) ack 2001 win 257 <sack 10001:12002,nop,nop>
// SACK reneg
+.050 < . 1:1(0) ack 12001 win 257
+0 %{ print "unacked: ",tcpi_unacked }%
+5 %{ print "" }%
First, a typo inverted left/right of one OR operation, then
code forgot to advance end_seq if the merged skb carried FIN.
Bug was added in 2.6.29 by commit 832d11c5cd076ab
("tcp: Try to restore large SKBs while SACK processing")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Acked-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit c52e2421f7368fd36cbe330d2cf41b10452e39a9 ]
TCP stack should make sure it owns skbs before mangling them.
We had various crashes using bnx2x, and it turned out gso_size
was cleared right before bnx2x driver was populating TC descriptor
of the _previous_ packet send. TCP stack can sometime retransmit
packets that are still in Qdisc.
Of course we could make bnx2x driver more robust (using
ACCESS_ONCE(shinfo->gso_size) for example), but the bug is TCP stack.
We have identified two points where skb_unclone() was needed.
This patch adds a WARN_ON_ONCE() to warn us if we missed another
fix of this kind.
Kudos to Neal for finding the root cause of this bug. Its visible
using small MSS.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
This reverts commit de77b7955c3985ca95f64af3cb10557eb17eacee, which was
commit f6e80abeab928b7c47cc1fbf53df13b4398a2bec upstream.
This fix was only appropriate for Linux 3.7 onward, and introduced a
regression when applied to earlier versions.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit 9260d3e1013701aa814d10c8fc6a9f92bd17d643 ]
It is possible for the timer handlers to run after the call to
ipv6_mc_down so use in6_dev_put instead of __in6_dev_put in the
handler function in order to do proper cleanup when the refcnt
reaches 0. Otherwise, the refcnt can reach zero without the
inet6_dev being destroyed and we end up leaking a reference to
the net_device and see messages like the following,
unregister_netdevice: waiting for eth0 to become free. Usage count = 1
Tested on linux-3.4.43.
Signed-off-by: Salam Noureddine <noureddine@aristanetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit e2401654dd0f5f3fb7a8d80dad9554d73d7ca394 ]
It is possible for the timer handlers to run after the call to
ip_mc_down so use in_dev_put instead of __in_dev_put in the handler
function in order to do proper cleanup when the refcnt reaches 0.
Otherwise, the refcnt can reach zero without the in_device being
destroyed and we end up leaking a reference to the net_device and
see messages like the following,
unregister_netdevice: waiting for eth0 to become free. Usage count = 1
Tested on linux-3.4.43.
Signed-off-by: Salam Noureddine <noureddine@aristanetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit 2811ebac2521ceac84f2bdae402455baa6a7fb47 ]
In the following scenario the socket is corked:
If the first UDP packet is larger then the mtu we try to append it to the
write queue via ip6_ufo_append_data. A following packet, which is smaller
than the mtu would be appended to the already queued up gso-skb via
plain ip6_append_data. This causes random memory corruptions.
In ip6_ufo_append_data we also have to be careful to not queue up the
same skb multiple times. So setup the gso frame only when no first skb
is available.
This also fixes a shortcoming where we add the current packet's length to
cork->length but return early because of a packet > mtu with dontfrag set
(instead of sutracting it again).
Found with trinity.
Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit 703133de331a7a7df47f31fb9de51dc6f68a9de8 ]
If local fragmentation is allowed, then ip_select_ident() and
ip_select_ident_more() need to generate unique IDs to ensure
correct defragmentation on the peer.
For example, if IPsec (tunnel mode) has to encrypt large skbs
that have local_df bit set, then all IP fragments that belonged
to different ESP datagrams would have used the same identificator.
If one of these IP fragments would get lost or reordered, then
peer could possibly stitch together wrong IP fragments that did
not belong to the same datagram. This would lead to a packet loss
or data corruption.
Signed-off-by: Ansis Atteka <aatteka@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit 9a0620133ccce9dd35c00a96405c8d80938c2cc0 ]
This changes the message_age_timer calculation to use the BPDU's max age as
opposed to the local bridge's max age. This is in accordance with section
8.6.2.3.2 Step 2 of the 802.1D-1998 sprecification.
With the current implementation, when running with very large bridge
diameters, convergance will not always occur even if a root bridge is
configured to have a longer max age.
Tested successfully on bridge diameters of ~200.
Signed-off-by: Chris Healy <cphealy@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit 95ee62083cb6453e056562d91f597552021e6ae7 ]
Alan Chester reported an issue with IPv6 on SCTP that IPsec traffic is not
being encrypted, whereas on IPv4 it is. Setting up an AH + ESP transport
does not seem to have the desired effect:
SCTP + IPv4:
22:14:20.809645 IP (tos 0x2,ECT(0), ttl 64, id 0, offset 0, flags [DF], proto AH (51), length 116)
192.168.0.2 > 192.168.0.5: AH(spi=0x00000042,sumlen=16,seq=0x1): ESP(spi=0x00000044,seq=0x1), length 72
22:14:20.813270 IP (tos 0x2,ECT(0), ttl 64, id 0, offset 0, flags [DF], proto AH (51), length 340)
192.168.0.5 > 192.168.0.2: AH(spi=0x00000043,sumlen=16,seq=0x1):
SCTP + IPv6:
22:31:19.215029 IP6 (class 0x02, hlim 64, next-header SCTP (132) payload length: 364)
fe80::222:15ff:fe87:7fc.3333 > fe80::92e6:baff:fe0d:5a54.36767: sctp
1) [INIT ACK] [init tag: 747759530] [rwnd: 62464] [OS: 10] [MIS: 10]
Moreover, Alan says:
This problem was seen with both Racoon and Racoon2. Other people have seen
this with OpenSwan. When IPsec is configured to encrypt all upper layer
protocols the SCTP connection does not initialize. After using Wireshark to
follow packets, this is because the SCTP packet leaves Box A unencrypted and
Box B believes all upper layer protocols are to be encrypted so it drops
this packet, causing the SCTP connection to fail to initialize. When IPsec
is configured to encrypt just SCTP, the SCTP packets are observed unencrypted.
In fact, using `socat sctp6-listen:3333 -` on one end and transferring "plaintext"
string on the other end, results in cleartext on the wire where SCTP eventually
does not report any errors, thus in the latter case that Alan reports, the
non-paranoid user might think he's communicating over an encrypted transport on
SCTP although he's not (tcpdump ... -X):
...
0x0030: 5d70 8e1a 0003 001a 177d eb6c 0000 0000 ]p.......}.l....
0x0040: 0000 0000 706c 6169 6e74 6578 740a 0000 ....plaintext...
Only in /proc/net/xfrm_stat we can see XfrmInTmplMismatch increasing on the
receiver side. Initial follow-up analysis from Alan's bug report was done by
Alexey Dobriyan. Also thanks to Vlad Yasevich for feedback on this.
SCTP has its own implementation of sctp_v6_xmit() not calling inet6_csk_xmit().
This has the implication that it probably never really got updated along with
changes in inet6_csk_xmit() and therefore does not seem to invoke xfrm handlers.
SCTP's IPv4 xmit however, properly calls ip_queue_xmit() to do the work. Since
a call to inet6_csk_xmit() would solve this problem, but result in unecessary
route lookups, let us just use the cached flowi6 instead that we got through
sctp_v6_get_dst(). Since all SCTP packets are being sent through sctp_packet_transmit(),
we do the route lookup / flow caching in sctp_transport_route(), hold it in
tp->dst and skb_dst_set() right after that. If we would alter fl6->daddr in
sctp_v6_xmit() to np->opt->srcrt, we possibly could run into the same effect
of not having xfrm layer pick it up, hence, use fl6_update_dst() in sctp_v6_get_dst()
instead to get the correct source routed dst entry, which we assign to the skb.
Also source address routing example from 625034113 ("sctp: fix sctp to work with
ipv6 source address routing") still works with this patch! Nevertheless, in RFC5095
it is actually 'recommended' to not use that anyway due to traffic amplification [1].
So it seems we're not supposed to do that anyway in sctp_v6_xmit(). Moreover, if
we overwrite the flow destination here, the lower IPv6 layer will be unable to
put the correct destination address into IP header, as routing header is added in
ipv6_push_nfrag_opts() but then probably with wrong final destination. Things aside,
result of this patch is that we do not have any XfrmInTmplMismatch increase plus on
the wire with this patch it now looks like:
SCTP + IPv6:
08:17:47.074080 IP6 2620:52:0:102f:7a2b:cbff:fe27:1b0a > 2620:52:0:102f:213:72ff:fe32:7eba:
AH(spi=0x00005fb4,seq=0x1): ESP(spi=0x00005fb5,seq=0x1), length 72
08:17:47.074264 IP6 2620:52:0:102f:213:72ff:fe32:7eba > 2620:52:0:102f:7a2b:cbff:fe27:1b0a:
AH(spi=0x00003d54,seq=0x1): ESP(spi=0x00003d55,seq=0x1), length 296
This fixes Kernel Bugzilla 24412. This security issue seems to be present since
2.6.18 kernels. Lets just hope some big passive adversary in the wild didn't have
its fun with that. lksctp-tools IPv6 regression test suite passes as well with
this patch.
[1] http://www.secdev.org/conf/IPv6_RH_security-csw07.pdf
Reported-by: Alan Chester <alan.chester@tekelec.com>
Reported-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit d0fe8c888b1fd1a2f84b9962cabcb98a70988aec ]
I've been hitting a NULL ptr deref while using netconsole because the
np->dev check and the pointer manipulation in netpoll_cleanup are done
without rtnl and the following sequence happens when having a netconsole
over a vlan and we remove the vlan while disabling the netconsole:
CPU 1 CPU2
removes vlan and calls the notifier
enters store_enabled(), calls
netdev_cleanup which checks np->dev
and then waits for rtnl
executes the netconsole netdev
release notifier making np->dev
== NULL and releases rtnl
continues to dereference a member of
np->dev which at this point is == NULL
Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit 88362ad8f9a6cea787420b57cc27ccacef000dbe ]
This was originally reported in [1] and posted by Neil Horman [2], he said:
Fix up a missed null pointer check in the asconf code. If we don't find
a local address, but we pass in an address length of more than 1, we may
dereference a NULL laddr pointer. Currently this can't happen, as the only
users of the function pass in the value 1 as the addrcnt parameter, but
its not hot path, and it doesn't hurt to check for NULL should that ever
be the case.
The callpath from sctp_asconf_mgmt() looks okay. But this could be triggered
from sctp_setsockopt_bindx() call with SCTP_BINDX_REM_ADDR and addrcnt > 1
while passing all possible addresses from the bind list to SCTP_BINDX_REM_ADDR
so that we do *not* find a single address in the association's bind address
list that is not in the packed array of addresses. If this happens when we
have an established association with ASCONF-capable peers, then we could get
a NULL pointer dereference as we only check for laddr == NULL && addrcnt == 1
and call later sctp_make_asconf_update_ip() with NULL laddr.
BUT: this actually won't happen as sctp_bindx_rem() will catch such a case
and return with an error earlier. As this is incredably unintuitive and error
prone, add a check to catch at least future bugs here. As Neil says, its not
hot path. Introduced by 8a07eb0a5 ("sctp: Add ASCONF operation on the
single-homed host").
[1] http://www.spinics.net/lists/linux-sctp/msg02132.html
[2] http://www.spinics.net/lists/linux-sctp/msg02133.html
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Michio Honda <micchie@sfc.wide.ad.jp>
Acked-By: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit 0c1db731bfcf3a9fd6c58132134f8b0f423552f0 ]
The indentation here implies this was meant to be a multi-line if.
Introduced several years back in commit c85c2951d4da1236e32f1858db418221e624aba5
("caif: Handle dev_queue_xmit errors.")
Signed-off-by: Dave Jones <davej@fedoraproject.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 73d9f7eef3d98c3920e144797cc1894c6b005a1e upstream.
For nofail == false request, if __map_request failed, the caller does
cleanup work, like releasing the relative pages. It doesn't make any sense
to retry this request.
Signed-off-by: Jianpeng Ma <majianpeng@gmail.com>
Reviewed-by: Sage Weil <sage@inktank.com>
[bwh: Backported to 3.2: adjust indentation]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit 55432d2b543a4b6dfae54f5c432a566877a85d90 ]
commit 5faa5df1fa2024 (inetpeer: Invalidate the inetpeer tree along with
the routing cache) added a race :
Before freeing an inetpeer, we must respect a RCU grace period, and make
sure no user will attempt to increase refcnt.
inetpeer_invalidate_tree() waits for a RCU grace period before inserting
inetpeer tree into gc_list and waking the worker. At that time, no
concurrent lookup can find a inetpeer in this tree.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Acked-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit 5faa5df1fa2024bd750089ff21dcc4191798263d ]
We initialize the routing metrics with the values cached on the
inetpeer in rt_init_metrics(). So if we have the metrics cached on the
inetpeer, we ignore the user configured fib_metrics.
To fix this issue, we replace the old tree with a fresh initialized
inet_peer_base. The old tree is removed later with a delayed work queue.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit 4225a398c1352a7a5c14dc07277cb5cc4473983b ]
When the lockdep validator is enabled, it will report the below
warning when we enable a TIPC bearer:
[ INFO: possible irq lock inversion dependency detected ]
---------------------------------------------------------
Possible interrupt unsafe locking scenario:
CPU0 CPU1
---- ----
lock(ptype_lock);
local_irq_disable();
lock(tipc_net_lock);
lock(ptype_lock);
<Interrupt>
lock(tipc_net_lock);
*** DEADLOCK ***
the shortest dependencies between 2nd lock and 1st lock:
-> (ptype_lock){+.+...} ops: 10 {
[...]
SOFTIRQ-ON-W at:
[<c1089418>] __lock_acquire+0x528/0x13e0
[<c108a360>] lock_acquire+0x90/0x100
[<c1553c38>] _raw_spin_lock+0x38/0x50
[<c14651ca>] dev_add_pack+0x3a/0x60
[<c182da75>] arp_init+0x1a/0x48
[<c182dce5>] inet_init+0x181/0x27e
[<c1001114>] do_one_initcall+0x34/0x170
[<c17f7329>] kernel_init+0x110/0x1b2
[<c155b6a2>] kernel_thread_helper+0x6/0x10
[...]
... key at: [<c17e4b10>] ptype_lock+0x10/0x20
... acquired at:
[<c108a360>] lock_acquire+0x90/0x100
[<c1553c38>] _raw_spin_lock+0x38/0x50
[<c14651ca>] dev_add_pack+0x3a/0x60
[<c8bc18d2>] enable_bearer+0xf2/0x140 [tipc]
[<c8bb283a>] tipc_enable_bearer+0x1ba/0x450 [tipc]
[<c8bb3a04>] tipc_cfg_do_cmd+0x5c4/0x830 [tipc]
[<c8bbc032>] handle_cmd+0x42/0xd0 [tipc]
[<c148e802>] genl_rcv_msg+0x232/0x280
[<c148d3f6>] netlink_rcv_skb+0x86/0xb0
[<c148e5bc>] genl_rcv+0x1c/0x30
[<c148d144>] netlink_unicast+0x174/0x1f0
[<c148ddab>] netlink_sendmsg+0x1eb/0x2d0
[<c1456bc1>] sock_aio_write+0x161/0x170
[<c1135a7c>] do_sync_write+0xac/0xf0
[<c11360f6>] vfs_write+0x156/0x170
[<c11361e2>] sys_write+0x42/0x70
[<c155b0df>] sysenter_do_call+0x12/0x38
[...]
}
-> (tipc_net_lock){+..-..} ops: 4 {
[...]
IN-SOFTIRQ-R at:
[<c108953a>] __lock_acquire+0x64a/0x13e0
[<c108a360>] lock_acquire+0x90/0x100
[<c15541cd>] _raw_read_lock_bh+0x3d/0x50
[<c8bb874d>] tipc_recv_msg+0x1d/0x830 [tipc]
[<c8bc195f>] recv_msg+0x3f/0x50 [tipc]
[<c146a5fa>] __netif_receive_skb+0x22a/0x590
[<c146ab0b>] netif_receive_skb+0x2b/0xf0
[<c13c43d2>] pcnet32_poll+0x292/0x780
[<c146b00a>] net_rx_action+0xfa/0x1e0
[<c103a4be>] __do_softirq+0xae/0x1e0
[...]
}
>From the log, we can see three different call chains between
CPU0 and CPU1:
Time 0 on CPU0:
kernel_init()->inet_init()->dev_add_pack()
At time 0, the ptype_lock is held by CPU0 in dev_add_pack();
Time 1 on CPU1:
tipc_enable_bearer()->enable_bearer()->dev_add_pack()
At time 1, tipc_enable_bearer() first holds tipc_net_lock, and then
wants to take ptype_lock to register TIPC protocol handler into the
networking stack. But the ptype_lock has been taken by dev_add_pack()
on CPU0, so at this time the dev_add_pack() running on CPU1 has to be
busy looping.
Time 2 on CPU0:
netif_receive_skb()->recv_msg()->tipc_recv_msg()
At time 2, an incoming TIPC packet arrives at CPU0, hence
tipc_recv_msg() will be invoked. In tipc_recv_msg(), it first wants
to hold tipc_net_lock. At the moment, below scenario happens:
On CPU0, below is our sequence of taking locks:
lock(ptype_lock)->lock(tipc_net_lock)
On CPU1, our sequence of taking locks looks like:
lock(tipc_net_lock)->lock(ptype_lock)
Obviously deadlock may happen in this case.
But please note the deadlock possibly doesn't occur at all when the
first TIPC bearer is enabled. Before enable_bearer() -- running on
CPU1 does not hold ptype_lock, so the TIPC receive handler (i.e.
recv_msg()) is not registered successfully via dev_add_pack(), so
the tipc_recv_msg() cannot be called by recv_msg() even if a TIPC
message comes to CPU0. But when the second TIPC bearer is
registered, the deadlock can perhaps really happen.
To fix it, we will push the work of registering TIPC protocol
handler into workqueue context. After the change, both paths taking
ptype_lock are always in process contexts, thus, the deadlock should
never occur.
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit 61e76b178dbe7145e8d6afa84bb4ccea71918994 ]
RFC 4443 has defined two additional codes for ICMPv6 type 1 (destination
unreachable) messages:
5 - Source address failed ingress/egress policy
6 - Reject route to destination
Now they are treated as protocol error and icmpv6_err_convert() converts them
to EPROTO.
RFC 4443 says:
"Codes 5 and 6 are more informative subsets of code 1."
Treat codes 5 and 6 as code 1 (EACCES)
Btw, connect() returning -EPROTO confuses firefox, so that fallback to
other/IPv4 addresses does not work:
https://bugzilla.mozilla.org/show_bug.cgi?id=910773
Signed-off-by: Jiri Bohac <jbohac@suse.cz>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit 2d98c29b6fb3de44d9eaa73c09f9cf7209346383 ]
While looking into MLDv1/v2 code, I noticed that bridging code does
not convert it's max delay into jiffies for MLDv2 messages as we do
in core IPv6' multicast code.
RFC3810, 5.1.3. Maximum Response Code says:
The Maximum Response Code field specifies the maximum time allowed
before sending a responding Report. The actual time allowed, called
the Maximum Response Delay, is represented in units of milliseconds,
and is derived from the Maximum Response Code as follows: [...]
As we update timers that work with jiffies, we need to convert it.
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Linus Lüssing <linus.luessing@web.de>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit 25a6e6b84fba601eff7c28d30da8ad7cfbef0d43 ]
Allocating skbs when sending out neighbour discovery messages
currently uses sock_alloc_send_skb() based on a per net namespace
socket and thus share a socket wmem buffer space.
If a netdevice is temporarily unable to transmit due to carrier
loss or for other reasons, the queued up ndisc messages will cosnume
all of the wmem space and will thus prevent from any more skbs to
be allocated even for netdevices that are able to transmit packets.
The number of neighbour discovery messages sent is very limited,
use of alloc_skb() bypasses the socket wmem buffer size enforcement
while the manual call to skb_set_owner_w() maintains the socket
reference needed for the IPv6 output path.
This patch has orginally been posted by Eric Dumazet in a modified
form.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Stephen Warren <swarren@wwwdotorg.org>
Cc: Fabio Estevam <festevam@gmail.com>
Tested-by: Fabio Estevam <fabio.estevam@freescale.com>
Tested-by: Stephen Warren <swarren@nvidia.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit f46078cfcd77fa5165bf849f5e568a7ac5fa569c ]
It is not allowed for an ipv6 packet to contain multiple fragmentation
headers. So discard packets which were already reassembled by
fragmentation logic and send back a parameter problem icmp.
The updates for RFC 6980 will come in later, I have to do a bit more
research here.
Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit 4b08a8f1bd8cb4541c93ec170027b4d0782dab52 ]
Because of the max_addresses check attackers were able to disable privacy
extensions on an interface by creating enough autoconfigured addresses:
<http://seclists.org/oss-sec/2012/q4/292>
But the check is not actually needed: max_addresses protects the
kernel to install too many ipv6 addresses on an interface and guards
addrconf_prefix_rcv to install further addresses as soon as this limit
is reached. We only generate temporary addresses in direct response of
a new address showing up. As soon as we filled up the maximum number of
addresses of an interface, we stop installing more addresses and thus
also stop generating more temp addresses.
Even if the attacker tries to generate a lot of temporary addresses
by announcing a prefix and removing it again (lifetime == 0) we won't
install more temp addresses, because the temporary addresses do count
to the maximum number of addresses, thus we would stop installing new
autoconfigured addresses when the limit is reached.
This patch fixes CVE-2013-0343 (but other layer-2 attacks are still
possible).
Thanks to Ding Tianhong to bring this topic up again.
Cc: Ding Tianhong <dingtianhong@huawei.com>
Cc: George Kargiotakis <kargig@void.gr>
Cc: P J P <ppandit@redhat.com>
Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Ding Tianhong <dingtianhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit 3e3be275851bc6fc90bfdcd732cd95563acd982b ]
In case a subtree did not match we currently stop backtracking and return
NULL (root table from fib_lookup). This could yield in invalid routing
table lookups when using subtrees.
Instead continue to backtrack until a valid subtree or node is found
and return this match.
Also remove unneeded NULL check.
Reported-by: Teco Boot <teco@inf-net.nl>
Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Cc: David Lamparter <equinox@diac24.net>
Cc: <boutier@pps.univ-paris-diderot.fr>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|