Age | Commit message (Collapse) | Author |
|
The net_device might be not set on the skb when we try refcounting.
This leads to a null pointer dereference in xdst_queue_output().
It turned out that the refcount to the net_device is not needed
after all. The dst_entry has a refcount to the net_device before
we queue the skb, so it can't go away. Therefore we can remove the
refcount on queueing to fix the null pointer dereference.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
Currently we don't initialize skb->protocol when transmitting data via
tcp, raw(with and without inclhdr) or udp+ufo or appending data directly
to the socket transmit queue (via ip6_append_data). This needs to be
done so that we can get the correct mtu in the xfrm layer.
Setting of skb->protocol happens only in functions where we also have
a transmitting socket and a new skb, so we don't overwrite old values.
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
In commit 0ea9d5e3e0e03a63b11392f5613378977dae7eca ("xfrm: introduce
helper for safe determination of mtu") I switched the determination of
ipv4 mtus from dst_mtu to ip_skb_dst_mtu. This was an error because in
case of IP_PMTUDISC_PROBE we fall back to the interface mtu, which is
never correct for ipv4 ipsec.
This patch partly reverts 0ea9d5e3e0e03a63b11392f5613378977dae7eca
("xfrm: introduce helper for safe determination of mtu").
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
We need to choose the protocol family by skb->protocol. Otherwise we
call the wrong xfrm{4,6}_local_error handler in case an ipv6 sockets is
used in ipv4 mode, in which case we should call down to xfrm4_local_error
(ip6 sockets are a superset of ip4 ones).
We are called before before ip_output functions, so skb->protocol is
not reset.
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
In xfrm6_local_error use inner_header if the packet was encapsulated.
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
When pushing a new header before current one call skb_reset_inner_headers
to record the position of the inner headers in the various ipv6 tunnel
protocols.
We later need this to correctly identify the addresses needed to send
back an error in the xfrm layer.
This change is safe, because skb->protocol is always checked before
dereferencing data from the inner protocol.
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
skb->sk socket can be of AF_INET or AF_INET6 address family. Thus we
always have to make sure we a referring to the correct interpretation
of skb->sk.
We only depend on header defines to query the mtu, so we don't introduce
a new dependency to ipv6 by this change.
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
In xfrm4 and xfrm6 we need to take care about sockets of the other
address family. This could happen because a 6in4 or 4in6 tunnel could
get protected by ipsec.
Because we don't want to have a run-time dependency on ipv6 when only
using ipv4 xfrm we have to embed a pointer to the correct local_error
function in xfrm_state_afinet and look it up when returning an error
depending on the socket address family.
Thanks to vi0ss for the great bug report:
<https://bugzilla.kernel.org/show_bug.cgi?id=58691>
v2:
a) fix two more unsafe interpretations of skb->sk as ipv6 socket
(xfrm6_local_dontfrag and __xfrm6_output)
v3:
a) add an EXPORT_SYMBOL_GPL(xfrm_local_error) to fix a link error when
building ipv6 as a module (thanks to Steffen Klassert)
Reported-by: <vi0oss@gmail.com>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
Since we set "len = total_len" in the beginning of tun_get_user(),
so we should compare the new len with 0, instead of total_len,
or the if statement always returns false.
Signed-off-by: Weiping Pan <wpan@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Fix the iproute2 command `bridge vlan show`, after switching from
rtgenmsg to ifinfomsg.
Let's start with a little history:
Feb 20: Vlad Yasevich got his VLAN-aware bridge patchset included in
the 3.9 merge window.
In the kernel commit 6cbdceeb, he added attribute support to
bridge GETLINK requests sent with rtgenmsg.
Mar 6th: Vlad got this iproute2 reference implementation of the bridge
vlan netlink interface accepted (iproute2 9eff0e5c)
Apr 25th: iproute2 switched from using rtgenmsg to ifinfomsg (63338dca)
http://patchwork.ozlabs.org/patch/239602/
http://marc.info/?t=136680900700007
Apr 28th: Linus released 3.9
Apr 30th: Stephen released iproute2 3.9.0
The `bridge vlan show` command haven't been working since the switch to
ifinfomsg, or in a released version of iproute2. Since the kernel side
only supports rtgenmsg, which iproute2 switched away from just prior to
the iproute2 3.9.0 release.
I haven't been able to find any documentation, about neither rtgenmsg
nor ifinfomsg, and in which situation to use which, but kernel commit
88c5b5ce seams to suggest that ifinfomsg should be used.
Fixing this in kernel will break compatibility, but I doubt that anybody
have been using it due to this bug in the user space reference
implementation, at least not without noticing this bug. That said the
functionality is still fully functional in 3.9, when reversing iproute2
commit 63338dca.
This could also be fixed in iproute2, but thats an ugly patch that would
reintroduce rtgenmsg in iproute2, and from searching in netdev it seams
like rtgenmsg usage is discouraged. I'm assuming that the only reason
that Vlad implemented the kernel side to use rtgenmsg, was because
iproute2 was using it at the time.
Signed-off-by: Asbjoern Sloth Toennesen <ast@fiberby.net>
Reviewed-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Initially I improperly set a boundary for maximum number of input
packets to process on NAPI poll ("work") so it might be more than
expected amount ("weight").
This was really harmless but seeing WARN_ON_ONCE on every device boot is
not nice. So trivial fix ("<" instead of "<=") is here.
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Mischa Jonker <mjonker@synopsys.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Grant Likely <grant.likely@linaro.org>
Cc: Rob Herring <rob.herring@calxeda.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Using inner-id for tunnel id is not safe in some rare cases.
E.g. packets coming from multiple sources entering same tunnel
can have same id. Therefore on tunnel packet receive we
could have packets from two different stream but with same
source and dst IP with same ip-id which could confuse ip packet
reassembly.
Following patch reverts optimization from commit
490ab08127 (IP_GRE: Fix IP-Identification.)
CC: Jarno Rajahalme <jrajahalme@nicira.com>
CC: Ansis Atteka <aatteka@nicira.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Dmitry Kravkov says:
====================
Please consider applying the series of bnx2x fixes to net:
* statistics may cause FW assert
* missing fairness configuration in DCB flow
* memory leak in sriov related part
* Illegal PTE access
* Pagefault crash in shutdown flow with cnic
v1->v2
* fixed sparse error pointed by Joe Perches
* added missing signed-off from Sergei Shtylyov
v2->v3
* added missing signed-off from Sergei Shtylyov
* fixed formatting from Sergei Shtylyov
v3->v4
* patch 1/6: fixed declaration order
* patch 2/6 replaced with: protect flows using set_bit constraints
v4->v5
* patch 2/6: replace proprietary locking with semaphore
* droped 1/6: since adds redundant code from Benjamin Poirier
The following patchset contains four netfilter fixes, they are:
* Fix possible invalid access and mangling of the TCPMSS option in
xt_TCPMSS. This was spotted by Julian Anastasov.
* Fix possible off by one access and mangling of the TCP packet in
xt_TCPOPTSTRIP, also spotted by Julian Anastasov.
* Fix possible information leak due to missing initialization of one
padding field of several structures that are included in nfqueue and
nflog netlink messages, from Dan Carpenter.
* Fix TCP window tracking with Fast Open, from Yuchung Cheng.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
There might be a crash as during shutdown flow CNIC might try
to access resources already freed by bnx2x.
Change bnx2x_close() into dev_close() in __bnx2x_remove (shutdown flow)
to guarantee CNIC is notified of the device's change of status.
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
PTE write access error might occur in MF_ALLOWED mode when IOMMU
is active. The patch adds rmmod HSI indicating to MFW to stop
running queries which might trigger this failure.
Signed-off-by: Barak Witkowsky <barak@broadcom.com>
Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
ETS can be enabled as a result of DCB negotiation, then
fairness must be recalculated after each negotiation.
Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add locking to protect different statistics flows from
running simultaneously.
This in order to serialize statistics requests sent to FW,
otherwise two outstanding queries may cause FW assert.
Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Remove Andrew Gallatin, as he is no longer with Myricom. Add
Hyong-Youb Kim as the new maintainer. Update the website URL.
Signed-off-by: Hyong-Youb Kim <hykim@myri.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The DMA sync should sync the whole receive buffer, not just
part of it. Fixes log messages dma_sync_check.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When dumping generic netlink families, only the first dump call
is locked with genl_lock(), which protects the list of families,
and thus subsequent calls can access the data without locking,
racing against family addition/removal. This can cause a crash.
Fix it - the locking needs to be conditional because the first
time around it's already locked.
A similar bug was reported to me on an old kernel (3.4.47) but
the exact scenario that happened there is no longer possible,
on those kernels the first round wasn't locked either. Looking
at the current code I found the race described above, which had
also existed on the old kernel.
Cc: stable@vger.kernel.org
Reported-by: Andrei Otcheretianski <andrei.otcheretianski@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Probably this one is quite unlikely to be triggered, but it's more safe
to do the call_rcu() at the end after we have dropped the reference on
the asoc and freed sctp packet chunks. The reason why is because in
sctp_transport_destroy_rcu() the transport is being kfree()'d, and if
we're unlucky enough we could run into corrupted pointers. Probably
that's more of theoretical nature, but it's safer to have this simple fix.
Introduced by commit 8c98653f ("sctp: sctp_close: fix release of bindings
for deferred call_rcu's"). I also did the 8c98653f regression test and
it's fine that way.
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The SCTP Quick failover draft [1] section 5.1, point 5 says that the cwnd
should be 1 MTU. So, instead of 1, set it to 1 MTU.
[1] https://tools.ietf.org/html/draft-nishida-tsvwg-sctp-failover-05
Reported-by: Karl Heiss <kheiss@gmail.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In stmmac_init_rx_buffers():
* add missing handling of dma_map_single() error
* remove superfluous unlikely() optimization while at it
Add stmmac_free_rx_buffers() helper and use it in dma_free_rx_skbufs().
In init_dma_desc_rings():
* add missing handling of kmalloc_array() errors
* fix handling of dma_alloc_coherent() and stmmac_init_rx_buffers() errors
* make function return an error value on error and 0 on success
In stmmac_open():
* add handling of init_dma_desc_rings() return value
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
We met lockdep warning when enable and disable the bearer for commands such as:
tipc-config -netid=1234 -addr=1.1.3 -be=eth:eth0
tipc-config -netid=1234 -addr=1.1.3 -bd=eth:eth0
---------------------------------------------------
[ 327.693595] ======================================================
[ 327.693994] [ INFO: possible circular locking dependency detected ]
[ 327.694519] 3.11.0-rc3-wwd-default #4 Tainted: G O
[ 327.694882] -------------------------------------------------------
[ 327.695385] tipc-config/5825 is trying to acquire lock:
[ 327.695754] (((timer))#2){+.-...}, at: [<ffffffff8105be80>] del_timer_sync+0x0/0xd0
[ 327.696018]
[ 327.696018] but task is already holding lock:
[ 327.696018] (&(&b_ptr->lock)->rlock){+.-...}, at: [<ffffffffa02be58d>] bearer_disable+ 0xdd/0x120 [tipc]
[ 327.696018]
[ 327.696018] which lock already depends on the new lock.
[ 327.696018]
[ 327.696018]
[ 327.696018] the existing dependency chain (in reverse order) is:
[ 327.696018]
[ 327.696018] -> #1 (&(&b_ptr->lock)->rlock){+.-...}:
[ 327.696018] [<ffffffff810b3b4d>] validate_chain+0x6dd/0x870
[ 327.696018] [<ffffffff810b40bb>] __lock_acquire+0x3db/0x670
[ 327.696018] [<ffffffff810b4453>] lock_acquire+0x103/0x130
[ 327.696018] [<ffffffff814d65b1>] _raw_spin_lock_bh+0x41/0x80
[ 327.696018] [<ffffffffa02c5d48>] disc_timeout+0x18/0xd0 [tipc]
[ 327.696018] [<ffffffff8105b92a>] call_timer_fn+0xda/0x1e0
[ 327.696018] [<ffffffff8105bcd7>] run_timer_softirq+0x2a7/0x2d0
[ 327.696018] [<ffffffff8105379a>] __do_softirq+0x16a/0x2e0
[ 327.696018] [<ffffffff81053a35>] irq_exit+0xd5/0xe0
[ 327.696018] [<ffffffff81033005>] smp_apic_timer_interrupt+0x45/0x60
[ 327.696018] [<ffffffff814df4af>] apic_timer_interrupt+0x6f/0x80
[ 327.696018] [<ffffffff8100b70e>] arch_cpu_idle+0x1e/0x30
[ 327.696018] [<ffffffff810a039d>] cpu_idle_loop+0x1fd/0x280
[ 327.696018] [<ffffffff810a043e>] cpu_startup_entry+0x1e/0x20
[ 327.696018] [<ffffffff81031589>] start_secondary+0x89/0x90
[ 327.696018]
[ 327.696018] -> #0 (((timer))#2){+.-...}:
[ 327.696018] [<ffffffff810b33fe>] check_prev_add+0x43e/0x4b0
[ 327.696018] [<ffffffff810b3b4d>] validate_chain+0x6dd/0x870
[ 327.696018] [<ffffffff810b40bb>] __lock_acquire+0x3db/0x670
[ 327.696018] [<ffffffff810b4453>] lock_acquire+0x103/0x130
[ 327.696018] [<ffffffff8105bebd>] del_timer_sync+0x3d/0xd0
[ 327.696018] [<ffffffffa02c5855>] tipc_disc_delete+0x15/0x30 [tipc]
[ 327.696018] [<ffffffffa02be59f>] bearer_disable+0xef/0x120 [tipc]
[ 327.696018] [<ffffffffa02be74f>] tipc_disable_bearer+0x2f/0x60 [tipc]
[ 327.696018] [<ffffffffa02bfb32>] tipc_cfg_do_cmd+0x2e2/0x550 [tipc]
[ 327.696018] [<ffffffffa02c8c79>] handle_cmd+0x49/0xe0 [tipc]
[ 327.696018] [<ffffffff8143e898>] genl_family_rcv_msg+0x268/0x340
[ 327.696018] [<ffffffff8143ed30>] genl_rcv_msg+0x70/0xd0
[ 327.696018] [<ffffffff8143d4c9>] netlink_rcv_skb+0x89/0xb0
[ 327.696018] [<ffffffff8143e617>] genl_rcv+0x27/0x40
[ 327.696018] [<ffffffff8143d21e>] netlink_unicast+0x15e/0x1b0
[ 327.696018] [<ffffffff8143ddcf>] netlink_sendmsg+0x22f/0x400
[ 327.696018] [<ffffffff813f7836>] __sock_sendmsg+0x66/0x80
[ 327.696018] [<ffffffff813f7957>] sock_aio_write+0x107/0x120
[ 327.696018] [<ffffffff8117f76d>] do_sync_write+0x7d/0xc0
[ 327.696018] [<ffffffff8117fc56>] vfs_write+0x186/0x190
[ 327.696018] [<ffffffff811803e0>] SyS_write+0x60/0xb0
[ 327.696018] [<ffffffff814de852>] system_call_fastpath+0x16/0x1b
[ 327.696018]
[ 327.696018] other info that might help us debug this:
[ 327.696018]
[ 327.696018] Possible unsafe locking scenario:
[ 327.696018]
[ 327.696018] CPU0 CPU1
[ 327.696018] ---- ----
[ 327.696018] lock(&(&b_ptr->lock)->rlock);
[ 327.696018] lock(((timer))#2);
[ 327.696018] lock(&(&b_ptr->lock)->rlock);
[ 327.696018] lock(((timer))#2);
[ 327.696018]
[ 327.696018] *** DEADLOCK ***
[ 327.696018]
[ 327.696018] 5 locks held by tipc-config/5825:
[ 327.696018] #0: (cb_lock){++++++}, at: [<ffffffff8143e608>] genl_rcv+0x18/0x40
[ 327.696018] #1: (genl_mutex){+.+.+.}, at: [<ffffffff8143ed66>] genl_rcv_msg+0xa6/0xd0
[ 327.696018] #2: (config_mutex){+.+.+.}, at: [<ffffffffa02bf889>] tipc_cfg_do_cmd+0x39/ 0x550 [tipc]
[ 327.696018] #3: (tipc_net_lock){++.-..}, at: [<ffffffffa02be738>] tipc_disable_bearer+ 0x18/0x60 [tipc]
[ 327.696018] #4: (&(&b_ptr->lock)->rlock){+.-...}, at: [<ffffffffa02be58d>] bearer_disable+0xdd/0x120 [tipc]
[ 327.696018]
[ 327.696018] stack backtrace:
[ 327.696018] CPU: 2 PID: 5825 Comm: tipc-config Tainted: G O 3.11.0-rc3-wwd- default #4
[ 327.696018] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
[ 327.696018] 00000000ffffffff ffff880037fa77a8 ffffffff814d03dd 0000000000000000
[ 327.696018] ffff880037fa7808 ffff880037fa77e8 ffffffff810b1c4f 0000000037fa77e8
[ 327.696018] ffff880037fa7808 ffff880037e4db40 0000000000000000 ffff880037e4e318
[ 327.696018] Call Trace:
[ 327.696018] [<ffffffff814d03dd>] dump_stack+0x4d/0xa0
[ 327.696018] [<ffffffff810b1c4f>] print_circular_bug+0x10f/0x120
[ 327.696018] [<ffffffff810b33fe>] check_prev_add+0x43e/0x4b0
[ 327.696018] [<ffffffff810b3b4d>] validate_chain+0x6dd/0x870
[ 327.696018] [<ffffffff81087a28>] ? sched_clock_cpu+0xd8/0x110
[ 327.696018] [<ffffffff810b40bb>] __lock_acquire+0x3db/0x670
[ 327.696018] [<ffffffff810b4453>] lock_acquire+0x103/0x130
[ 327.696018] [<ffffffff8105be80>] ? try_to_del_timer_sync+0x70/0x70
[ 327.696018] [<ffffffff8105bebd>] del_timer_sync+0x3d/0xd0
[ 327.696018] [<ffffffff8105be80>] ? try_to_del_timer_sync+0x70/0x70
[ 327.696018] [<ffffffffa02c5855>] tipc_disc_delete+0x15/0x30 [tipc]
[ 327.696018] [<ffffffffa02be59f>] bearer_disable+0xef/0x120 [tipc]
[ 327.696018] [<ffffffffa02be74f>] tipc_disable_bearer+0x2f/0x60 [tipc]
[ 327.696018] [<ffffffffa02bfb32>] tipc_cfg_do_cmd+0x2e2/0x550 [tipc]
[ 327.696018] [<ffffffff81218783>] ? security_capable+0x13/0x20
[ 327.696018] [<ffffffffa02c8c79>] handle_cmd+0x49/0xe0 [tipc]
[ 327.696018] [<ffffffff8143e898>] genl_family_rcv_msg+0x268/0x340
[ 327.696018] [<ffffffff8143ed30>] genl_rcv_msg+0x70/0xd0
[ 327.696018] [<ffffffff8143ecc0>] ? genl_lock+0x20/0x20
[ 327.696018] [<ffffffff8143d4c9>] netlink_rcv_skb+0x89/0xb0
[ 327.696018] [<ffffffff8143e608>] ? genl_rcv+0x18/0x40
[ 327.696018] [<ffffffff8143e617>] genl_rcv+0x27/0x40
[ 327.696018] [<ffffffff8143d21e>] netlink_unicast+0x15e/0x1b0
[ 327.696018] [<ffffffff81289d7c>] ? memcpy_fromiovec+0x6c/0x90
[ 327.696018] [<ffffffff8143ddcf>] netlink_sendmsg+0x22f/0x400
[ 327.696018] [<ffffffff813f7836>] __sock_sendmsg+0x66/0x80
[ 327.696018] [<ffffffff813f7957>] sock_aio_write+0x107/0x120
[ 327.696018] [<ffffffff813fe29c>] ? release_sock+0x8c/0xa0
[ 327.696018] [<ffffffff8117f76d>] do_sync_write+0x7d/0xc0
[ 327.696018] [<ffffffff8117fa24>] ? rw_verify_area+0x54/0x100
[ 327.696018] [<ffffffff8117fc56>] vfs_write+0x186/0x190
[ 327.696018] [<ffffffff811803e0>] SyS_write+0x60/0xb0
[ 327.696018] [<ffffffff814de852>] system_call_fastpath+0x16/0x1b
-----------------------------------------------------------------------
The problem is that the tipc_link_delete() will cancel the timer disc_timeout() when
the b_ptr->lock is hold, but the disc_timeout() still call b_ptr->lock to finish the
work, so the dead lock occurs.
We should unlock the b_ptr->lock when del the disc_timeout().
Remove link_timeout() still met the same problem, the patch:
http://article.gmane.org/gmane.network.tipc.general/4380
fix the problem, so no need to send patch for fix link_timeout() deadlock warming.
Signed-off-by: Wang Weidong <wangweidong1@huawei.com>
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Fix possibly wrong memcpy() bytes length since some CAN records received from
PCAN-USB could define a DLC field in range [9..15].
In that case, the real DLC value MUST be used to move forward the record pointer
but, only 8 bytes max. MUST be copied into the data field of the struct
can_frame object of the skb given to the network core.
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Stephane Grosjean <s.grosjean@peak-system.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Since commit ac4e4af1e59e1 ("macvtap: Consistently use rcu functions"),
Thomas gets two different warnings :
BUG: using smp_processor_id() in preemptible [00000000] code: vhost-45891/45892
caller is macvtap_do_read+0x45c/0x600 [macvtap]
CPU: 1 PID: 45892 Comm: vhost-45891 Not tainted 3.11.0-bisecttest #13
Call Trace:
([<00000000001126ee>] show_trace+0x126/0x144)
[<00000000001127d2>] show_stack+0xc6/0xd4
[<000000000068bcec>] dump_stack+0x74/0xd8
[<0000000000481066>] debug_smp_processor_id+0xf6/0x114
[<000003ff802e9a18>] macvtap_do_read+0x45c/0x600 [macvtap]
[<000003ff802e9c1c>] macvtap_recvmsg+0x60/0x88 [macvtap]
[<000003ff80318c5e>] handle_rx+0x5b2/0x800 [vhost_net]
[<000003ff8028f77c>] vhost_worker+0x15c/0x1c4 [vhost]
[<000000000015f3ac>] kthread+0xd8/0xe4
[<00000000006934a6>] kernel_thread_starter+0x6/0xc
[<00000000006934a0>] kernel_thread_starter+0x0/0xc
And
BUG: using smp_processor_id() in preemptible [00000000] code: vhost-45897/45898
caller is macvlan_start_xmit+0x10a/0x1b4 [macvlan]
CPU: 1 PID: 45898 Comm: vhost-45897 Not tainted 3.11.0-bisecttest #16
Call Trace:
([<00000000001126ee>] show_trace+0x126/0x144)
[<00000000001127d2>] show_stack+0xc6/0xd4
[<000000000068bdb8>] dump_stack+0x74/0xd4
[<0000000000481132>] debug_smp_processor_id+0xf6/0x114
[<000003ff802b72ca>] macvlan_start_xmit+0x10a/0x1b4 [macvlan]
[<000003ff802ea69a>] macvtap_get_user+0x982/0xbc4 [macvtap]
[<000003ff802ea92a>] macvtap_sendmsg+0x4e/0x60 [macvtap]
[<000003ff8031947c>] handle_tx+0x494/0x5ec [vhost_net]
[<000003ff8028f77c>] vhost_worker+0x15c/0x1c4 [vhost]
[<000000000015f3ac>] kthread+0xd8/0xe4
[<000000000069356e>] kernel_thread_starter+0x6/0xc
[<0000000000693568>] kernel_thread_starter+0x0/0xc
2 locks held by vhost-45897/45898:
#0: (&vq->mutex){+.+.+.}, at: [<000003ff8031903c>] handle_tx+0x54/0x5ec [vhost_net]
#1: (rcu_read_lock){.+.+..}, at: [<000003ff802ea53c>] macvtap_get_user+0x824/0xbc4 [macvtap]
In the first case, macvtap_put_user() calls macvlan_count_rx()
in a preempt-able context, and this is not allowed.
In the second case, macvtap_get_user() calls
macvlan_start_xmit() with BH enabled, and this is not allowed.
Reported-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Bisected-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Tested-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Cc: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Included change:
- reassign pointers to data after skb reallocation to avoid kernel paging errors
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
There are several functions which might reallocate skb data. Currently
some places keep reusing their old ethhdr pointer regardless of whether
they became invalid after such a reallocation or not. This potentially
leads to kernel paging errors.
This patch fixes these by refetching the ethdr pointer after the
potential reallocations.
Signed-off-by: Linus Lüssing <linus.luessing@web.de>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
|
|
Pablo Neira Ayuso says:
====================
The following patchset contains four netfilter fixes, they are:
* Fix possible invalid access and mangling of the TCPMSS option in
xt_TCPMSS. This was spotted by Julian Anastasov.
* Fix possible off by one access and mangling of the TCP packet in
xt_TCPOPTSTRIP, also spotted by Julian Anastasov.
* Fix possible information leak due to missing initialization of one
padding field of several structures that are included in nfqueue and
nflog netlink messages, from Dan Carpenter.
* Fix TCP window tracking with Fast Open, from Yuchung Cheng.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Currently the conntrack checks if the ending sequence of a packet
falls within the observed receive window. However it does so even
if it has not observe any packet from the remote yet and uses an
uninitialized receive window (td_maxwin).
If a connection uses Fast Open to send a SYN-data packet which is
dropped afterward in the network. The subsequent SYNs retransmits
will all fail this check and be discarded, leading to a connection
timeout. This is because the SYN retransmit does not contain data
payload so
end == initial sequence number (isn) + 1
sender->td_end == isn + syn_data_len
receiver->td_maxwin == 0
The fix is to only apply this check after td_maxwin is initialized.
Reported-by: Michael Chan <mcfchan@stanford.edu>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Acked-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Fix inverted check when deleting an fdb entry.
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch fixed the condition of extend_desc for jumbo frame.
There is no check routine for extend_desc in the stmmac_jumbo_frm function.
Even though extend_desc is set if dma_tx is used instead of dma_etx.
It causes kernel panic.
Signed-off-by: Byungho An <bh74.an@samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This is a regression introduced by:
commit fe5c3561e6f0ac7c9546209f01351113c1b77ec8
Author: stephen hemminger <stephen@networkplumber.org>
Date: Sat Jul 13 10:18:18 2013 -0700
vxlan: add necessary locking on device removal
The problem is that vxlan_dellink(), which is called with RTNL lock
held, tries to flush the workqueue synchronously, but apparently
igmp_join and igmp_leave work need to hold RTNL lock too, therefore we
have a soft lockup!
As suggested by Stephen, probably the flush_workqueue can just be
removed and let the normal refcounting work. The workqueue has a
reference to device and socket, therefore the cleanups should work
correctly.
Suggested-by: Stephen Hemminger <stephen@networkplumber.org>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: David S. Miller <davem@davemloft.net>
Tested-by: Cong Wang <amwang@redhat.com>
Signed-off-by: Cong Wang <amwang@redhat.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This is a regression introduced by:
commit 3fc2de2faba387218bdf9dbc6b13f513ac3b060a
Author: stephen hemminger <stephen@networkplumber.org>
Date: Thu Jul 18 08:40:15 2013 -0700
vxlan: fix igmp races
Before this commit, the old code was:
if (vxlan_group_used(vn, vxlan->default_dst.remote_ip))
ip_mc_join_group(sk, &mreq);
else
ip_mc_leave_group(sk, &mreq);
therefore we shoud check vxlan_group_used(), not its opposite,
for igmp_join.
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Rename mib counter from "low latency" to "busy poll"
v1 also moved the counter to the ip MIB (suggested by Shawn Bohrer)
Eric Dumazet suggested that the current location is better.
So v2 just renames the counter to fit the new naming convention.
Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Introduced in cf3c4c03060b688cbc389ebc5065ebcce5653e96
("8139cp: Add dma_mapping_error checking")
Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Same behavior than 802.1q : finds the encapsulated protocol and
skip 32bit header.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Fix ipgre_header() (header_ops->create) to return the correct
amount of bytes pushed. Most callers of dev_hard_header() seem
to care only if it was success, but af_packet.c uses it as
offset to the skb to copy from userspace only once. In practice
this fixes packet socket sendto()/sendmsg() to gre tunnels.
Regression introduced in c54419321455631079c7d6e60bc732dd0c5914c5
("GRE: Refactor GRE tunneling code.")
Cc: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: Timo Teräs <timo.teras@iki.fi>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless
John W. Linville says:
====================
This is a batch of fixes intended for the 3.11 queue...
Regarding the mac80211 (and related) bits, Johannes says:
"I have a fix from Chris for an infinite loop along with fixes from
myself to prevent it entering the loop to start with (continue using
disabled channels, many thanks to Chris for his debug/test help) and a
workaround for broken APs that advertise a bad HT primary channel in
their beacons. Additionally, a fix for another attrbuf race in mac80211
and a fix to clean up properly while P2P GO interfaces go down."
Along with that...
Solomon Peachy corrects a range check in cw1200 that would lead to
a BUG_ON when starting AP mode.
Stanislaw Gruszka provides an iwl4965 patch to power-up the device
earlier (avoiding microcode errors), and another iwl4965 fix that
resets the firmware after turning rfkill off (resolving a bug in the
Red Hat Bugzilla).
Please let me know if there are problems!
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem
|
|
In case a subtree did not match we currently stop backtracking and return
NULL (root table from fib_lookup). This could yield in invalid routing
table lookups when using subtrees.
Instead continue to backtrack until a valid subtree or node is found
and return this match.
Also remove unneeded NULL check.
Reported-by: Teco Boot <teco@inf-net.nl>
Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Cc: David Lamparter <equinox@diac24.net>
Cc: <boutier@pps.univ-paris-diderot.fr>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
While investigating about strange increase of retransmit rates
on hosts ~24 days after boot, Van found hystart was disabled
if ca->epoch_start was 0, as following condition is true
when tcp_time_stamp high order bit is set.
(s32)(tcp_time_stamp - ca->epoch_start) < HZ
Quoting Van :
At initialization & after every loss ca->epoch_start is set to zero so
I believe that the above line will turn off hystart as soon as the 2^31
bit is set in tcp_time_stamp & hystart will stay off for 24 days.
I think we've observed that cubic's restart is too aggressive without
hystart so this might account for the higher drop rate we observe.
Diagnosed-by: Van Jacobson <vanj@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
br_sysfs_if.c is for sysfs attributes of bridge ports, while br_sysfs_br.c
is for sysfs attributes of bridge itself. Correct the comment here.
Signed-off-by: Wang Sheng-Hui <shhuiw@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
commit 17a6e9f1aa9 ("tcp_cubic: fix clock dependency") added an
overflow error in bictcp_update() in following code :
/* change the unit from HZ to bictcp_HZ */
t = ((tcp_time_stamp + msecs_to_jiffies(ca->delay_min>>3) -
ca->epoch_start) << BICTCP_HZ) / HZ;
Because msecs_to_jiffies() being unsigned long, compiler does
implicit type promotion.
We really want to constrain (tcp_time_stamp - ca->epoch_start)
to a signed 32bit value, or else 't' has unexpected high values.
This bugs triggers an increase of retransmit rates ~24 days after
boot [1], as the high order bit of tcp_time_stamp flips.
[1] for hosts with HZ=1000
Big thanks to Van Jacobson for spotting this problem.
Diagnosed-by: Van Jacobson <vanj@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Currently we are reading an uninitialized value for the max_delay
variable when snooping an MLD query message of invalid length and would
update our timers with that.
Fixing this by simply ignoring such broken MLD queries (just like we do
for IGMP already).
This is a regression introduced by:
"bridge: disable snooping if there is no querier" (b00589af3b04)
Reported-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Linus Lüssing <linus.luessing@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
AddressSanitizer [1] dynamic checker pointed a potential
out of bound access in leaf_walk_rcu()
We could allocate one more slot in tnode_new() to leave the prefetch()
in-place but it looks not worth the pain.
Bug added in commit 82cfbb008572b ("[IPV4] fib_trie: iterator recode")
[1] :
https://code.google.com/p/address-sanitizer/wiki/AddressSanitizerForKernel
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Otherwise, on neighbour creation, bond_neigh_init() will be called with a
foreign netdev.
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
dev->ndo_neigh_setup() might need some of the values of neigh_parms, so
populate them before calling it.
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Commit 91657eafb ("xfrm: take net hdr len into account for esp payload
size calculation") introduced a possible interger overflow in
esp{4,6}_get_mtu() handlers in case of x->props.mode equals
XFRM_MODE_TUNNEL. Thus, the following expression will overflow
unsigned int net_adj;
...
<case ipv{4,6} XFRM_MODE_TUNNEL>
net_adj = 0;
...
return ((mtu - x->props.header_len - crypto_aead_authsize(esp->aead) -
net_adj) & ~(align - 1)) + (net_adj - 2);
where (net_adj - 2) would be evaluated as <foo> + (0 - 2) in an unsigned
context. Fix it by simply removing brackets as those operations here
do not need to have special precedence.
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Benjamin Poirier <bpoirier@suse.de>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Acked-by: Benjamin Poirier <bpoirier@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|