Age | Commit message (Collapse) | Author |
|
[ Upstream commit 8a8e3d84b1719a56f9151909e80ea6ebc5b8e318 ]
commit 56b765b79 ("htb: improved accuracy at high rates")
broke the "linklayer atm" handling.
tc class add ... htb rate X ceil Y linklayer atm
The linklayer setting is implemented by modifying the rate table
which is send to the kernel. No direct parameter were
transferred to the kernel indicating the linklayer setting.
The commit 56b765b79 ("htb: improved accuracy at high rates")
removed the use of the rate table system.
To keep compatible with older iproute2 utils, this patch detects
the linklayer by parsing the rate table. It also supports future
versions of iproute2 to send this linklayer parameter to the
kernel directly. This is done by using the __reserved field in
struct tc_ratespec, to convey the choosen linklayer option, but
only using the lower 4 bits of this field.
Linklayer detection is limited to speeds below 100Mbit/s, because
at high rates the rtab is gets too inaccurate, so bad that
several fields contain the same values, this resembling the ATM
detect. Fields even start to contain "0" time to send, e.g. at
1000Mbit/s sending a 96 bytes packet cost "0", thus the rtab have
been more broken than we first realized.
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 4221f40513233fa8edeef7fc82e44163fde03b9b ]
Using inner-id for tunnel id is not safe in some rare cases.
E.g. packets coming from multiple sources entering same tunnel
can have same id. Therefore on tunnel packet receive we
could have packets from two different stream but with same
source and dst IP with same ip-id which could confuse ip packet
reassembly.
Following patch reverts optimization from commit
490ab08127 (IP_GRE: Fix IP-Identification.)
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
CC: Jarno Rajahalme <jrajahalme@nicira.com>
CC: Ansis Atteka <aatteka@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 33c6b1f6b154894321f5734e50c66621e9134e7e ]
netlink dump operations take module as parameter to hold
reference for entire netlink dump duration.
Currently it holds ref only on genl module which is not correct
when we use ops registered to genl from another module.
Following patch adds module pointer to genl_ops so that netlink
can hold ref count on it.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
CC: Jesse Gross <jesse@nicira.com>
CC: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 2dfca312a91631311c1cf7c090246cc8103de038 upstream.
brcm80211 cannot handle sending frames with CCK rates as part of an
A-MPDU session. Other drivers may have issues too. Set the flag in all
drivers that have been tested with CCK rates.
This fixes a reported brcmsmac regression introduced in
commit ef47a5e4f1aaf1d0e2e6875e34b2c9595897bef6
"mac80211/minstrel_ht: fix cck rate sampling"
Reported-by: Tom Gundersen <teg@jklm.no>
Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit d9d10a30964504af834d8d250a0c76d4ae91eb1e ]
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
pending data
[ Upstream commit 8822b64a0fa64a5dd1dfcf837c5b0be83f8c05d1 ]
We accidentally call down to ip6_push_pending_frames when uncorking
pending AF_INET data on a ipv6 socket. This results in the following
splat (from Dave Jones):
skbuff: skb_under_panic: text:ffffffff816765f6 len:48 put:40 head:ffff88013deb6df0 data:ffff88013deb6dec tail:0x2c end:0xc0 dev:<NULL>
------------[ cut here ]------------
kernel BUG at net/core/skbuff.c:126!
invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
Modules linked in: dccp_ipv4 dccp 8021q garp bridge stp dlci mpoa snd_seq_dummy sctp fuse hidp tun bnep nfnetlink scsi_transport_iscsi rfcomm can_raw can_bcm af_802154 appletalk caif_socket can caif ipt_ULOG x25 rose af_key pppoe pppox ipx phonet irda llc2 ppp_generic slhc p8023 psnap p8022 llc crc_ccitt atm bluetooth
+netrom ax25 nfc rfkill rds af_rxrpc coretemp hwmon kvm_intel kvm crc32c_intel snd_hda_codec_realtek ghash_clmulni_intel microcode pcspkr snd_hda_codec_hdmi snd_hda_intel snd_hda_codec snd_hwdep usb_debug snd_seq snd_seq_device snd_pcm e1000e snd_page_alloc snd_timer ptp snd pps_core soundcore xfs libcrc32c
CPU: 2 PID: 8095 Comm: trinity-child2 Not tainted 3.10.0-rc7+ #37
task: ffff8801f52c2520 ti: ffff8801e6430000 task.ti: ffff8801e6430000
RIP: 0010:[<ffffffff816e759c>] [<ffffffff816e759c>] skb_panic+0x63/0x65
RSP: 0018:ffff8801e6431de8 EFLAGS: 00010282
RAX: 0000000000000086 RBX: ffff8802353d3cc0 RCX: 0000000000000006
RDX: 0000000000003b90 RSI: ffff8801f52c2ca0 RDI: ffff8801f52c2520
RBP: ffff8801e6431e08 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000001 R12: ffff88022ea0c800
R13: ffff88022ea0cdf8 R14: ffff8802353ecb40 R15: ffffffff81cc7800
FS: 00007f5720a10740(0000) GS:ffff880244c00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000005862000 CR3: 000000022843c000 CR4: 00000000001407e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
Stack:
ffff88013deb6dec 000000000000002c 00000000000000c0 ffffffff81a3f6e4
ffff8801e6431e18 ffffffff8159a9aa ffff8801e6431e90 ffffffff816765f6
ffffffff810b756b 0000000700000002 ffff8801e6431e40 0000fea9292aa8c0
Call Trace:
[<ffffffff8159a9aa>] skb_push+0x3a/0x40
[<ffffffff816765f6>] ip6_push_pending_frames+0x1f6/0x4d0
[<ffffffff810b756b>] ? mark_held_locks+0xbb/0x140
[<ffffffff81694919>] udp_v6_push_pending_frames+0x2b9/0x3d0
[<ffffffff81694660>] ? udplite_getfrag+0x20/0x20
[<ffffffff8162092a>] udp_lib_setsockopt+0x1aa/0x1f0
[<ffffffff811cc5e7>] ? fget_light+0x387/0x4f0
[<ffffffff816958a4>] udpv6_setsockopt+0x34/0x40
[<ffffffff815949f4>] sock_common_setsockopt+0x14/0x20
[<ffffffff81593c31>] SyS_setsockopt+0x71/0xd0
[<ffffffff816f5d54>] tracesys+0xdd/0xe2
Code: 00 00 48 89 44 24 10 8b 87 d8 00 00 00 48 89 44 24 08 48 8b 87 e8 00 00 00 48 c7 c7 c0 04 aa 81 48 89 04 24 31 c0 e8 e1 7e ff ff <0f> 0b 55 48 89 e5 0f 0b 55 48 89 e5 0f 0b 55 48 89 e5 0f 0b 55
RIP [<ffffffff816e759c>] skb_panic+0x63/0x65
RSP <ffff8801e6431de8>
This patch adds a check if the pending data is of address family AF_INET
and directly calls udp_push_ending_frames from udp_v6_push_pending_frames
if that is the case.
This bug was found by Dave Jones with trinity.
(Also move the initialization of fl6 below the AF_INET check, even if
not strictly necessary.)
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Dave Jones <davej@redhat.com>
Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 8965779d2c0e6ab246c82a405236b1fb2adae6b2, with
some bits from commit b7b1bfce0bb68bd8f6e62a28295922785cc63781
("ipv6: split duplicate address detection and router solicitation timer")
to get the __ipv6_get_lladdr() used by this patch. ]
dingtianhong reported the following deadlock detected by lockdep:
======================================================
[ INFO: possible circular locking dependency detected ]
3.4.24.05-0.1-default #1 Not tainted
-------------------------------------------------------
ksoftirqd/0/3 is trying to acquire lock:
(&ndev->lock){+.+...}, at: [<ffffffff8147f804>] ipv6_get_lladdr+0x74/0x120
but task is already holding lock:
(&mc->mca_lock){+.+...}, at: [<ffffffff8149d130>] mld_send_report+0x40/0x150
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (&mc->mca_lock){+.+...}:
[<ffffffff810a8027>] validate_chain+0x637/0x730
[<ffffffff810a8417>] __lock_acquire+0x2f7/0x500
[<ffffffff810a8734>] lock_acquire+0x114/0x150
[<ffffffff814f691a>] rt_spin_lock+0x4a/0x60
[<ffffffff8149e4bb>] igmp6_group_added+0x3b/0x120
[<ffffffff8149e5d8>] ipv6_mc_up+0x38/0x60
[<ffffffff81480a4d>] ipv6_find_idev+0x3d/0x80
[<ffffffff81483175>] addrconf_notify+0x3d5/0x4b0
[<ffffffff814fae3f>] notifier_call_chain+0x3f/0x80
[<ffffffff81073471>] raw_notifier_call_chain+0x11/0x20
[<ffffffff813d8722>] call_netdevice_notifiers+0x32/0x60
[<ffffffff813d92d4>] __dev_notify_flags+0x34/0x80
[<ffffffff813d9360>] dev_change_flags+0x40/0x70
[<ffffffff813ea627>] do_setlink+0x237/0x8a0
[<ffffffff813ebb6c>] rtnl_newlink+0x3ec/0x600
[<ffffffff813eb4d0>] rtnetlink_rcv_msg+0x160/0x310
[<ffffffff814040b9>] netlink_rcv_skb+0x89/0xb0
[<ffffffff813eb357>] rtnetlink_rcv+0x27/0x40
[<ffffffff81403e20>] netlink_unicast+0x140/0x180
[<ffffffff81404a9e>] netlink_sendmsg+0x33e/0x380
[<ffffffff813c4252>] sock_sendmsg+0x112/0x130
[<ffffffff813c537e>] __sys_sendmsg+0x44e/0x460
[<ffffffff813c5544>] sys_sendmsg+0x44/0x70
[<ffffffff814feab9>] system_call_fastpath+0x16/0x1b
-> #0 (&ndev->lock){+.+...}:
[<ffffffff810a798e>] check_prev_add+0x3de/0x440
[<ffffffff810a8027>] validate_chain+0x637/0x730
[<ffffffff810a8417>] __lock_acquire+0x2f7/0x500
[<ffffffff810a8734>] lock_acquire+0x114/0x150
[<ffffffff814f6c82>] rt_read_lock+0x42/0x60
[<ffffffff8147f804>] ipv6_get_lladdr+0x74/0x120
[<ffffffff8149b036>] mld_newpack+0xb6/0x160
[<ffffffff8149b18b>] add_grhead+0xab/0xc0
[<ffffffff8149d03b>] add_grec+0x3ab/0x460
[<ffffffff8149d14a>] mld_send_report+0x5a/0x150
[<ffffffff8149f99e>] igmp6_timer_handler+0x4e/0xb0
[<ffffffff8105705a>] call_timer_fn+0xca/0x1d0
[<ffffffff81057b9f>] run_timer_softirq+0x1df/0x2e0
[<ffffffff8104e8c7>] handle_pending_softirqs+0xf7/0x1f0
[<ffffffff8104ea3b>] __do_softirq_common+0x7b/0xf0
[<ffffffff8104f07f>] __thread_do_softirq+0x1af/0x210
[<ffffffff8104f1c1>] run_ksoftirqd+0xe1/0x1f0
[<ffffffff8106c7de>] kthread+0xae/0xc0
[<ffffffff814fff74>] kernel_thread_helper+0x4/0x10
actually we can just hold idev->lock before taking pmc->mca_lock,
and avoid taking idev->lock again when iterating idev->addr_list,
since the upper callers of mld_newpack() already take
read_lock_bh(&idev->lock).
Reported-by: dingtianhong <dingtianhong@huawei.com>
Cc: dingtianhong <dingtianhong@huawei.com>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Tested-by: Ding Tianhong <dingtianhong@huawei.com>
Tested-by: Chen Weilong <chenweilong@huawei.com>
Signed-off-by: Cong Wang <amwang@redhat.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
If CONFIG_NET_NS is not set then __net_init is the same as __init and
__net_exit is the same as __exit. These functions will be removed from
memory after the module loads or is removed. Functions that are exported
for use by other functions should never be labeled for removal.
Bug introduced by commit c54419321455631079c
("GRE: Refactor GRE tunneling code.")
Reported-by: Steinar H. Gunderson <sgunderson@bigfoot.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem
|
|
If hci_dev_open fails we need to ensure that the corresponding
mgmt_set_powered command gets an appropriate response. This patch fixes
the missing response by adding a new mgmt_set_powered_failed function
that's used to indicate a power on failure to mgmt. Since a situation
with the device being rfkilled may require special handling in user
space the patch uses a new dedicated mgmt status code for this.
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Cc: stable@vger.kernel.org
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
|
|
commit 56b765b79 ("htb: improved accuracy at high rates")
broke the "overhead xxx" handling, as well as the "linklayer atm"
attribute.
tc class add ... htb rate X ceil Y linklayer atm overhead 10
This patch restores the "overhead xxx" handling, for htb, tbf
and act_police
The "linklayer atm" thing needs a separate fix.
Reported-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Vimalkumar <j.vimal@gmail.com>
Cc: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In some cases after deleting a policy from the SPD the policy would
remain in the dst/flow/route cache for an extended period of time
which caused problems for SELinux as its dynamic network access
controls key off of the number of XFRM policy and state entries.
This patch corrects this problem by forcing a XFRM garbage collection
whenever a policy is sucessfully removed.
Reported-by: Ondrej Moris <omoris@redhat.com>
Signed-off-by: Paul Moore <pmoore@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Quoting https://bugzilla.netfilter.org/show_bug.cgi?id=812:
[ ip6tables -m addrtype ]
When I tried to use in the nat/PREROUTING it messes up the
routing cache even if the rule didn't matched at all.
[..]
If I remove the --limit-iface-in from the non-working scenario, so just
use the -m addrtype --dst-type LOCAL it works!
This happens when LOCAL type matching is requested with --limit-iface-in,
and the default ipv6 route is via the interface the packet we test
arrived on.
Because xt_addrtype uses ip6_route_output, the ipv6 routing implementation
creates an unwanted cached entry, and the packet won't make it to the
real/expected destination.
Silently ignoring --limit-iface-in makes the routing work but it breaks
rule matching (--dst-type LOCAL with limit-iface-in is supposed to only
match if the dst address is configured on the incoming interface;
without --limit-iface-in it will match if the address is reachable
via lo).
The test should call ipv6_chk_addr() instead. However, this would add
a link-time dependency on ipv6.
There are two possible solutions:
1) Revert the commit that moved ipt_addrtype to xt_addrtype,
and put ipv6 specific code into ip6t_addrtype.
2) add new "nf_ipv6_ops" struct to register pointers to ipv6 functions.
While the former might seem preferable, Pablo pointed out that there
are more xt modules with link-time dependeny issues regarding ipv6,
so lets go for 2).
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem
|
|
Pablo Neira Ayuso says:
====================
The following patchset contains three Netfilter fixes and update
for the MAINTAINER file for your net tree, they are:
* Fix crash if nf_log_packet is called from conntrack, in that case
both interfaces are NULL, from Hans Schillstrom. This bug introduced
with the logging netns support in the previous merge window.
* Fix compilation of nf_log and nf_queue without CONFIG_PROC_FS,
from myself. This bug was introduced in the previous merge window
with the new netns support for the netfilter logging infrastructure.
* Fix possible crash in xt_TCPOPTSTRIP due to missing sanity
checkings to validate that the TCP header is well-formed, from
myself. I can find this bug in 2.6.25, probably it's been there
since the beginning. I'll pass this to -stable.
* Update MAINTAINER file to point to new nf trees at git.kernel.org,
remove Harald and use M: instead of P: (now obsolete tag) to
keep Jozsef in the list of people.
Please, consider pulling this. Thanks!
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Document rx vs tx status concurrency requirements.
Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Since (69b34fb netfilter: xt_LOG: add net namespace support
for xt_LOG), we hit this:
[ 4224.708977] BUG: unable to handle kernel NULL pointer dereference at 0000000000000388
[ 4224.709074] IP: [<ffffffff8147f699>] ipt_log_packet+0x29/0x270
when callling log functions from conntrack both in and out
are NULL i.e. the net pointer is invalid.
Adding struct net *net in call to nf_logfn() will secure that
there always is a vaild net ptr.
Reported as netfilter's bugzilla bug 818:
https://bugzilla.netfilter.org/show_bug.cgi?id=818
Reported-by: Ronald <ronald645@gmail.com>
Signed-off-by: Hans Schillstrom <hans@schillstrom.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
We have seen multiple NULL dereferences in __inet6_lookup_established()
After analysis, I found that inet6_sk() could be NULL while the
check for sk_family == AF_INET6 was true.
Bug was added in linux-2.6.29 when RCU lookups were introduced in UDP
and TCP stacks.
Once an IPv6 socket, using SLAB_DESTROY_BY_RCU is inserted in a hash
table, we no longer can clear pinet6 field.
This patch extends logic used in commit fcbdf09d9652c891
("net: fix nulls list corruptions in sk_prot_alloc")
TCP/UDP/UDPLite IPv6 protocols provide their own .clear_sk() method
to make sure we do not clear pinet6 field.
At socket clone phase, we do not really care, as cloning the parent (non
NULL) pinet6 is not adding a fatal race.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch fixes race between inet_frag_lru_move() and inet_frag_lru_add()
which was introduced in commit 3ef0eb0db4bf92c6d2510fe5c4dc51852746f206
("net: frag, move LRU list maintenance outside of rwlock")
One cpu already added new fragment queue into hash but not into LRU.
Other cpu found it in hash and tries to move it to the end of LRU.
This leads to NULL pointer dereference inside of list_move_tail().
Another possible race condition is between inet_frag_lru_move() and
inet_frag_lru_del(): move can happens after deletion.
This patch initializes LRU list head before adding fragment into hash and
inet_frag_lru_move() doesn't touches it if it's empty.
I saw this kernel oops two times in a couple of days.
[119482.128853] BUG: unable to handle kernel NULL pointer dereference at (null)
[119482.132693] IP: [<ffffffff812ede89>] __list_del_entry+0x29/0xd0
[119482.136456] PGD 2148f6067 PUD 215ab9067 PMD 0
[119482.140221] Oops: 0000 [#1] SMP
[119482.144008] Modules linked in: vfat msdos fat 8021q fuse nfsd auth_rpcgss nfs_acl nfs lockd sunrpc ppp_async ppp_generic bridge slhc stp llc w83627ehf hwmon_vid snd_hda_codec_hdmi snd_hda_codec_realtek kvm_amd k10temp kvm snd_hda_intel snd_hda_codec edac_core radeon snd_hwdep ath9k snd_pcm ath9k_common snd_page_alloc ath9k_hw snd_timer snd soundcore drm_kms_helper ath ttm r8169 mii
[119482.152692] CPU 3
[119482.152721] Pid: 20, comm: ksoftirqd/3 Not tainted 3.9.0-zurg-00001-g9f95269 #132 To Be Filled By O.E.M. To Be Filled By O.E.M./RS880D
[119482.161478] RIP: 0010:[<ffffffff812ede89>] [<ffffffff812ede89>] __list_del_entry+0x29/0xd0
[119482.166004] RSP: 0018:ffff880216d5db58 EFLAGS: 00010207
[119482.170568] RAX: 0000000000000000 RBX: ffff88020882b9c0 RCX: dead000000200200
[119482.175189] RDX: 0000000000000000 RSI: 0000000000000880 RDI: ffff88020882ba00
[119482.179860] RBP: ffff880216d5db58 R08: ffffffff8155c7f0 R09: 0000000000000014
[119482.184570] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88020882ba00
[119482.189337] R13: ffffffff81c8d780 R14: ffff880204357f00 R15: 00000000000005a0
[119482.194140] FS: 00007f58124dc700(0000) GS:ffff88021fcc0000(0000) knlGS:0000000000000000
[119482.198928] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[119482.203711] CR2: 0000000000000000 CR3: 00000002155f0000 CR4: 00000000000007e0
[119482.208533] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[119482.213371] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[119482.218221] Process ksoftirqd/3 (pid: 20, threadinfo ffff880216d5c000, task ffff880216d3a9a0)
[119482.223113] Stack:
[119482.228004] ffff880216d5dbd8 ffffffff8155dcda 0000000000000000 ffff000200000001
[119482.233038] ffff8802153c1f00 ffff880000289440 ffff880200000014 ffff88007bc72000
[119482.238083] 00000000000079d5 ffff88007bc72f44 ffffffff00000002 ffff880204357f00
[119482.243090] Call Trace:
[119482.248009] [<ffffffff8155dcda>] ip_defrag+0x8fa/0xd10
[119482.252921] [<ffffffff815a8013>] ipv4_conntrack_defrag+0x83/0xe0
[119482.257803] [<ffffffff8154485b>] nf_iterate+0x8b/0xa0
[119482.262658] [<ffffffff8155c7f0>] ? inet_del_offload+0x40/0x40
[119482.267527] [<ffffffff815448e4>] nf_hook_slow+0x74/0x130
[119482.272412] [<ffffffff8155c7f0>] ? inet_del_offload+0x40/0x40
[119482.277302] [<ffffffff8155d068>] ip_rcv+0x268/0x320
[119482.282147] [<ffffffff81519992>] __netif_receive_skb_core+0x612/0x7e0
[119482.286998] [<ffffffff81519b78>] __netif_receive_skb+0x18/0x60
[119482.291826] [<ffffffff8151a650>] process_backlog+0xa0/0x160
[119482.296648] [<ffffffff81519f29>] net_rx_action+0x139/0x220
[119482.301403] [<ffffffff81053707>] __do_softirq+0xe7/0x220
[119482.306103] [<ffffffff81053868>] run_ksoftirqd+0x28/0x40
[119482.310809] [<ffffffff81074f5f>] smpboot_thread_fn+0xff/0x1a0
[119482.315515] [<ffffffff81074e60>] ? lg_local_lock_cpu+0x40/0x40
[119482.320219] [<ffffffff8106d870>] kthread+0xc0/0xd0
[119482.324858] [<ffffffff8106d7b0>] ? insert_kthread_work+0x40/0x40
[119482.329460] [<ffffffff816c32dc>] ret_from_fork+0x7c/0xb0
[119482.334057] [<ffffffff8106d7b0>] ? insert_kthread_work+0x40/0x40
[119482.338661] Code: 00 00 55 48 8b 17 48 b9 00 01 10 00 00 00 ad de 48 8b 47 08 48 89 e5 48 39 ca 74 29 48 b9 00 02 20 00 00 00 ad de 48 39 c8 74 7a <4c> 8b 00 4c 39 c7 75 53 4c 8b 42 08 4c 39 c7 75 2b 48 89 42 08
[119482.343787] RIP [<ffffffff812ede89>] __list_del_entry+0x29/0xd0
[119482.348675] RSP <ffff880216d5db58>
[119482.353493] CR2: 0000000000000000
Oops happened on this path:
ip_defrag() -> ip_frag_queue() -> inet_frag_lru_move() -> list_move_tail() -> __list_del_entry()
Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Florian Westphal <fw@strlen.de>
Cc: Eric Dumazet <edumazet@google.com>
Cc: David S. Miller <davem@davemloft.net>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull VFS updates from Al Viro,
Misc cleanups all over the place, mainly wrt /proc interfaces (switch
create_proc_entry to proc_create(), get rid of the deprecated
create_proc_read_entry() in favor of using proc_create_data() and
seq_file etc).
7kloc removed.
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (204 commits)
don't bother with deferred freeing of fdtables
proc: Move non-public stuff from linux/proc_fs.h to fs/proc/internal.h
proc: Make the PROC_I() and PDE() macros internal to procfs
proc: Supply a function to remove a proc entry by PDE
take cgroup_open() and cpuset_open() to fs/proc/base.c
ppc: Clean up scanlog
ppc: Clean up rtas_flash driver somewhat
hostap: proc: Use remove_proc_subtree()
drm: proc: Use remove_proc_subtree()
drm: proc: Use minor->index to label things, not PDE->name
drm: Constify drm_proc_list[]
zoran: Don't print proc_dir_entry data in debug
reiserfs: Don't access the proc_dir_entry in r_open(), r_start() r_show()
proc: Supply an accessor for getting the data from a PDE's parent
airo: Use remove_proc_subtree()
rtl8192u: Don't need to save device proc dir PDE
rtl8187se: Use a dir under /proc/net/r8180/
proc: Add proc_mkdir_data()
proc: Move some bits from linux/proc_fs.h to linux/{of.h,signal.h,tty.h}
proc: Move PDE_NET() to fs/proc/proc_net.c
...
|
|
Pull networking updates from David Miller:
"Highlights (1721 non-merge commits, this has to be a record of some
sort):
1) Add 'random' mode to team driver, from Jiri Pirko and Eric
Dumazet.
2) Make it so that any driver that supports configuration of multiple
MAC addresses can provide the forwarding database add and del
calls by providing a default implementation and hooking that up if
the driver doesn't have an explicit set of handlers. From Vlad
Yasevich.
3) Support GSO segmentation over tunnels and other encapsulating
devices such as VXLAN, from Pravin B Shelar.
4) Support L2 GRE tunnels in the flow dissector, from Michael Dalton.
5) Implement Tail Loss Probe (TLP) detection in TCP, from Nandita
Dukkipati.
6) In the PHY layer, allow supporting wake-on-lan in situations where
the PHY registers have to be written for it to be configured.
Use it to support wake-on-lan in mv643xx_eth.
From Michael Stapelberg.
7) Significantly improve firewire IPV6 support, from YOSHIFUJI
Hideaki.
8) Allow multiple packets to be sent in a single transmission using
network coding in batman-adv, from Martin Hundebøll.
9) Add support for T5 cxgb4 chips, from Santosh Rastapur.
10) Generalize the VXLAN forwarding tables so that there is more
flexibility in configurating various aspects of the endpoints.
From David Stevens.
11) Support RSS and TSO in hardware over GRE tunnels in bxn2x driver,
from Dmitry Kravkov.
12) Zero copy support in nfnelink_queue, from Eric Dumazet and Pablo
Neira Ayuso.
13) Start adding networking selftests.
14) In situations of overload on the same AF_PACKET fanout socket, or
per-cpu packet receive queue, minimize drop by distributing the
load to other cpus/fanouts. From Willem de Bruijn and Eric
Dumazet.
15) Add support for new payload offset BPF instruction, from Daniel
Borkmann.
16) Convert several drivers over to mdoule_platform_driver(), from
Sachin Kamat.
17) Provide a minimal BPF JIT image disassembler userspace tool, from
Daniel Borkmann.
18) Rewrite F-RTO implementation in TCP to match the final
specification of it in RFC4138 and RFC5682. From Yuchung Cheng.
19) Provide netlink socket diag of netlink sockets ("Yo dawg, I hear
you like netlink, so I implemented netlink dumping of netlink
sockets.") From Andrey Vagin.
20) Remove ugly passing of rtnetlink attributes into rtnl_doit
functions, from Thomas Graf.
21) Allow userspace to be able to see if a configuration change occurs
in the middle of an address or device list dump, from Nicolas
Dichtel.
22) Support RFC3168 ECN protection for ipv6 fragments, from Hannes
Frederic Sowa.
23) Increase accuracy of packet length used by packet scheduler, from
Jason Wang.
24) Beginning set of changes to make ipv4/ipv6 fragment handling more
scalable and less susceptible to overload and locking contention,
from Jesper Dangaard Brouer.
25) Get rid of using non-type-safe NLMSG_* macros and use nlmsg_*()
instead. From Hong Zhiguo.
26) Optimize route usage in IPVS by avoiding reference counting where
possible, from Julian Anastasov.
27) Convert IPVS schedulers to RCU, also from Julian Anastasov.
28) Support cpu fanouts in xt_NFQUEUE netfilter target, from Holger
Eitzenberger.
29) Network namespace support for nf_log, ebt_log, xt_LOG, ipt_ULOG,
nfnetlink_log, and nfnetlink_queue. From Gao feng.
30) Implement RFC3168 ECN protection, from Hannes Frederic Sowa.
31) Support several new r8169 chips, from Hayes Wang.
32) Support tokenized interface identifiers in ipv6, from Daniel
Borkmann.
33) Use usbnet_link_change() helper in USB net driver, from Ming Lei.
34) Add 802.1ad vlan offload support, from Patrick McHardy.
35) Support mmap() based netlink communication, also from Patrick
McHardy.
36) Support HW timestamping in mlx4 driver, from Amir Vadai.
37) Rationalize AF_PACKET packet timestamping when transmitting, from
Willem de Bruijn and Daniel Borkmann.
38) Bring parity to what's provided by /proc/net/packet socket dumping
and the info provided by netlink socket dumping of AF_PACKET
sockets. From Nicolas Dichtel.
39) Fix peeking beyond zero sized SKBs in AF_UNIX, from Benjamin
Poirier"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1722 commits)
filter: fix va_list build error
af_unix: fix a fatal race with bit fields
bnx2x: Prevent memory leak when cnic is absent
bnx2x: correct reading of speed capabilities
net: sctp: attribute printl with __printf for gcc fmt checks
netlink: kconfig: move mmap i/o into netlink kconfig
netpoll: convert mutex into a semaphore
netlink: Fix skb ref counting.
net_sched: act_ipt forward compat with xtables
mlx4_en: fix a build error on 32bit arches
Revert "bnx2x: allow nvram test to run when device is down"
bridge: avoid OOPS if root port not found
drivers: net: cpsw: fix kernel warn on cpsw irq enable
sh_eth: use random MAC address if no valid one supplied
3c509.c: call SET_NETDEV_DEV for all device types (ISA/ISAPnP/EISA)
tg3: fix to append hardware time stamping flags
unix/stream: fix peeking with an offset larger than data in queue
unix/dgram: fix peeking with an offset larger than data in queue
unix/dgram: peek beyond 0-sized skbs
openvswitch: Remove unneeded ovs_netdev_get_ifindex()
...
|
|
Using bit fields is dangerous on ppc64/sparc64, as the compiler [1]
uses 64bit instructions to manipulate them.
If the 64bit word includes any atomic_t or spinlock_t, we can lose
critical concurrent changes.
This is happening in af_unix, where unix_sk(sk)->gc_candidate/
gc_maybe_cycle/lock share the same 64bit word.
This leads to fatal deadlock, as one/several cpus spin forever
on a spinlock that will never be available again.
A safer way would be to use a long to store flags.
This way we are sure compiler/arch wont do bad things.
As we own unix_gc_lock spinlock when clearing or setting bits,
we can use the non atomic __set_bit()/__clear_bit().
recursion_level can share the same 64bit location with the spinlock,
as it is set only with this spinlock held.
[1] bug fixed in gcc-4.8.0 :
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52080
Reported-by: Ambrose Feinstein <ambrose@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
Pull trivial tree updates from Jiri Kosina:
"Usual stuff, mostly comment fixes, typo fixes, printk fixes and small
code cleanups"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (45 commits)
mm: Convert print_symbol to %pSR
gfs2: Convert print_symbol to %pSR
m32r: Convert print_symbol to %pSR
iostats.txt: add easy-to-find description for field 6
x86 cmpxchg.h: fix wrong comment
treewide: Fix typo in printk and comments
doc: devicetree: Fix various typos
docbook: fix 8250 naming in device-drivers
pata_pdc2027x: Fix compiler warning
treewide: Fix typo in printks
mei: Fix comments in drivers/misc/mei
treewide: Fix typos in kernel messages
pm44xx: Fix comment for "CONFIG_CPU_IDLE"
doc: Fix typo "CONFIG_CGROUP_CGROUP_MEMCG_SWAP"
mmzone: correct "pags" to "pages" in comment.
kernel-parameters: remove outdated 'noresidual' parameter
Remove spurious _H suffixes from ifdef comments
sound: Remove stray pluses from Kconfig file
radio-shark: Fix printk "CONFIG_LED_CLASS"
doc: put proper reference to CONFIG_MODULE_SIG_ENFORCE
...
|
|
Don't use create_proc_read_entry() as that is deprecated, but rather use
proc_create_data() and seq_file instead.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
cc: Jouni Malinen <j@w1.fi>
cc: John W. Linville <linville@tuxdriver.com>
cc: Johannes Berg <johannes@sipsolutions.net>
cc: linux-wireless@vger.kernel.org
cc: netdev@vger.kernel.org
cc: devel@driverdev.osuosl.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem
|
|
Instead of feeding net_secret[] at boot time, defer the init
at the point first socket is created.
This permits some platforms to use better entropy sources than
the ones available at boot time.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
Pablo Neira Ayuso says:
====================
The following patchset contains relevant updates for the Netfilter
tree, they are:
* Enhancements for ipset: Add the counter extension for sets, this
information can be used from the iptables set match, to change
the matching behaviour. Jozsef required to add the extension
infrastructure and moved the existing timeout support upon it.
This also includes a change in net/sched/em_ipset to adapt it to
the new extension structure.
* Enhancements for performance boosting in nfnetlink_queue: Add new
configuration flags that allows user-space to receive big packets (GRO)
and to disable checksumming calculation. This were proposed by Eric
Dumazet during the Netfilter Workshop 2013 in Copenhagen. Florian
Westphal was kind enough to find the time to materialize the proposal.
* A sparse fix from Simon, he noticed it in the SCTP NAT helper, the fix
required a change in the interface of sctp_end_cksum.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Change the type of the crc32 parameter of sctp_end_cksum()
from __be32 to __u32 to reflect that fact that it is passed
to cpu_to_le32().
There are five in-tree users of sctp_end_cksum().
The following four had warnings flagged by sparse which are
no longer present with this change.
net/netfilter/ipvs/ip_vs_proto_sctp.c:sctp_nat_csum()
net/netfilter/ipvs/ip_vs_proto_sctp.c:sctp_csum_check()
net/sctp/input.c:sctp_rcv_checksum()
net/sctp/output.c:sctp_packet_transmit()
The fifth user is net/netfilter/nf_nat_proto_sctp.c:sctp_manip_pkt().
It has been updated to pass a __u32 instead of a __be32,
the value in question was already calculated in cpu byte-order.
net/netfilter/nf_nat_proto_sctp.c:sctp_manip_pkt() has also
been updated to assign the return value of sctp_end_cksum()
directly to a variable of type __le32, matching the
type of the return value. Previously the return value
was assigned to a variable of type __be32 and then that variable
was finally assigned to another variable of type __le32.
Problems flagged by sparse.
Compile and sparse tested only.
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
skb_gso_segment is expensive, so it would be nice if we could
avoid it in the future. However, userspace needs to be prepared
to receive larger-than-mtu-packets (which will also have incorrect
l3/l4 checksums), so we cannot simply remove it.
The plan is to add a per-queue feature flag that userspace can
set when binding the queue.
The problem is that in nf_queue, we only have a queue number,
not the queue context/configuration settings.
This patch should have no impact other than the skb_gso_segment
call now being in a function that has access to the queue config
data.
A new size attribute in nf_queue_entry is needed so
nfnetlink_queue can duplicate the entry of the gso skb
when segmenting the skb while also copying the route key.
The follow up patch adds switch to disable skb_gso_segment when
queue config says so.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Increase fragmentation hash bucket size to 1024 from old 64 elems.
After we increased the frag mem limits commit c2a93660 (net: increase
fragment memory usage limits) the hash size of 64 elements is simply
too small. Also considering the mem limit is per netns and the hash
table is shared for all netns.
For the embedded people, note that this increase will change the hash
table/array from using approx 1 Kbytes to 16 Kbytes.
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next
|
|
All genl callbacks are serialized by genl-mutex. This can become
bottleneck in multi threaded case.
Following patch adds an parameter to genl_family so that a
particular family can get concurrent netlink callback without
genl_lock held.
New rw-sem is used to protect genl callback from genl family unregister.
in case of parallel_ops genl-family read-lock is taken for callbacks and
write lock is taken for register or unregistration for any family.
In case of locked genl family semaphore and gel-mutex is locked for
any openration.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
Pablo Neira Ayuso says:
====================
The following patchset contains fixes for recently applied
Netfilter/IPVS updates to the net-next tree, most relevantly
they are:
* Fix sparse warnings introduced in the RCU conversion, from
Julian Anastasov.
* Fix wrong endianness in the size field of IPVS sync messages,
from Simon Horman.
* Fix missing if checking in nf_xfrm_me_harder, from Dan Carpenter.
* Fix off by one access in the IPVS SCTP tracking code, again from
Dan Carpenter.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem
|
|
Remove my soon bouncing email address.
Also remove the "Contact:" line in file header.
The MAINTAINERS file is a better place to find the
contact person anyway.
Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Some service fields are in network order:
- netmask: used once in network order and also as prefix len for IPv6
- port
Other parameters are in host order:
- struct ip_vs_flags: flags and mask moved between user and kernel only
- sync state: moved between user and kernel only
- syncid: sent over network as single octet
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
Conflicts:
drivers/net/ethernet/emulex/benet/be_main.c
drivers/net/ethernet/intel/igb/igb_main.c
drivers/net/wireless/brcm80211/brcmsmac/mac80211_if.c
include/net/scm.h
net/batman-adv/routing.c
net/ipv4/tcp_input.c
The e{uid,gid} --> {uid,gid} credentials fix conflicted with the
cleanup in net-next to now pass cred structs around.
The be2net driver had a bug fix in 'net' that overlapped with the VLAN
interface changes by Patrick McHardy in net-next.
An IGB conflict existed because in 'net' the build_skb() support was
reverted, and in 'net-next' there was a comment style fix within that
code.
Several batman-adv conflicts were resolved by making sure that all
calls to batadv_is_my_mac() are changed to have a new bat_priv first
argument.
Eric Dumazet's TS ECR fix in TCP in 'net' conflicted with the F-RTO
rewrite in 'net-next', mostly overlapping changes.
Thanks to Stephen Rothwell and Antonio Quartulli for help with several
of these merge resolutions.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
struct sctp_packet is currently embedded into sctp_transport or
sits on the stack as 'singleton' in sctp_outq_flush(). Therefore,
its member 'malloced' is always 0, thus a kfree() is never called.
Because of that, we can just remove this code.
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
dl_next member in struct request_sock doesn't need to be first.
We expect to insert a "struct common_sock" or a subset of it,
so this claim had to be verified.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next
|
|
Allow rate control modules to pass a rate selection table to mac80211
and the driver. This allows drivers to fetch the most recent rate
selection from the sta pointer for already buffered frames. This allows
rate control to respond faster to sudden link changes and it is also a
step towards adding minstrel_ht support to drivers like iwlwifi.
When a driver sets IEEE80211_HW_SUPPORTS_RC_TABLE, mac80211 will not
fill info->control.rates with rates from the rate table (to preserve
explicit overrides by the rate control module). The driver then
explicitly calls ieee80211_get_tx_rates to merge overrides from
info->control.rates with defaults from the sta rate table.
Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Some protocols need a more reliable connection to complete
successful in reasonable time. This patch adds a user-space
API to indicate the wireless driver that a critical protocol
is about to commence and when it is done, using nl80211 primitives
NL80211_CMD_CRIT_PROTOCOL_START and NL80211_CRIT_PROTOCOL_STOP.
There can be only on critical protocol session started per
registered cfg80211 device.
The driver can support this by implementing the cfg80211 callbacks
.crit_proto_start() and .crit_proto_stop(). Examples of protocols
that can benefit from this are DHCP, EAPOL, APIPA. Exactly how the
link can/should be made more reliable is up to the driver. Things
to consider are avoid scanning, no multi-channel operations, and
alter coexistence schemes.
Reviewed-by: Pieter-Paul Giesberts <pieterpg@broadcom.com>
Reviewed-by: Franky (Zhenhui) Lin <frankyl@broadcom.com>
Signed-off-by: Arend van Spriel <arend@broadcom.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Some driver implementations need to know whether mandatory
admission control is required by the AP for some ACs. Add
a parameter to the TX queue parameters indicating this.
As there's currently no support for admission control in
mac80211's AP implementation, it's only ever set for the
client implementation.
Signed-off-by: Alexander Bondar <alexander.bondar@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
|
|
Commit 257b5358b32f ("scm: Capture the full credentials of the scm
sender") changed the credentials passing code to pass in the effective
uid/gid instead of the real uid/gid.
Obviously this doesn't matter most of the time (since normally they are
the same), but it results in differences for suid binaries when the wrong
uid/gid ends up being used.
This just undoes that (presumably unintentional) part of the commit.
Reported-by: Andy Lutomirski <luto@amacapital.net>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Serge E. Hallyn <serge@hallyn.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The "reason" can come from skb->data[] and it hasn't been capped so it
can be from 0-255 instead of j |