aboutsummaryrefslogtreecommitdiff
path: root/net/netfilter
AgeCommit message (Collapse)Author
2013-05-11ipvs: ip_vs_sip_fill_param() BUG: bad check of return valueHans Schillstrom
commit f7a1dd6e3ad59f0cfd51da29dfdbfd54122c5916 upstream. The reason for this patch is crash in kmemdup caused by returning from get_callid with uniialized matchoff and matchlen. Removing Zero check of matchlen since it's done by ct_sip_get_header() BUG: unable to handle kernel paging request at ffff880457b5763f IP: [<ffffffff810df7fc>] kmemdup+0x2e/0x35 PGD 27f6067 PUD 0 Oops: 0000 [#1] PREEMPT SMP Modules linked in: xt_state xt_helper nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_mangle xt_connmark xt_conntrack ip6_tables nf_conntrack_ftp ip_vs_ftp nf_nat xt_tcpudp iptable_mangle xt_mark ip_tables x_tables ip_vs_rr ip_vs_lblcr ip_vs_pe_sip ip_vs nf_conntrack_sip nf_conntrack bonding igb i2c_algo_bit i2c_core CPU 5 Pid: 0, comm: swapper/5 Not tainted 3.9.0-rc5+ #5 /S1200KP RIP: 0010:[<ffffffff810df7fc>] [<ffffffff810df7fc>] kmemdup+0x2e/0x35 RSP: 0018:ffff8803fea03648 EFLAGS: 00010282 RAX: ffff8803d61063e0 RBX: 0000000000000003 RCX: 0000000000000003 RDX: 0000000000000003 RSI: ffff880457b5763f RDI: ffff8803d61063e0 RBP: ffff8803fea03658 R08: 0000000000000008 R09: 0000000000000011 R10: 0000000000000011 R11: 00ffffffff81a8a3 R12: ffff880457b5763f R13: ffff8803d67f786a R14: ffff8803fea03730 R15: ffffffffa0098e90 FS: 0000000000000000(0000) GS:ffff8803fea00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffff880457b5763f CR3: 0000000001a0c000 CR4: 00000000001407e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process swapper/5 (pid: 0, threadinfo ffff8803ee18c000, task ffff8803ee18a480) Stack: ffff8803d822a080 000000000000001c ffff8803fea036c8 ffffffffa000937a ffffffff81f0d8a0 000000038135fdd5 ffff880300000014 ffff880300110000 ffffffff150118ac ffff8803d7e8a000 ffff88031e0118ac 0000000000000000 Call Trace: <IRQ> [<ffffffffa000937a>] ip_vs_sip_fill_param+0x13a/0x187 [ip_vs_pe_sip] [<ffffffffa007b209>] ip_vs_sched_persist+0x2c6/0x9c3 [ip_vs] [<ffffffff8107dc53>] ? __lock_acquire+0x677/0x1697 [<ffffffff8100972e>] ? native_sched_clock+0x3c/0x7d [<ffffffff8100972e>] ? native_sched_clock+0x3c/0x7d [<ffffffff810649bc>] ? sched_clock_cpu+0x43/0xcf [<ffffffffa007bb1e>] ip_vs_schedule+0x181/0x4ba [ip_vs] ... Signed-off-by: Hans Schillstrom <hans@schillstrom.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: David S. Miller <davem@davemloft.net> Cc: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-11-26netfilter: Mark SYN/ACK packets as invalid from original directionJozsef Kadlecsik
commit 64f509ce71b08d037998e93dd51180c19b2f464c upstream. Clients should not send such packets. By accepting them, we open up a hole by wich ephemeral ports can be discovered in an off-path attack. See: "Reflection scan: an Off-Path Attack on TCP" by Jan Wrobel, http://arxiv.org/abs/1201.2074 Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-11-26netfilter: Validate the sequence number of dataless ACK packets as wellJozsef Kadlecsik
commit 4a70bbfaef0361d27272629d1a250a937edcafe4 upstream. We spare nothing by not validating the sequence number of dataless ACK packets and enabling it makes harder off-path attacks. See: "Reflection scan: an Off-Path Attack on TCP" by Jan Wrobel, http://arxiv.org/abs/1201.2074 Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-10-21netfilter: xt_limit: have r->cost != 0 case workJan Engelhardt
commit 82e6bfe2fbc4d48852114c4f979137cd5bf1d1a8 upstream. Commit v2.6.19-rc1~1272^2~41 tells us that r->cost != 0 can happen when a running state is saved to userspace and then reinstated from there. Make sure that private xt_limit area is initialized with correct values. Otherwise, random matchings due to use of uninitialized memory. Signed-off-by: Jan Engelhardt <jengelh@inai.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: David Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-10-21netfilter: limit, hashlimit: avoid duplicated inlineFlorian Westphal
commit 7a909ac70f6b0823d9f23a43f19598d4b57ac901 upstream. credit_cap can be set to credit, which avoids inlining user2credits twice. Also, remove inline keyword and let compiler decide. old: 684 192 0 876 36c net/netfilter/xt_limit.o 4927 344 32 5303 14b7 net/netfilter/xt_hashlimit.o now: 668 192 0 860 35c net/netfilter/xt_limit.o 4793 344 32 5169 1431 net/netfilter/xt_hashlimit.o Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: David Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-10-21netfilter: nf_ct_expect: fix possible access to uninitialized timerPablo Neira Ayuso
commit 2614f86490122bf51eb7c12ec73927f1900f4e7d upstream. In __nf_ct_expect_check, the function refresh_timer returns 1 if a matching expectation is found and its timer is successfully refreshed. This results in nf_ct_expect_related returning 0. Note that at this point: - the passed expectation is not inserted in the expectation table and its timer was not initialized, since we have refreshed one matching/existing expectation. - nf_ct_expect_alloc uses kmem_cache_alloc, so the expectation timer is in some undefined state just after the allocation, until it is appropriately initialized. This can be a problem for the SIP helper during the expectation addition: ... if (nf_ct_expect_related(rtp_exp) == 0) { if (nf_ct_expect_related(rtcp_exp) != 0) nf_ct_unexpect_related(rtp_exp); ... Note that nf_ct_expect_related(rtp_exp) may return 0 for the timer refresh case that is detailed above. Then, if nf_ct_unexpect_related(rtcp_exp) returns != 0, nf_ct_unexpect_related(rtp_exp) is called, which does: spin_lock_bh(&nf_conntrack_lock); if (del_timer(&exp->timeout)) { nf_ct_unlink_expect(exp); nf_ct_expect_put(exp); } spin_unlock_bh(&nf_conntrack_lock); Note that del_timer always returns false if the timer has been initialized. However, the timer was not initialized since setup_timer was not called, therefore, the expectation timer remains in some undefined state. If I'm not missing anything, this may lead to the removal an unexistent expectation. To fix this, the optimization that allows refreshing an expectation is removed. Now nf_conntrack_expect_related looks more consistent to me since it always add the expectation in case that it returns success. Thanks to Patrick McHardy for participating in the discussion of this patch. I think this may be the source of the problem described by: http://marc.info/?l=netfilter-devel&m=134073514719421&w=2 Reported-by: Rafal Fitt <rafalf@aplusc.com.pl> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: David Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-10-21netfilter: ipset: timeout fixing bug broke SET target special timeout valueJozsef Kadlecsik
commit a73f89a61f92b364f0b4a3be412b5b70553afc23 upstream. The patch "127f559 netfilter: ipset: fix timeout value overflow bug" broke the SET target when no timeout was specified. Reported-by: Jean-Philippe Menil <jean-philippe.menil@univ-nantes.fr> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: David Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-10-21netfilter: ipset: fix timeout value overflow bugJozsef Kadlecsik
commit 127f559127f5175e4bec3dab725a34845d956591 upstream. Large timeout parameters could result wrong timeout values due to an overflow at msec to jiffies conversion (reported by Andreas Herz) [ This patch was mangled by Pablo Neira Ayuso since David Laight and Eric Dumazet noticed that we were using hardcoded 1000 instead of MSEC_PER_SEC to calculate the timeout ] Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: David Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-10-21netfilter: nf_conntrack: fix racy timer handling with reliable eventsPablo Neira Ayuso
commit 5b423f6a40a0327f9d40bc8b97ce9be266f74368 upstream. Existing code assumes that del_timer returns true for alive conntrack entries. However, this is not true if reliable events are enabled. In that case, del_timer may return true for entries that were just inserted in the dying list. Note that packets / ctnetlink may hold references to conntrack entries that were just inserted to such list. This patch fixes the issue by adding an independent timer for event delivery. This increases the size of the ecache extension. Still we can revisit this later and use variable size extensions to allocate this area on demand. Tested-by: Oliver Smith <olipro@8.c.9.b.0.7.4.0.1.0.0.2.ip6.arpa> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: David Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-10-21ipvs: fix oops in ip_vs_dst_event on rmmodJulian Anastasov
commit 283283c4da91adc44b03519f434ee1e7e91d6fdb upstream. After commit 39f618b4fd95ae243d940ec64c961009c74e3333 (3.4) "ipvs: reset ipvs pointer in netns" we can oops in ip_vs_dst_event on rmmod ip_vs because ip_vs_control_cleanup is called after the ipvs_core_ops subsys is unregistered and net->ipvs is NULL. Fix it by exiting early from ip_vs_dst_event if ipvs is NULL. It is safe because all services and dests for the net are already freed. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: David Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-10-02ipvs: fix info leak in getsockopt(IP_VS_SO_GET_TIMEOUT)Mathias Krause
[ Upstream commit 2d8a041b7bfe1097af21441cb77d6af95f4f4680 ] If at least one of CONFIG_IP_VS_PROTO_TCP or CONFIG_IP_VS_PROTO_UDP is not set, __ip_vs_get_timeouts() does not fully initialize the structure that gets copied to userland and that for leaks up to 12 bytes of kernel stack. Add an explicit memset(0) before passing the structure to __ip_vs_get_timeouts() to avoid the info leak. Signed-off-by: Mathias Krause <minipli@googlemail.com> Cc: Wensong Zhang <wensong@linux-vs.org> Cc: Simon Horman <horms@verge.net.au> Cc: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-05-16netfilter: ipset: fix hash size checking in kernelJozsef Kadlecsik
The hash size must fit both into u32 (jhash) and the max value of size_t. The missing checking could lead to kernel crash, bug reported by Seblu. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-30netfilter: xt_CT: fix wrong checking in the timeout assignment pathPablo Neira Ayuso
The current checking always succeeded. We have to check the first character of the string to check that it's empty, thus, skipping the timeout path. This fixes the use of the CT target without the timeout option. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-04-30ipvs: kernel oops - do_ip_vs_get_ctlHans Schillstrom
Change order of init so netns init is ready when register ioctl and netlink. Ver2 Whitespace fixes and __init added. Reported-by: "Ryan O'Hara" <rohara@redhat.com> Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2012-04-30ipvs: take care of return value from protocol init_netnsHans Schillstrom
ip_vs_create_timeout_table() can return NULL All functions protocol init_netns is affected of this patch. Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>
2012-04-30ipvs: null check of net->ipvs in lblc(r) shedulersHans Schillstrom
Avoid crash when registering shedulers after the IPVS core initialization for netns fails. Do this by checking for present core (net->ipvs). Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>
2012-04-26ipvs: reset ipvs pointer in netnsJulian Anastasov
Make sure net->ipvs is reset on netns cleanup or failed initialization. It is needed for IPVS applications to know that IPVS core is not loaded in netns. Signed-off-by: Julian Anastasov <ja@ssi.bg> Acked-by: Hans Schillstrom <hans.schillstrom@ericsson.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2012-04-26ipvs: add check in ftp for initialized coreJulian Anastasov
Avoid crash when registering ip_vs_ftp after the IPVS core initialization for netns fails. Do this by checking for present core (net->ipvs). Signed-off-by: Julian Anastasov <ja@ssi.bg> Acked-by: Hans Schillstrom <hans.schillstrom@ericsson.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2012-04-25ipvs: fix crash in ip_vs_control_net_cleanup on unloadJulian Anastasov
commit 14e405461e664b777e2a5636e10b2ebf36a686ec (2.6.39) ("Add __ip_vs_control_{init,cleanup}_sysctl()") introduced regression due to wrong __net_init for __ip_vs_control_cleanup_sysctl. This leads to crash when the ip_vs module is unloaded. Fix it by changing __net_init to __net_exit for the function that is already renamed to ip_vs_control_net_cleanup_sysctl. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Hans Schillstrom <hans@schillstrom.com> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-04-25ipvs: Verify that IP_VS protocol has been registeredSasha Levin
The registration of a protocol might fail, there were no checks and all registrations were assumed to be correct. This lead to NULL ptr dereferences when apps tried registering. For example: [ 1293.226051] BUG: unable to handle kernel NULL pointer dereference at 0000000000000018 [ 1293.227038] IP: [<ffffffff822aacb0>] tcp_register_app+0x60/0xb0 [ 1293.227038] PGD 391de067 PUD 6c20b067 PMD 0 [ 1293.227038] Oops: 0000 [#1] PREEMPT SMP [ 1293.227038] CPU 1 [ 1293.227038] Pid: 19609, comm: trinity Tainted: G W 3.4.0-rc1-next-20120405-sasha-dirty #57 [ 1293.227038] RIP: 0010:[<ffffffff822aacb0>] [<ffffffff822aacb0>] tcp_register_app+0x60/0xb0 [ 1293.227038] RSP: 0018:ffff880038c1dd18 EFLAGS: 00010286 [ 1293.227038] RAX: ffffffffffffffc0 RBX: 0000000000001500 RCX: 0000000000010000 [ 1293.227038] RDX: 0000000000000000 RSI: ffff88003a2d5888 RDI: 0000000000000282 [ 1293.227038] RBP: ffff880038c1dd48 R08: 0000000000000000 R09: 0000000000000000 [ 1293.227038] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88003a2d5668 [ 1293.227038] R13: ffff88003a2d5988 R14: ffff8800696a8ff8 R15: 0000000000000000 [ 1293.227038] FS: 00007f01930d9700(0000) GS:ffff88007ce00000(0000) knlGS:0000000000000000 [ 1293.227038] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 1293.227038] CR2: 0000000000000018 CR3: 0000000065dfc000 CR4: 00000000000406e0 [ 1293.227038] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 1293.227038] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 1293.227038] Process trinity (pid: 19609, threadinfo ffff880038c1c000, task ffff88002dc73000) [ 1293.227038] Stack: [ 1293.227038] ffff880038c1dd48 00000000fffffff4 ffff8800696aada0 ffff8800694f5580 [ 1293.227038] ffffffff8369f1e0 0000000000001500 ffff880038c1dd98 ffffffff822a716b [ 1293.227038] 0000000000000000 ffff8800696a8ff8 0000000000000015 ffff8800694f5580 [ 1293.227038] Call Trace: [ 1293.227038] [<ffffffff822a716b>] ip_vs_app_inc_new+0xdb/0x180 [ 1293.227038] [<ffffffff822a7258>] register_ip_vs_app_inc+0x48/0x70 [ 1293.227038] [<ffffffff822b2fea>] __ip_vs_ftp_init+0xba/0x140 [ 1293.227038] [<ffffffff821c9060>] ops_init+0x80/0x90 [ 1293.227038] [<ffffffff821c90cb>] setup_net+0x5b/0xe0 [ 1293.227038] [<ffffffff821c9416>] copy_net_ns+0x76/0x100 [ 1293.227038] [<ffffffff810dc92b>] create_new_namespaces+0xfb/0x190 [ 1293.227038] [<ffffffff810dca21>] unshare_nsproxy_namespaces+0x61/0x80 [ 1293.227038] [<ffffffff810afd1f>] sys_unshare+0xff/0x290 [ 1293.227038] [<ffffffff8187622e>] ? trace_hardirqs_on_thunk+0x3a/0x3f [ 1293.227038] [<ffffffff82665539>] system_call_fastpath+0x16/0x1b [ 1293.227038] Code: 89 c7 e8 34 91 3b 00 89 de 66 c1 ee 04 31 de 83 e6 0f 48 83 c6 22 48 c1 e6 04 4a 8b 14 26 49 8d 34 34 48 8d 42 c0 48 39 d6 74 13 <66> 39 58 58 74 22 48 8b 48 40 48 8d 41 c0 48 39 ce 75 ed 49 8d [ 1293.227038] RIP [<ffffffff822aacb0>] tcp_register_app+0x60/0xb0 [ 1293.227038] RSP <ffff880038c1dd18> [ 1293.227038] CR2: 0000000000000018 [ 1293.379284] ---[ end trace 364ab40c7011a009 ]--- [ 1293.381182] Kernel panic - not syncing: Fatal exception in interrupt Signed-off-by: Sasha Levin <levinsasha928@gmail.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-04-10netfilter: nf_conntrack: fix incorrect logic in nf_conntrack_init_netGao feng
in function nf_conntrack_init_net,when nf_conntrack_timeout_init falied, we should call nf_conntrack_ecache_fini to do rollback. but the current code calls nf_conntrack_timeout_fini. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-04-09netfilter: nf_ct_tcp: don't scale the size of the window up twiceChangli Gao
For a picked up connection, the window win is scaled twice: one is by the initialization code, and the other is by the sender updating code. I use the temporary variable swin instead of modifying the variable win. Signed-off-by: Changli Gao <xiaosuo@gmail.com> Acked-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-04-03netfilter: nf_conntrack: fix count leak in error path of __nf_conntrack_allocPablo Neira Ayuso
We have to decrement the conntrack counter if we fail to access the zone extension. Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-03netfilter: xt_CT: fix missing put timeout object in error pathPablo Neira Ayuso
The error path misses putting the timeout object. This patch adds new function xt_ct_tg_timeout_put() to put the timeout object. Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-03netfilter: xt_CT: allocation has to be GFP_ATOMIC under rcu_read_lock sectionPablo Neira Ayuso
Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-03Merge branch 'master' of git://1984.lsi.us.es/netDavid S. Miller
2012-04-03netfilter: xt_CT: remove a compile warningPablo Neira Ayuso
If CONFIG_NF_CONNTRACK_TIMEOUT=n we have following warning : CC [M] net/netfilter/xt_CT.o net/netfilter/xt_CT.c: In function ‘xt_ct_tg_check_v1’: net/netfilter/xt_CT.c:284: warning: label ‘err4’ defined but not used Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-04-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: 1) Provide device string properly for USB i2400m wimax devices, also don't OOPS when providing firmware string. From Phil Sutter. 2) Add support for sh_eth SH7734 chips, from Nobuhiro Iwamatsu. 3) Add another device ID to USB zaurus driver, from Guan Xin. 4) Loop index start in pool vector iterator is wrong causing MAC to not get configured in bnx2x driver, fix from Dmitry Kravkov. 5) EQL driver assumes HZ=100, fix from Eric Dumazet. 6) Now that skb_add_rx_frag() can specify the truesize increment separately, do so in f_phonet and cdc_phonet, also from Eric Dumazet. 7) virtio_net accidently uses net_ratelimit() not only on the kernel warning but also the statistic bump, fix from Rick Jones. 8) ip_route_input_mc() uses fixed init_net namespace, oops, use dev_net(dev) instead. Fix from Benjamin LaHaise. 9) dev_forward_skb() needs to clear the incoming interface index of the SKB so that it looks like a new incoming packet, also from Benjamin LaHaise. 10) iwlwifi mistakenly initializes a channel entry as 2GHZ instead of 5GHZ, fix from Stanislav Yakovlev. 11) Missing kmalloc() return value checks in orinoco, from Santosh Nayak. 12) ath9k doesn't check for HT capabilities in the right way, it is checking ht_supported instead of the ATH9K_HW_CAP_HT flag. Fix from Sujith Manoharan. 13) Fix x86 BPF JIT emission of 16-bit immediate field of AND instructions, from Feiran Zhuang. 14) Avoid infinite loop in GARP code when registering sysfs entries. From David Ward. 15) rose protocol uses memcpy instead of memcmp in a device address comparison, oops. Fix from Daniel Borkmann. 16) Fix build of lpc_eth due to dev_hw_addr_rancom() interface being renamed to eth_hw_addr_random(). From Roland Stigge. 17) Make ipv6 RTM_GETROUTE interpret RTA_IIF attribute the same way that ipv4 does. Fix from Shmulik Ladkani. 18) via-rhine has an inverted bit test, causing suspend/resume regressions. Fix from Andreas Mohr. 19) RIONET assumes 4K page size, fix from Akinobu Mita. 20) Initialization of imask register in sky2 is buggy, because bits are "or'd" into an uninitialized local variable. Fix from Lino Sanfilippo. 21) Fix FCOE checksum offload handling, from Yi Zou. 22) Fix VLAN processing regression in e1000, from Jiri Pirko. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (52 commits) sky2: dont overwrite settings for PHY Quick link tg3: Fix 5717 serdes powerdown problem net: usb: cdc_eem: fix mtu net: sh_eth: fix endian check for architecture independent usb/rtl8150 : Remove duplicated definitions rionet: fix page allocation order of rionet_active via-rhine: fix wait-bit inversion. ipv6: Fix RTM_GETROUTE's interpretation of RTA_IIF to be consistent with ipv4 net: lpc_eth: Fix rename of dev_hw_addr_random net/netfilter/nfnetlink_acct.c: use linux/atomic.h rose_dev: fix memcpy-bug in rose_set_mac_address Fix non TBI PHY access; a bad merge undid bug fix in a previous commit. net/garp: avoid infinite loop if attribute already exists x86 bpf_jit: fix a bug in emitting the 16-bit immediate operand of AND bonding: emit event when bonding changes MAC mac80211: fix oper channel timestamp updation ath9k: Use HW HT capabilites properly MAINTAINERS: adding maintainer for ipw2x00 net: orinoco: add error handling for failed kmalloc(). net/wireless: ipw2x00: fix a typo in wiphy struct initilization ...
2012-04-01net/netfilter/nfnetlink_acct.c: use linux/atomic.hAndrew Morton
There's no known problem here, but this is one of only two non-arch files in the kernel which use asm/atomic.h instead of linux/atomic.h. Acked-by: Pablo Neira Ayuso <pablo@netfilter.org> Cc: Patrick McHardy <kaber@trash.net> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-03-28Merge tag 'split-asm_system_h-for-linus-20120328' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-asm_system Pull "Disintegrate and delete asm/system.h" from David Howells: "Here are a bunch of patches to disintegrate asm/system.h into a set of separate bits to relieve the problem of circular inclusion dependencies. I've built all the working defconfigs from all the arches that I can and made sure that they don't break. The reason for these patches is that I recently encountered a circular dependency problem that came about when I produced some patches to optimise get_order() by rewriting it to use ilog2(). This uses bitops - and on the SH arch asm/bitops.h drags in asm-generic/get_order.h by a circuituous route involving asm/system.h. The main difficulty seems to be asm/system.h. It holds a number of low level bits with no/few dependencies that are commonly used (eg. memory barriers) and a number of bits with more dependencies that aren't used in many places (eg. switch_to()). These patches break asm/system.h up into the following core pieces: (1) asm/barrier.h Move memory barriers here. This already done for MIPS and Alpha. (2) asm/switch_to.h Move switch_to() and related stuff here. (3) asm/exec.h Move arch_align_stack() here. Other process execution related bits could perhaps go here from asm/processor.h. (4) asm/cmpxchg.h Move xchg() and cmpxchg() here as they're full word atomic ops and frequently used by atomic_xchg() and atomic_cmpxchg(). (5) asm/bug.h Move die() and related bits. (6) asm/auxvec.h Move AT_VECTOR_SIZE_ARCH here. Other arch headers are created as needed on a per-arch basis." Fixed up some conflicts from other header file cleanups and moving code around that has happened in the meantime, so David's testing is somewhat weakened by that. We'll find out anything that got broken and fix it.. * tag 'split-asm_system_h-for-linus-20120328' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-asm_system: (38 commits) Delete all instances of asm/system.h Remove all #inclusions of asm/system.h Add #includes needed to permit the removal of asm/system.h Move all declarations of free_initmem() to linux/mm.h Disintegrate asm/system.h for OpenRISC Split arch_align_stack() out from asm-generic/system.h Split the switch_to() wrapper out of asm-generic/system.h Move the asm-generic/system.h xchg() implementation to asm-generic/cmpxchg.h Create asm-generic/barrier.h Make asm-generic/cmpxchg.h #include asm-generic/cmpxchg-local.h Disintegrate asm/system.h for Xtensa Disintegrate asm/system.h for Unicore32 [based on ver #3, changed by gxt] Disintegrate asm/system.h for Tile Disintegrate asm/system.h for Sparc Disintegrate asm/system.h for SH Disintegrate asm/system.h for Score Disintegrate asm/system.h for S390 Disintegrate asm/system.h for PowerPC Disintegrate asm/system.h for PA-RISC Disintegrate asm/system.h for MN10300 ...
2012-03-28Remove all #inclusions of asm/system.hDavid Howells
Remove all #inclusions of asm/system.h preparatory to splitting and killing it. Performed with the following command: perl -p -i -e 's!^#\s*include\s*<asm/system[.]h>.*\n!!' `grep -Irl '^#\s*include\s*<asm/system[.]h>' *` Signed-off-by: David Howells <dhowells@redhat.com>
2012-03-23netfilter: nf_conntrack: permanently attach timeout policy to conntrackPablo Neira Ayuso
We need to permanently attach the timeout policy to the conntrack, otherwise we may apply the custom timeout policy inconsistently. Without this patch, the following example: nfct timeout add test inet icmp timeout 100 iptables -I PREROUTING -t raw -p icmp -s 1.1.1.1 -j CT --timeout test Will only apply the custom timeout policy to outgoing packets from 1.1.1.1, but not to reply packets from 2.2.2.2 going to 1.1.1.1. To fix this issue, this patch modifies the current logic to attach the timeout policy when the first packet is seen (which is when the conntrack entry is created). Then, we keep using the attached timeout policy until the conntrack entry is destroyed. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-03-23netfilter: xt_CT: fix assignation of the generic protocol trackerPablo Neira Ayuso
`iptables -p all' uses 0 to match all protocols, while the conntrack subsystem uses 255. We still need `-p all' to attach the custom timeout policies for the generic protocol tracker. Moreover, we may use `iptables -p sctp' while the SCTP tracker is not loaded. In that case, we have to default on the generic protocol tracker. Another possibility is `iptables -p ip' that should be supported as well. This patch makes sure we validate all possible scenarios. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-03-23netfilter: xt_CT: missing rcu_read_lock section in timeout assignmentPablo Neira Ayuso
Fix a dereference to pointer without rcu_read_lock held. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-03-23netfilter: cttimeout: fix dependency with l4protocol conntrack modulePablo Neira Ayuso
This patch introduces nf_conntrack_l4proto_find_get() and nf_conntrack_l4proto_put() to fix module dependencies between timeout objects and l4-protocol conntrack modules. Thus, we make sure that the module cannot be removed if it is used by any of the cttimeout objects. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-03-22netfilter: xt_LOG: use CONFIG_IP6_NF_IPTABLES instead of CONFIG_IPV6Pablo Neira Ayuso
This fixes the following linking error: xt_LOG.c:(.text+0x789b1): undefined reference to `ip6t_ext_hdr' ifdefs have to use CONFIG_IP6_NF_IPTABLES instead of CONFIG_IPV6. Acked-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-03-20Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds
Pull networking merge from David Miller: "1) Move ixgbe driver over to purely page based buffering on receive. From Alexander Duyck. 2) Add receive packet steering support to e1000e, from Bruce Allan. 3) Convert TCP MD5 support over to RCU, from Eric Dumazet. 4) Reduce cpu usage in handling out-of-order TCP packets on modern systems, also from Eric Dumazet. 5) Support the IP{,V6}_UNICAST_IF socket options, making the wine folks happy, from Erich Hoover. 6) Support VLAN trunking from guests in hyperv driver, from Haiyang Zhang. 7) Support byte-queue-limtis in r8169, from Igor Maravic. 8) Outline code intended for IP_RECVTOS in IP_PKTOPTIONS existed but was never properly implemented, Jiri Benc fixed that. 9) 64-bit statistics support in r8169 and 8139too, from Junchang Wang. 10) Support kernel side dump filtering by ctmark in netfilter ctnetlink, from Pablo Neira Ayuso. 11) Support byte-queue-limits in gianfar driver, from Paul Gortmaker. 12) Add new peek socket options to assist with socket migration, from Pavel Emelyanov. 13) Add sch_plug packet scheduler whose queue is controlled by userland daemons using explicit freeze and release commands. From Shriram Rajagopalan. 14) Fix FCOE checksum offload handling on transmit, from Yi Zou." * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1846 commits) Fix pppol2tp getsockname() Remove printk from rds_sendmsg ipv6: fix incorrent ipv6 ipsec packet fragment cpsw: Hook up default ndo_change_mtu. net: qmi_wwan: fix build error due to cdc-wdm dependecy netdev: driver: ethernet: Add TI CPSW driver netdev: driver: ethernet: add cpsw address lookup engine support phy: add am79c874 PHY support mlx4_core: fix race on comm channel bonding: send igmp report for its master fs_enet: Add MPC5125 FEC support and PHY interface selection net: bpf_jit: fix BPF_S_LDX_B_MSH compilation net: update the usage of CHECKSUM_UNNECESSARY fcoe: use CHECKSUM_UNNECESSARY instead of CHECKSUM_PARTIAL on tx net: do not do gso for CHECKSUM_UNNECESSARY in netif_needs_gso ixgbe: Fix issues with SR-IOV loopback when flow control is disabled net/hyperv: Fix the code handling tx busy ixgbe: fix namespace issues when FCoE/DCB is not enabled rtlwifi: Remove unused ETH_ADDR_LEN defines igbvf: Use ETH_ALEN ... Fix up fairly trivial conflicts in drivers/isdn/gigaset/interface.c and drivers/net/usb/{Kconfig,qmi_wwan.c} as per David.
2012-03-20Merge branch 'perf-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf events changes for v3.4 from Ingo Molnar: - New "hardware based branch profiling" feature both on the kernel and the tooling side, on CPUs that support it. (modern x86 Intel CPUs with the 'LBR' hardware feature currently.) This new feature is basically a sophisticated 'magnifying glass' for branch execution - something that is pretty difficult to extract from regular, function histogram centric profiles. The simplest mode is activated via 'perf record -b', and the result looks like this in perf report: $ perf record -b any_call,u -e cycles:u branchy $ perf report -b --sort=symbol 52.34% [.] main [.] f1 24.04% [.] f1 [.] f3 23.60% [.] f1 [.] f2 0.01% [k] _IO_new_file_xsputn [k] _IO_file_overflow 0.01% [k] _IO_vfprintf_internal [k] _IO_new_file_xsputn 0.01% [k] _IO_vfprintf_internal [k] strchrnul 0.01% [k] __printf [k] _IO_vfprintf_internal 0.01% [k] main [k] __printf This output shows from/to branch columns and shows the highest percentage (from,to) jump combinations - i.e. the most likely taken branches in the system. "branches" can also include function calls and any other synchronous and asynchronous transitions of the instruction pointer that are not 'next instruction' - such as system calls, traps, interrupts, etc. This feature comes with (hopefully intuitive) flat ascii and TUI support in perf report. - Various 'perf annotate' visual improvements for us assembly junkies. It will now recognize function calls in the TUI and by hitting enter you can follow the call (recursively) and back, amongst other improvements. - Multiple threads/processes recording support in perf record, perf stat, perf top - which is activated via a comma-list of PIDs: perf top -p 21483,21485 perf stat -p 21483,21485 -ddd perf record -p 21483,21485 - Support for per UID views, via the --uid paramter to perf top, perf report, etc. For example 'perf top --uid mingo' will only show the tasks that I am running, excluding other users, root, etc. - Jump label restructurings and improvements - this includes the factoring out of the (hopefully much clearer) include/linux/static_key.h generic facility: struct static_key key = STATIC_KEY_INIT_FALSE; ... if (static_key_false(&key)) do unlikely code else do likely code ... static_key_slow_inc(); ... static_key_slow_inc(); ... The static_key_false() branch will be generated into the code with as little impact to the likely code path as possible. the static_key_slow_*() APIs flip the branch via live kernel code patching. This facility can now be used more widely within the kernel to micro-optimize hot branches whose likelihood matches the static-key usage and fast/slow cost patterns. - SW function tracer improvements: perf support and filtering support. - Various hardenings of the perf.data ABI, to make older perf.data's smoother on newer tool versions, to make new features integrate more smoothly, to support cross-endian recording/analyzing workflows better, etc. - Restructuring of the kprobes code, the splitting out of 'optprobes', and a corner case bugfix. - Allow the tracing of kernel console output (printk). - Improvements/fixes to user-space RDPMC support, allowing user-space self-profiling code to extract PMU counts without performing any system calls, while playing nice with the kernel side. - 'perf bench' improvements - ... and lots of internal restructurings, cleanups and fixes that made these features possible. And, as usual this list is incomplete as there were also lots of other improvements * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (120 commits) perf report: Fix annotate double quit issue in branch view mode perf report: Remove duplicate annotate choice in branch view mode perf/x86: Prettify pmu config literals perf report: Enable TUI in branch view mode perf report: Auto-detect branch stack sampling mode perf record: Add HEADER_BRANCH_STACK tag perf record: Provide default branch stack sampling mode option perf tools: Make perf able to read files from older ABIs perf tools: Fix ABI compatibility bug in print_event_desc() perf tools: Enable reading of perf.data files from different ABI rev perf: Add ABI reference sizes perf report: Add support for taken branch sampling perf record: Add support for sampling taken branch perf tools: Add code to support PERF_SAMPLE_BRANCH_STACK x86/kprobes: Split out optprobe related code to kprobes-opt.c x86/kprobes: Fix a bug which can modify kernel code permanently x86/kprobes: Fix instruction recovery on optimized path perf: Add callback to flush branch_stack on context switch perf: Disable PERF_SAMPLE_BRANCH_* when not supported perf/x86: Add LBR software filter support for Intel CPUs ...
2012-03-18Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
2012-03-17netfilter: ctnetlink: fix race between delete and timeout expirationPablo Neira Ayuso
Kerin Millar reported hardlockups while running `conntrackd -c' in a busy firewall. That system (with several processors) was acting as backup in a primary-backup setup. After several tries, I found a race condition between the deletion operation of ctnetlink and timeout expiration. This patch fixes this problem. Tested-by: Kerin Millar <kerframil@gmail.com> Reported-by: Kerin Millar <kerframil@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-03-12Merge branch 'perf/urgent' into perf/coreIngo Molnar
Merge reason: We are going to queue up a dependent patch. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-03-09Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
2012-03-07netfilter: xt_CT: allow to attach timeout policy + glue codePablo Neira Ayuso
This patch allows you to attach the timeout policy via the CT target, it adds a new revision of the target to ensure backward compatibility. Moreover, it also contains the glue code to stick the timeout object defined via nfnetlink_cttimeout to the given flow. Example usage (it requires installing the nfct tool and libnetfilter_cttimeout): 1) create the timeout policy: nfct timeout add tcp-policy0 inet tcp \ established 1000 close 10 time_wait 10 last_ack 10 2) attach the timeout policy to the packet: iptables -I PREROUTING -t raw -p tcp -j CT --timeout tcp-policy0 You have to install the following user-space software: a) libnetfilter_cttimeout: git://git.netfilter.org/libnetfilter_cttimeout b) nfct: git://git.netfilter.org/nfct You also have to get iptables with -j CT --timeout support. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-03-07netfilter: nf_ct_ext: add timeout extensionPablo Neira Ayuso
This patch adds the timeout extension, which allows you to attach specific timeout policies to flows. This extension is only used by the template conntrack. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-03-07netfilter: add cttimeout infrastructure for fine timeout tuningPablo Neira Ayuso
This patch adds the infrastructure to add fine timeout tuning over nfnetlink. Now you can use the NFNL_SUBSYS_CTNETLINK_TIMEOUT subsystem to create/delete/dump timeout objects that contain some specific timeout policy for one flow. The follow up patches will allow you attach timeout policy object to conntrack via the CT target and the conntrack extension infrastructure. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-03-07netfilter: nf_conntrack: pass timeout array to l4->new and l4->packetPablo Neira Ayuso
This patch defines a new interface for l4 protocol trackers: unsigned int *(*get_timeouts)(struct net *net); that is used to return the array of unsigned int that contains the timeouts that will be applied for this flow. This is passed to the l4proto->new(...) and l4proto->packet(...) functions to specify the timeout policy. This interface allows per-net global timeout configuration (although only DCCP supports this by now) and it will allow custom custom timeout configuration by means of follow-up patches. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-03-07netfilter: nf_ct_gre: add unsigned int array to define timeoutsPablo Neira Ayuso
This patch adds an array to define the default GRE timeouts. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-03-07netfilter: nf_ct_tcp: move retransmission and unacknowledged timeout to arrayPablo Neira Ayuso
This patch moves the retransmission and unacknowledged timeouts to the tcp_timeouts array. This change is required by follow-up patches. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-03-07netfilter: nf_ct_udp[lite]: convert UDP[lite] timeouts to arrayPablo Neira Ayuso
Use one array to store the UDP timeouts instead of two variables. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-03-07netfilter: ctnetlink: fix lockep splatsHans Schillstrom
net/netfilter/nf_conntrack_proto.c:70 suspicious rcu_dereference_check() usage! other info that might help us debug this: rcu_scheduler_active = 1, debug_locks = 0 3 locks held by conntrack/3235: nfnl_lock+0x17/0x20 netlink_dump+0x32/0x240 ctnetlink_dump_table+0x3e/0x170 [nf_conntrack_netlink] stack backtrace: Pid: 3235, comm: conntrack Tainted: G W 3.2.0+ #511 Call Trace: [<ffffffff8108ce45>] lockdep_rcu_suspicious+0xe5/0x100 [<ffffffffa00ec6e1>] __nf_ct_l4proto_find+0x81/0xb0 [nf_conntrack] [<ffffffffa0115675>] ctnetlink_fill_info+0x215/0x5f0 [nf_conntrack_netlink] [<ffffffffa0115dc1>] ctnetlink_dump_table+0xd1/0x170 [nf_conntrack_netlink] [<ffffffff815fbdbf>] netlink_dump+0x7f/0x240 [<ffffffff81090f9d>] ? trace_hardirqs_on+0xd/0x10 [<ffffffff815fd34f>] netlink_dump_start+0xdf/0x190 [<ffffffffa0111490>] ? ctnetlink_change_nat_seq_adj+0x160/0x160 [nf_conntrack_netlink] [<ffffffffa0115cf0>] ? ctnetlink_get_conntrack+0x2a0/0x2a0 [nf_conntrack_netlink] [<ffffffffa0115ad9>] ctnetlink_get_conntrack+0x89/0x2a0 [nf_conntrack_netlink] [<ffffffff81603a47>] nfnetlink_rcv_msg+0x467/0x5f0 [<ffffffff81603a7c>] ? nfnetlink_rcv_msg+0x49c/0x5f0 [<ffffffff81603922>] ? nfnetlink_rcv_msg+0x342/0x5f0 [<ffffffff81071b21>] ? get_parent_ip+0x11/0x50 [<ffffffff816035e0>] ? nfnetlink_subsys_register+0x60/0x60 [<ffffffff815fed49>] netlink_rcv_skb+0xa9/0xd0 [<ffffffff81603475>] nfnetlink_rcv+0x15/0x20 [<ffffffff815fe70e>] netlink_unicast+0x1ae/0x1f0 [<ffffffff815fea16>] netlink_sendmsg+0x2c6/0x320 [<ffffffff815b2a87>] sock_sendmsg+0x117/0x130 [<ffffffff81125093>] ? might_fault+0x53/0xb0 [<ffffffff811250dc>] ? might_fault+0x9c/0xb0 [<ffffffff81125093>] ? might_fault+0x53/0xb0 [<ffffffff815b5991>] ? move_addr_to_kernel+0x71/0x80 [<ffffffff815b644e>] sys_sendto+0xfe/0x130 [<ffffffff815b5c94>] ? sys_bind+0xb4/0xd0 [<ffffffff817a8a0e>] ? retint_swapgs+0xe/0x13 [<ffffffff817afcd2>] system_call_fastpath+0x16/0x1b Reported-by: Hans Schillstrom <hans.schillstrom@ericsson.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>