Age | Commit message (Collapse) | Author |
|
|
|
|
|
Some stacks apparently send packets with SYN|URG set. Linux accepts
these packets, so TCP conntrack should to.
Pointed out by Martijn Posthuma <posthuma@sangine.com>.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
Paranoia: instance_put() might have freed the inst pointer when we
spin_unlock_bh().
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
Stop reference leaking in nfulnl_log_packet(). If we start a timer we
are already taking another reference.
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
Eliminate possible NULL pointer dereference in nfulnl_recv_config().
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
Fix the nasty NULL dereference on multiple packets per netlink message.
BUG: unable to handle kernel NULL pointer dereference at virtual address 00000004
printing eip:
f8a4b3bf
*pde = 00000000
Oops: 0002 [#1]
SMP
Modules linked in: nfnetlink_log ipt_ttl ipt_REDIRECT xt_tcpudp iptable_nat nf_nat nf_conntrack
_ipv4 xt_state ipt_ipp2p xt_NFLOG xt_hashlimit ip6_tables iptable_filter xt_multiport xt_mark i
pt_set iptable_raw xt_MARK iptable_mangle ip_tables cls_fw cls_u32 sch_esfq sch_htb ip_set_ipma
p ip_set ipt_ULOG x_tables dm_snapshot dm_mirror loop e1000 parport_pc parport e100 floppy ide_
cd cdrom
CPU: 0
EIP: 0060:[<f8a4b3bf>] Not tainted VLI
EFLAGS: 00010206 (2.6.20 #5)
EIP is at __nfulnl_send+0x24/0x51 [nfnetlink_log]
eax: 00000000 ebx: f2b5cbc0 ecx: c03f5f54 edx: c03f4000
esi: f2b5cbc8 edi: c03f5f54 ebp: f8a4b3ec esp: c03f5f30
ds: 007b es: 007b ss: 0068
Process swapper (pid: 0, ti=c03f4000 task=c03bece0 task.ti=c03f4000)
Stack: f2b5cbc0 f8a4b401 00000100 c0444080 c012af49 00000000 f6f19100 f6f19000
c1707800 c03f5f54 c03f5f54 00000123 00000021 c03e8d08 c0426380 00000009
c0126932 00000000 00000046 c03e9980 c03e6000 0047b007 c01269bd 00000000
Call Trace:
[<f8a4b401>] nfulnl_timer+0x15/0x25 [nfnetlink_log]
[<c012af49>] run_timer_softirq+0x10a/0x164
[<c0126932>] __do_softirq+0x60/0xba
[<c01269bd>] do_softirq+0x31/0x35
[<c0104f6e>] do_IRQ+0x62/0x74
[<c01036cb>] common_interrupt+0x23/0x28
[<c0101018>] default_idle+0x0/0x3f
[<c0101045>] default_idle+0x2d/0x3f
[<c01010fa>] cpu_idle+0xa0/0xb9
[<c03fb7f5>] start_kernel+0x1a8/0x1ac
[<c03fb293>] unknown_bootoption+0x0/0x181
=======================
Code: 5e 5f 5b 5e 5f 5d c3 53 89 c3 8d 40 1c 83 7b 1c 00 74 05 e8 2c ee 6d c7 83 7b 14 00 75 04
31 c0 eb 34 83 7b 10 01 76 09 8b 43 18 <66> c7 40 04 03 00 8b 53 34 8b 43 14 b9 40 00 00 00 e8
08 9a 84
EIP: [<f8a4b3bf>] __nfulnl_send+0x24/0x51 [nfnetlink_log] SS:ESP 0068:c03f5f30
<0>Kernel panic - not syncing: Fatal exception in interrupt
<0>Rebooting in 5 seconds..
Panic no more!
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
physoutdev is only set on purely bridged packet, when nfnetlink_log is used
in the OUTPUT/FORWARD/POSTROUTING hooks on packets forwarded from or to a
bridge it crashes when trying to dereference skb->nf_bridge->physoutdev.
Reported by Holger Eitzenberger <heitzenberger@astaro.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
ESTABLISHED
The individual fragments of a packet reassembled by conntrack have the
conntrack reference from the reassembled packet attached, but nfctinfo
is not copied. This leaves it initialized to 0, which unfortunately is
the value of IP_CT_ESTABLISHED.
The result is that all IPv6 fragments are tracked as ESTABLISHED,
allowing them to bypass a usual ruleset which accepts ESTABLISHED
packets early.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
related to packet queueing.
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
When the packet counter of a connection is zero a division by zero
occurs in div64_64(). Fix that by using zero as average value, which
is correct as long as the packet counter didn't overflow, at which
point we have lost anyway.
Based on patch from Jonas Berlin <xkr47@outerspace.dyndns.org>,
with suggestions from KOVACS Krisztian <hidden@balabit.hu>.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
IP_CT_TCP_FLAG_CLOSE_INIT is a flag and should have a value of 0x4 instead
of 0x3, which is IP_CT_TCP_FLAG_WINDOW_SCALE | IP_CT_TCP_FLAG_SACK_PERM.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
When IPv6 connection tracking splits up a defragmented packet into
its original fragments, the packets are taken from a list and are
passed to the network stack with skb->next still set. This causes
dev_hard_start_xmit to treat them as GSO fragments, resulting in
a use after free when connection tracking handles the next fragment.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
With the introduction of x_tables we accidentally broke compatibility
by defining IPT_TABLE_MAXNAMELEN to XT_FUNCTION_MAXNAMELEN instead of
XT_TABLE_MAXNAMELEN, which is two bytes larger.
On most architectures it doesn't really matter since we don't have
any tables with names that long in the kernel and the structure
layout didn't change because of alignment requirements of following
members. On CRIS however (and other architectures that don't align
data) this changed the structure layout and thus broke compatibility
with old iptables binaries.
Changing it back will break compatibility with binaries compiled
against recent kernels again, but since the breakage has only been
there for three releases this seems like the better choice.
Spotted by Jonas Berlin <xkr47@outerspace.dyndns.org>.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
The included patch translates arpt_counters to xt_counters, making
userspace arptables compile against recent kernels.
Signed-off-by: Bart De Schuymer <bdschuym@pandora.be>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
Check that status flags are available in the netlink message received
to create a new conntrack.
Fixes a crash in ctnetlink_create_conntrack when the CTA_STATUS attribute
is not present.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
xt_physdev depends on bridge netfilter, which is a boolean, but can still
be built modular because of special handling in the bridge makefile. Add
a dependency on BRIDGE to prevent XT_MATCH_PHYSDEV=y, BRIDGE=m.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
Invoking load_module() before param_sysfs_init() is called crashes in
mod_sysfs_setup(), since the kset in module_subsys is not initialized yet.
In my case, net-pf-1 is getting modprobed as a result of hotplug trying to
create a UNIX socket. Calls to hotplug begin after the topology_init
initcall.
Another patch for the same symptom (module_subsys-initialize-earlier.patch)
moves param_sysfs_init() to the subsys initcalls, but this is still not
early enough in the boot process in some cases. In particular,
topology_init() causes /sbin/hotplug to run, which requests net-pf-1 (the
UNIX socket protocol) which can be compiled as a module. Moving
param_sysfs_init() to the postcore initcalls fixes this particular race,
but there might well be other cases where a usermodehelper causes a module
to load earlier still.
The patch makes load_module() return an error rather than crashing the
kernel if invoked before module_subsys is initialized.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
With CONFIG_PHYSICAL_START set to a non default values the i386
boot_ioremap code calculated its pte index wrong and users of boot_ioremap
have their areas incorrectly mapped (for me SRAT table not mapped during
early boot). This patch removes the addr < BOOT_PTE_PTRS constraint.
Signed-off-by: Keith Mannthey <kmannth@us.ibm.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
These pte loops all assume the passed in address is HPAGE
aligned, make sure that is actually true.
[ This also includes other hugepage bug fixes for sparc64
that occurred between 2.6.16 to 2.6.20 ]
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
ANK says: "It is rarely used, that's wy it was not noticed.
But in the places, where it is used, it should be disaster."
Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
The ipv6_fl_socklist from listening socket is inadvertently shared
with new socket created for connection. This leads to a variety of
interesting, but fatal, bugs. For example, removing one of the
sockets may lead to the other socket's encountering a page fault
when the now freed list is referenced.
The fix is to not share the flow label list with the new socket.
Signed-off-by: Masayuki Nakagawa <nakagawa.msy@ncos.nec.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
Hello, Just discussed this Patrick...
We have two users of trie_leaf_remove, fn_trie_flush and fn_trie_delete
both are holding RTNL. So there shouldn't be need for this preempt stuff.
This is assumed to a leftover from an older RCU-take.
> Mhh .. I think I just remembered something - me incorrectly suggesting
> to add it there while we were talking about this at OLS :) IIRC the
> idea was to make sure tnode_free (which at that time didn't use
> call_rcu) wouldn't free memory while still in use in a rcu read-side
> critical section. It should have been synchronize_rcu of course,
> but with tnode_free using call_rcu it seems to be completely
> unnecessary. So I guess we can simply remove it.
Signed-off-by: Robert Olsson <robert.olsson@its.uu.se>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
I noticed that in xfrm_state_add we look for the larval SA in a few
places without checking for protocol match. So when using both
AH and ESP, whichever one gets added first, deletes the larval SA.
It seems AH always gets added first and ESP is always the larval
SA's protocol since the xfrm->tmpl has it first. Thus causing the
additional km_query()
Adding the check eliminates accidental double SA creation.
Signed-off-by: Joy Latten <latten@austin.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
|
|
|
|
Fix a compile error on powerpc.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
|
|
User supplied len < 0 can cause leak of kernel memory.
Use unsigned compare instead.
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
I came across this bug in http://bugzilla.kernel.org/show_bug.cgi?id=8155
Signed-off-by: Olaf Kirch <olaf.kirch@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
Based on a patch from Don Howard <dhoward@redhat.com>
When calling write() with a buffer larger than 512 bytes, the
driver's write buffer overflows, allowing to overwrite the EIP and
execute arbitrary code with kernel privileges.
In read(), there exists a similar problem, but coming from the device.
A malicous or buggy device sending more than 512 bytes can overflow
of the driver's read buffer, with the same effects as above.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
From: Michael S. Tsirkin <mst@mellanox.co.il>
mthca_table_find() will return the wrong address when the table entry
being searched for is exactly at the beginning of a sglist entry
(other than the first), because it uses >= when it should use >.
Example: assume we have 2 entries in scatterlist, 4K each, offset is 4K.
The current code will return first entry + 4K when we really want
the second entry.
In particular this means mapping an FMR on a memfree HCA may end up
writing the page table into the wrong place, leading to memory
corruption and also causing the HCA to use an incorrect address
translation table.
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
When ipoib_ib_dev_flush() is called because of a port event, the
driver needs to rejoin all multicast groups, since the flush will call
ipoib_mcast_dev_flush() (via ipoib_ib_dev_down()). Otherwise no
(non-broadcast) multicast groups will be rejoined until the networking
core calls ->set_multicast_list again, and so multicast reception will
be broken for potentially a long time.
Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
We discovered a problem when running IPoIB applications on multiple
CPUs on an Altix system. Many messages such as:
ib_mthca 0002:01:00.0: SQ 000014 full (19941644 head, 19941707 tail, 64 max, 0 nreq)
appear in syslog, and the driver wedges up.
Apparently this is because writes to the doorbells from different CPUs
reach the device out of order. The following patch adds mmiowb() calls
after doorbell rings to ensure the doorbell writes are ordered.
Signed-off-by: Arthur Kepner <akepner@sgi.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
The second argument to free_npages() was being incorrectly
calculated, which would thus access far past the end of the
arena->map[] bitmap.
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
Repeated -j20 kernel builds on a G5 Quad running an SMP PREEMPT kernel
would often collapse within a day, some exec failing with "Bad address".
In each case examined, load_elf_binary was doing a kernel_read, but
generic_file_aio_read's access_ok saw current->thread.fs.seg as USER_DS
instead of KERNEL_DS.
objdump of filemap.o shows gcc 4.1.0 emitting "mr r5,r13 ... ld r9,416(r5)"
here for get_paca()->__current, instead of the expected and much more usual
"ld r9,416(r13)"; I've seen other gcc4s do the same, but perhaps not gcc3s.
So, if the task is preempted and rescheduled on a different cpu in between
the mr and the ld, r5 will be looking at a different paca_struct from the
one it's now on, pick up the wrong __current, and perhaps the wrong seg.
Presumably much worse could happen elsewhere, though that split is rare.
Other architectures appear to be safe (x86_64's read_pda is more limiting
than get_paca), but ppc64 needs to force "current" into one instruction.
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
CRC-32 checking during ULE decapsulation always failed on x86_64 systems due
to the size of a variable used to store CRC. This bug was discovered on
Fedora Core 6 with kernel-2.6.18-1.2849. The i386 counterpart has no such
problem. This patch has been tested on 64-bit system as well as 32-bit system.
Signed-off-by: Ang Way Chuang <wcang@nrg.cs.usm.my>
Signed-off-by: Michael Krufky <mkrufky@linuxtv.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
On 2/28/07, KOVACS Krisztian <hidden@balabit.hu> wrote:
>
> Hi,
>
> While reading TCP minisock code I've found this suspiciously looking
> code fragment:
>
> - 8< -
> struct sock *tcp_create_openreq_child(struct sock *sk, struct request_sock *req, struct sk_buff *skb)
> {
> struct sock *newsk = inet_csk_clone(sk, req, GFP_ATOMIC);
>
> if (newsk != NULL) {
> const struct inet_request_sock *ireq = inet_rsk(req);
> struct tcp_request_sock *treq = tcp_rsk(req);
> struct inet_connection_sock *newicsk = inet_csk(sk);
> struct tcp_sock *newtp;
> - 8< -
>
> The above code initializes newicsk to inet_csk(sk), isn't that supposed
> to be inet_csk(newsk)? As far as I can tell this might leave
> icsk_ack.last_seg_size zero even if we do have received data.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
Without this patch, the device will not be detected after firmware download
on big endian systems.
Signed-off-by: Jin-Bong lee <jinbong.lee@samsung.com>
Signed-off-by: Michael Krufky <mkrufky@linuxtv.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
Reading /proc/net/anycast6 when there is no anycast address
on an interface results in an ever-increasing inet6_dev reference
count, as well as a reference to the netdevice you can't get rid of.
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
This patch fixes a bug in Linux IPv6 stack which caused anycast address
to be added to a device prior DAD has been completed. This led to
incorrect reference count which resulted in infinite wait for
unregister_netdevice completion on interface removal.
Signed-off-by: Michal Wrobel <xmxwx@asn.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
Based almost entirely upon a patch by Joerg Friedrich
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
The header may have moved when trimming.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
CT based mach64 cards were reported to hang on sparc64 boxes when
compiled with gcc-4.1.x and later.
Looking at this piece of code, it's no surprise. A critical
delay was implemented as an empty for() loop, and gcc 4.0.x
and previous did not optimize it away, so we did get a delay.
But gcc-4.1.x and later can optimize it away, and we get crashes.
Use a real udelay() to fix this. Fix verified on SunBlade100.
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
1. EL3WINDOW is always 1 when lock is not held.
2. The second argument of el3_interrupt is 'void *dev_id',
not 'struct el3_private *lp'.
Adrian Bunk:
backported to 2.6.16
Signed-off-by: Komuro <komurojun-mbn@nifty.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
Adds missing call to phys_to_virt() in the
lib/swiotlb.c:swiotlb_sync_sg() function. Without this change, a kernel
panic will always occur whenever a SWIOTLB bounce buffer from a
scatter-gather list gets synced.
Signed-off-by: David Moore <dcm@acm.org>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
It looks like there is a bug in init_reap_node() in slab.c that can cause
multiple oops's on certain ES7000 configurations. The variable reap_node
is defined per cpu, but only initialized on a single CPU. This causes an
oops in next_reap_node() when __get_cpu_var(reap_node) returns the wrong
value. Fix is below.
Signed-off-by: Dan Yeisley <dan.yeisley@unisys.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
psmouse_show_int_attr() and psmouse_set_int_attr() were accessing
unsigned int fields as unsigned long, which gave garbage on x86_64.
Signed-off-by: Sergey Vlasov <vsu@altlinux.ru>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|