Age | Commit message (Collapse) | Author |
|
commit f39c1bfb5a03e2d255451bff05be0d7255298fa4 upstream.
Commit 43cedbf0e8dfb9c5610eb7985d5f21263e313802 (SUNRPC: Ensure that
we grab the XPRT_LOCK before calling xprt_alloc_slot) is causing
hangs in the case of NFS over UDP mounts.
Since neither the UDP or the RDMA transport mechanism use dynamic slot
allocation, we can skip grabbing the socket lock for those transports.
Add a new rpc_xprt_op to allow switching between the TCP and UDP/RDMA
case.
Note that the NFSv4.1 back channel assigns the slot directly
through rpc_run_bc_task, so we can ignore that case.
Reported-by: Dick Streefland <dick.streefland@altium.nl>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit acbb219d5f53821b2d0080d047800410c0420ea1 ]
When tearing down a net namespace, ipv4 mr_table structures are freed
without first deactivating their timers. This can result in a crash in
run_timer_softirq.
This patch mimics the corresponding behaviour in ipv6.
Locking and synchronization seem to be adequate.
We are about to kfree mrt, so existing code should already make sure that
no other references to mrt are pending or can be created by incoming traffic.
The functions invoked here do not cause new references to mrt or other
race conditions to be created.
Invoking del_timer_sync guarantees that ipmr_expire_timer is inactive.
Both ipmr_expire_process (whose completion we may have to wait in
del_timer_sync) and mroute_clean_tables internally use mfc_unres_lock
or other synchronizations when needed, and they both only modify mrt.
Tested in Linux 3.4.8.
Signed-off-by: Francesco Ruggeri <fruggeri@aristanetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit 99469c32f79a32d8481f87be0d3c66dad286f4ec ]
Avoid to use synchronize_rcu in l2tp_tunnel_free because context may be
atomic.
Signed-off-by: Dmitry Kozlov <xeb@mail.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit 20e1db19db5d6b9e4e83021595eab0dc8f107bef ]
Non-root user-space processes can send Netlink messages to other
processes that are well-known for being subscribed to Netlink
asynchronous notifications. This allows ilegitimate non-root
process to send forged messages to Netlink subscribers.
The userspace process usually verifies the legitimate origin in
two ways:
a) Socket credentials. If UID != 0, then the message comes from
some ilegitimate process and the message needs to be dropped.
b) Netlink portID. In general, portID == 0 means that the origin
of the messages comes from the kernel. Thus, discarding any
message not coming from the kernel.
However, ctnetlink sets the portID in event messages that has
been triggered by some user-space process, eg. conntrack utility.
So other processes subscribed to ctnetlink events, eg. conntrackd,
know that the event was triggered by some user-space action.
Neither of the two ways to discard ilegitimate messages coming
from non-root processes can help for ctnetlink.
This patch adds capability validation in case that dst_pid is set
in netlink_sendmsg(). This approach is aggressive since existing
applications using any Netlink bus to deliver messages between
two user-space processes will break. Note that the exception is
NETLINK_USERSOCK, since it is reserved for netlink-to-netlink
userspace communication.
Still, if anyone wants that his Netlink bus allows netlink-to-netlink
userspace, then they can set NL_NONROOT_SEND. However, by default,
I don't think it makes sense to allow to use NETLINK_ROUTE to
communicate two processes that are sending no matter what information
that is not related to link/neighbouring/routing. They should be using
NETLINK_USERSOCK instead for that.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit e0e3cea46d31d23dc40df0a49a7a2c04fe8edfea ]
Pablo Neira Ayuso discovered that avahi and
potentially NetworkManager accept spoofed Netlink messages because of a
kernel bug. The kernel passes all-zero SCM_CREDENTIALS ancillary data
to the receiver if the sender did not provide such data, instead of not
including any such data at all or including the correct data from the
peer (as it is the case with AF_UNIX).
This bug was introduced in commit 16e572626961
(af_unix: dont send SCM_CREDENTIALS by default)
This patch forces passing credentials for netlink, as
before the regression.
Another fix would be to not add SCM_CREDENTIALS in
netlink messages if not provided by the sender, but it
might break some programs.
With help from Florian Weimer & Petr Matousek
This issue is designated as CVE-2012-3520
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Petr Matousek <pmatouse@redhat.com>
Cc: Florian Weimer <fweimer@redhat.com>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit c0de08d04215031d68fa13af36f347a6cfa252ca ]
If a packet is emitted on one socket in one group of fanout sockets,
it is transmitted again. It is thus read again on one of the sockets
of the fanout group. This result in a loop for software which
generate packets when receiving one.
This retransmission is not the intended behavior: a fanout group
must behave like a single socket. The packet should not be
transmitted on a socket if it originates from a socket belonging
to the same fanout group.
This patch fixes the issue by changing the transmission check to
take fanout group info account.
Reported-by: Aleksandr Kotov <a1k@mail.ru>
Signed-off-by: Eric Leblond <eric@regit.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit 43da5f2e0d0c69ded3d51907d9552310a6b545e8 ]
The implementation of dev_ifconf() for the compat ioctl interface uses
an intermediate ifc structure allocated in userland for the duration of
the syscall. Though, it fails to initialize the padding bytes inserted
for alignment and that for leaks four bytes of kernel stack. Add an
explicit memset(0) before filling the structure to avoid the info leak.
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit 2d8a041b7bfe1097af21441cb77d6af95f4f4680 ]
If at least one of CONFIG_IP_VS_PROTO_TCP or CONFIG_IP_VS_PROTO_UDP is
not set, __ip_vs_get_timeouts() does not fully initialize the structure
that gets copied to userland and that for leaks up to 12 bytes of kernel
stack. Add an explicit memset(0) before passing the structure to
__ip_vs_get_timeouts() to avoid the info leak.
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Cc: Wensong Zhang <wensong@linux-vs.org>
Cc: Simon Horman <horms@verge.net.au>
Cc: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit 7b07f8eb75aa3097cdfd4f6eac3da49db787381d ]
The CCID3 code fails to initialize the trailing padding bytes of struct
tfrc_tx_info added for alignment on 64 bit architectures. It that for
potentially leaks four bytes kernel stack via the getsockopt() syscall.
Add an explicit memset(0) before filling the structure to avoid the
info leak.
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Cc: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit 3592aaeb80290bda0f2cf0b5456c97bfc638b192 ]
The LLC code wrongly returns 0, i.e. "success", when the socket is
zapped. Together with the uninitialized uaddrlen pointer argument from
sys_getsockname this leads to an arbitrary memory leak of up to 128
bytes kernel stack via the getsockname() syscall.
Return an error instead when the socket is zapped to prevent the info
leak. Also remove the unnecessary memset(0). We don't directly write to
the memory pointed by uaddr but memcpy() a local structure at the end of
the function that is properly initialized.
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit 792039c73cf176c8e39a6e8beef2c94ff46522ed ]
The L2CAP code fails to initialize the l2_bdaddr_type member of struct
sockaddr_l2 and the padding byte added for alignment. It that for leaks
two bytes kernel stack via the getsockname() syscall. Add an explicit
memset(0) before filling the structure to avoid the info leak.
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Cc: Marcel Holtmann <marcel@holtmann.org>
Cc: Gustavo Padovan <gustavo@padovan.org>
Cc: Johan Hedberg <johan.hedberg@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit 9344a972961d1a6d2c04d9008b13617bcb6ec2ef ]
The RFCOMM code fails to initialize the trailing padding byte of struct
sockaddr_rc added for alignment. It that for leaks one byte kernel stack
via the getsockname() syscall. Add an explicit memset(0) before filling
the structure to avoid the info leak.
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Cc: Marcel Holtmann <marcel@holtmann.org>
Cc: Gustavo Padovan <gustavo@padovan.org>
Cc: Johan Hedberg <johan.hedberg@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit f9432c5ec8b1e9a09b9b0e5569e3c73db8de432a ]
The RFCOMM code fails to initialize the two padding bytes of struct
rfcomm_dev_list_req inserted for alignment before copying it to
userland. Additionally there are two padding bytes in each instance of
struct rfcomm_dev_info. The ioctl() that for disclosures two bytes plus
dev_num times two bytes uninitialized kernel heap memory.
Allocate the memory using kzalloc() to fix this issue.
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Cc: Marcel Holtmann <marcel@holtmann.org>
Cc: Gustavo Padovan <gustavo@padovan.org>
Cc: Johan Hedberg <johan.hedberg@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit 9ad2de43f1aee7e7274a4e0d41465489299e344b ]
The RFCOMM code fails to initialize the key_size member of struct
bt_security before copying it to userland -- that for leaking one
byte kernel stack. Initialize key_size with 0 to avoid the info
leak.
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Cc: Marcel Holtmann <marcel@holtmann.org>
Cc: Gustavo Padovan <gustavo@padovan.org>
Cc: Johan Hedberg <johan.hedberg@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit 3f68ba07b1da811bf383b4b701b129bfcb2e4988 ]
The HCI code fails to initialize the hci_channel member of struct
sockaddr_hci and that for leaks two bytes kernel stack via the
getsockname() syscall. Initialize hci_channel with 0 to avoid the
info leak.
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Cc: Marcel Holtmann <marcel@holtmann.org>
Cc: Gustavo Padovan <gustavo@padovan.org>
Cc: Johan Hedberg <johan.hedberg@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit e15ca9a0ef9a86f0477530b0f44a725d67f889ee ]
The HCI code fails to initialize the two padding bytes of struct
hci_ufilter before copying it to userland -- that for leaking two
bytes kernel stack. Add an explicit memset(0) before filling the
structure to avoid the info leak.
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Cc: Marcel Holtmann <marcel@holtmann.org>
Cc: Gustavo Padovan <gustavo@padovan.org>
Cc: Johan Hedberg <johan.hedberg@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit 3c0c5cfdcd4d69ffc4b9c0907cec99039f30a50a ]
The ATM code fails to initialize the two padding bytes of struct
sockaddr_atmpvc inserted for alignment. Add an explicit memset(0)
before filling the structure to avoid the info leak.
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit e862f1a9b7df4e8196ebec45ac62295138aa3fc2 ]
The ATM code fails to initialize the two padding bytes of struct
sockaddr_atmpvc inserted for alignment. Add an explicit memset(0)
before filling the structure to avoid the info leak.
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit 4acd4945cd1e1f92b20d14e349c6c6a52acbd42d ]
Cong Wang reports that lockdep detected suspicious RCU usage while
enabling IPV6 forwarding:
[ 1123.310275] ===============================
[ 1123.442202] [ INFO: suspicious RCU usage. ]
[ 1123.558207] 3.6.0-rc1+ #109 Not tainted
[ 1123.665204] -------------------------------
[ 1123.768254] include/linux/rcupdate.h:430 Illegal context switch in RCU read-side critical section!
[ 1123.992320]
[ 1123.992320] other info that might help us debug this:
[ 1123.992320]
[ 1124.307382]
[ 1124.307382] rcu_scheduler_active = 1, debug_locks = 0
[ 1124.522220] 2 locks held by sysctl/5710:
[ 1124.648364] #0: (rtnl_mutex){+.+.+.}, at: [<ffffffff81768498>] rtnl_trylock+0x15/0x17
[ 1124.882211] #1: (rcu_read_lock){.+.+.+}, at: [<ffffffff81871df8>] rcu_lock_acquire+0x0/0x29
[ 1125.085209]
[ 1125.085209] stack backtrace:
[ 1125.332213] Pid: 5710, comm: sysctl Not tainted 3.6.0-rc1+ #109
[ 1125.441291] Call Trace:
[ 1125.545281] [<ffffffff8109d915>] lockdep_rcu_suspicious+0x109/0x112
[ 1125.667212] [<ffffffff8107c240>] rcu_preempt_sleep_check+0x45/0x47
[ 1125.781838] [<ffffffff8107c260>] __might_sleep+0x1e/0x19b
[...]
[ 1127.445223] [<ffffffff81757ac5>] call_netdevice_notifiers+0x4a/0x4f
[...]
[ 1127.772188] [<ffffffff8175e125>] dev_disable_lro+0x32/0x6b
[ 1127.885174] [<ffffffff81872d26>] dev_forward_change+0x30/0xcb
[ 1128.013214] [<ffffffff818738c4>] addrconf_forward_change+0x85/0xc5
[...]
addrconf_forward_change() uses RCU iteration over the netdev list,
which is unnecessary since it already holds the RTNL lock. We also
cannot reasonably require netdevice notifier functions not to sleep.
Reported-by: Cong Wang <amwang@redhat.com>
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit 7f5c3e3a80e6654cf48dfba7cf94f88c6b505467 ]
Here's a quote of the comment about the BUG macro from asm-generic/bug.h:
Don't use BUG() or BUG_ON() unless there's really no way out; one
example might be detecting data structure corruption in the middle
of an operation that can't be backed out of. If the (sub)system
can somehow continue operating, perhaps with reduced functionality,
it's probably not BUG-worthy.
If you're tempted to BUG(), think again: is completely giving up
really the *only* solution? There are usually better options, where
users don't need to reboot ASAP and can mostly shut down cleanly.
In our case, the status flag of a ring buffer slot is managed from both sides,
the kernel space and the user space. This means that even though the kernel
side might work as expected, the user space screws up and changes this flag
right between the send(2) is triggered when the flag is changed to
TP_STATUS_SENDING and a given skb is destructed after some time. Then, this
will hit the BUG macro. As David suggested, the best solution is to simply
remove this statement since it cannot be used for kernel side internal
consistency checks. I've tested it and the system still behaves /stable/ in
this case, so in accordance with the above comment, we should rather remove it.
Signed-off-by: Daniel Borkmann <daniel.borkmann@tik.ee.ethz.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit 7364e445f62825758fa61195d237a5b8ecdd06ec ]
Do not leak memory by updating pointer with potentially NULL realloc return value.
Found by Linux Driver Verification project (linuxtesting.org).
Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit 696ecdc10622d86541f2e35cc16e15b6b3b1b67e ]
gact_rand array is accessed by gact->tcfg_ptype whose value
is assumed to less than MAX_RAND, but any range checks are
not performed.
So add a check in tcf_gact_init(). And in tcf_gact(), we can
reduce a branch.
Signed-off-by: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit 1485348d2424e1131ea42efc033cbd9366462b01 ]
Cache the device gso_max_segs in sock::sk_gso_max_segs and use it to
limit the size of TSO skbs. This avoids the need to fall back to
software GSO for local TCP senders.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit 30b678d844af3305cda5953467005cebb5d7b687 ]
A peer (or local user) may cause TCP to use a nominal MSS of as little
as 88 (actual MSS of 76 with timestamps). Given that we have a
sufficiently prodigious local sender and the peer ACKs quickly enough,
it is nevertheless possible to grow the window for such a connection
to the point that we will try to send just under 64K at once. This
results in a single skb that expands to 861 segments.
In some drivers with TSO support, such an skb will require hundreds of
DMA descriptors; a substantial fraction of a TX ring or even more than
a full ring. The TX queue selected for the skb may stall and trigger
the TX watchdog repeatedly (since the problem skb will be retried
after the TX reset). This particularly affects sfc, for which the
issue is designated as CVE-2012-3412.
Therefore:
1. Add the field net_device::gso_max_segs holding the device-specific
limit.
2. In netif_skb_features(), if the number of segments is too high then
mask out GSO features to force fall back to software GSO.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 8f321f853ea33330c7141977cd34804476e2e07e upstream.
If remote device sends bogus RFC option with invalid length,
undefined options values are used. Fix this by using defaults when
remote misbehaves.
This also fixes the following warning reported by gcc 4.7.0:
net/bluetooth/l2cap_core.c: In function 'l2cap_config_rsp':
net/bluetooth/l2cap_core.c:3302:13: warning: 'rfc.max_pdu_size' may be used uninitialized in this function [-Wmaybe-uninitialized]
net/bluetooth/l2cap_core.c:3266:24: note: 'rfc.max_pdu_size' was declared here
net/bluetooth/l2cap_core.c:3298:25: warning: 'rfc.monitor_timeout' may be used uninitialized in this function [-Wmaybe-uninitialized]
net/bluetooth/l2cap_core.c:3266:24: note: 'rfc.monitor_timeout' was declared here
net/bluetooth/l2cap_core.c:3297:25: warning: 'rfc.retrans_timeout' may be used uninitialized in this function [-Wmaybe-uninitialized]
net/bluetooth/l2cap_core.c:3266:24: note: 'rfc.retrans_timeout' was declared here
net/bluetooth/l2cap_core.c:3295:2: warning: 'rfc.mode' may be used uninitialized in this function [-Wmaybe-uninitialized]
net/bluetooth/l2cap_core.c:3266:24: note: 'rfc.mode' was declared here
Signed-off-by: Szymon Janc <szymon.janc@tieto.com>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit d10f27a750312ed5638c876e4bd6aa83664cccd8 upstream.
The rpc server tries to ensure that there will be room to send a reply
before it receives a request.
It does this by tracking, in xpt_reserved, an upper bound on the total
size of the replies that is has already committed to for the socket.
Currently it is adding in the estimate for a new reply *before* it
checks whether there is space available. If it finds that there is not
space, it then subtracts the estimate back out.
This may lead the subsequent svc_xprt_enqueue to decide that there is
space after all.
The results is a svc_recv() that will repeatedly return -EAGAIN, causing
server threads to loop without doing any actual work.
Reported-by: Michael Tokarev <mjt@tls.msk.ru>
Tested-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit f06f00a24d76e168ecb38d352126fd203937b601 upstream.
svc_tcp_sendto sets XPT_CLOSE if we fail to transmit the entire reply.
However, the XPT_CLOSE won't be acted on immediately. Meanwhile other
threads could send further replies before the socket is really shut
down. This can manifest as data corruption: for example, if a truncated
read reply is followed by another rpc reply, that second reply will look
to the client like further read data.
Symptoms were data corruption preceded by svc_tcp_sendto logging
something like
kernel: rpc-srv/tcp: nfsd: sent only 963696 when sending 1048708 bytes - shutting down socket
Reported-by: Malahal Naineni <malahal@us.ibm.com>
Tested-by: Malahal Naineni <malahal@us.ibm.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit be1e44441a560c43c136a562d49a1c9623c91197 upstream.
Examination of svc_tcp_clear_pages shows that it assumes sk_tcplen is
consistent with sk_pages[] (in particular, sk_pages[n] can't be NULL if
sk_tcplen would lead us to expect n pages of data).
svc_tcp_restore_pages zeroes out sk_pages[] while leaving sk_tcplen.
This is OK, since both functions are serialized by XPT_BUSY. However,
that means the inconsistency must be repaired before dropping XPT_BUSY.
Therefore we should be ensuring that svc_tcp_save_pages repairs the
problem before exiting svc_tcp_recv_record on error.
Symptoms were a BUG() in svc_tcp_clear_pages.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 276bdb82dedb290511467a5a4fdbe9f0b52dce6f upstream.
ccid_hc_rx_getsockopt() and ccid_hc_tx_getsockopt() might be called with
a NULL ccid pointer leading to a NULL pointer dereference. This could
lead to a privilege escalation if the attacker is able to map page 0 and
prepare it with a fake ccid_ops pointer.
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Cc: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit a9ea3ed9b71cc3271dd59e76f65748adcaa76422 upstream.
Some devices e.g. some Android based phones don't do SDP search before
pairing and cancel legacy pairing when ACL is disconnected.
PIN Code Request event which changes ACL timeout to HCI_PAIRING_TIMEOUT
is only received after remote user entered PIN.
In that case no L2CAP is connected so default HCI_DISCONN_TIMEOUT
(2 seconds) is being used to timeout ACL connection. This results in
problems with legacy pairing as remote user has only few seconds to
enter PIN before ACL is disconnected.
Increase disconnect timeout for incomming connection to
HCI_PAIRING_TIMEOUT if SSP is disabled and no linkey exists.
To avoid keeping ACL alive for too long after SDP search set ACL
timeout back to HCI_DISCONN_TIMEOUT when L2CAP is connected.
2012-07-19 13:24:43.413521 < HCI Command: Create Connection (0x01|0x0005) plen 13
bdaddr 00:02:72:D6:6A:3F ptype 0xcc18 rswitch 0x01 clkoffset 0x0000
Packet type: DM1 DM3 DM5 DH1 DH3 DH5
2012-07-19 13:24:43.425224 > HCI Event: Command Status (0x0f) plen 4
Create Connection (0x01|0x0005) status 0x00 ncmd 1
2012-07-19 13:24:43.885222 > HCI Event: Role Change (0x12) plen 8
status 0x00 bdaddr 00:02:72:D6:6A:3F role 0x01
Role: Slave
2012-07-19 13:24:44.054221 > HCI Event: Connect Complete (0x03) plen 11
status 0x00 handle 42 bdaddr 00:02:72:D6:6A:3F type ACL encrypt 0x00
2012-07-19 13:24:44.054313 < HCI Command: Read Remote Supported Features (0x01|0x001b) plen 2
handle 42
2012-07-19 13:24:44.055176 > HCI Event: Page Scan Repetition Mode Change (0x20) plen 7
bdaddr 00:02:72:D6:6A:3F mode 0
2012-07-19 13:24:44.056217 > HCI Event: Max Slots Change (0x1b) plen 3
handle 42 slots 5
2012-07-19 13:24:44.059218 > HCI Event: Command Status (0x0f) plen 4
Read Remote Supported Features (0x01|0x001b) status 0x00 ncmd 0
2012-07-19 13:24:44.062192 > HCI Event: Command Status (0x0f) plen 4
Unknown (0x00|0x0000) status 0x00 ncmd 1
2012-07-19 13:24:44.067219 > HCI Event: Read Remote Supported Features (0x0b) plen 11
status 0x00 handle 42
Features: 0xbf 0xfe 0xcf 0xfe 0xdb 0xff 0x7b 0x87
2012-07-19 13:24:44.067248 < HCI Command: Read Remote Extended Features (0x01|0x001c) plen 3
handle 42 page 1
2012-07-19 13:24:44.071217 > HCI Event: Command Status (0x0f) plen 4
Read Remote Extended Features (0x01|0x001c) status 0x00 ncmd 1
2012-07-19 13:24:44.076218 > HCI Event: Read Remote Extended Features (0x23) plen 13
status 0x00 handle 42 page 1 max 1
Features: 0x01 0x00 0x00 0x00 0x00 0x00 0x00 0x00
2012-07-19 13:24:44.076249 < HCI Command: Remote Name Request (0x01|0x0019) plen 10
bdaddr 00:02:72:D6:6A:3F mode 2 clkoffset 0x0000
2012-07-19 13:24:44.081218 > HCI Event: Command Status (0x0f) plen 4
Remote Name Request (0x01|0x0019) status 0x00 ncmd 1
2012-07-19 13:24:44.105214 > HCI Event: Remote Name Req Complete (0x07) plen 255
status 0x00 bdaddr 00:02:72:D6:6A:3F name 'uw000951-0'
2012-07-19 13:24:44.105284 < HCI Command: Authentication Requested (0x01|0x0011) plen 2
handle 42
2012-07-19 13:24:44.111207 > HCI Event: Command Status (0x0f) plen 4
Authentication Requested (0x01|0x0011) status 0x00 ncmd 1
2012-07-19 13:24:44.112220 > HCI Event: Link Key Request (0x17) plen 6
bdaddr 00:02:72:D6:6A:3F
2012-07-19 13:24:44.112249 < HCI Command: Link Key Request Negative Reply (0x01|0x000c) plen 6
bdaddr 00:02:72:D6:6A:3F
2012-07-19 13:24:44.115215 > HCI Event: Command Complete (0x0e) plen 10
Link Key Request Negative Reply (0x01|0x000c) ncmd 1
status 0x00 bdaddr 00:02:72:D6:6A:3F
2012-07-19 13:24:44.116215 > HCI Event: PIN Code Request (0x16) plen 6
bdaddr 00:02:72:D6:6A:3F
2012-07-19 13:24:48.099184 > HCI Event: Auth Complete (0x06) plen 3
status 0x13 handle 42
Error: Remote User Terminated Connection
2012-07-19 13:24:48.179182 > HCI Event: Disconn Complete (0x05) plen 4
status 0x00 handle 42 reason 0x13
Reason: Remote User Terminated Connection
Signed-off-by: Szymon Janc <szymon.janc@tieto.com>
Acked-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
[bwh: Backported to 3.2:
- Adjust context
- hci_conn_ssp_enabled() is not defined; open-code the condition]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 1f6fc43e621167492ed4b7f3b4269c584c3d6ccc upstream.
libertas currently calls cfg80211_disconnected() when it is being
brought down. This causes an event to be allocated, but since the
wdev is already removed from the rdev by the time that the event
processing work executes, the event is never processed or freed.
http://article.gmane.org/gmane.linux.kernel.wireless.general/95666
Fix this leak, and other possible situations, by processing the event
queue when a device is being unregistered. Thanks to Johannes Berg for
the suggestion.
Signed-off-by: Daniel Drake <dsd@laptop.org>
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit dd4c9260e7f23f2e951cbfb2726e468c6d30306c upstream.
The mesh path timer needs to be canceled when
leaving the mesh as otherwise it could fire
after the interface has been removed already.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit 59ea33a68a9083ac98515e4861c00e71efdc49a1 ]
Back in 2006, commit 1a2449a87b ("[I/OAT]: TCP recv offload to I/OAT")
added support for receive offloading to IOAT dma engine if available.
The code in tcp_rcv_established() tries to perform early DMA copy if
applicable. It however does so without checking whether the userspace
task is actually expecting the data in the buffer.
This is not a problem under normal circumstances, but there is a corner
case where this doesn't work -- and that's when MSG_TRUNC flag to
recvmsg() is used.
If the IOAT dma engine is not used, the code properly checks whether
there is a valid ucopy.task and the socket is owned by userspace, but
misses the check in the dmaengine case.
This problem can be observed in real trivially -- for example 'tbench' is a
good reproducer, as it makes a heavy use of MSG_TRUNC. On systems utilizing
IOAT, you will soon find tbench waiting indefinitely in sk_wait_data(), as they
have been already early-copied in tcp_rcv_established() using dma engine.
This patch introduces the same check we are performing in the simple
iovec copy case to the IOAT case as well. It fixes the indefinite
recvmsg(MSG_TRUNC) hangs.
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit b1beb681cba5358f62e6187340660ade226a5fcc ]
When device flags are set using rtnetlink, IFF_PROMISC and IFF_ALLMULTI
flags are handled specially. Function dev_change_flags sets IFF_PROMISC and
IFF_ALLMULTI bits in dev->gflags according to the passed value but
do_setlink passes a result of rtnl_dev_combine_flags which takes those bits
from dev->flags.
This can be easily trigerred by doing:
tcpdump -i eth0 &
ip l s up eth0
ip sets IFF_UP flag in ifi_flags and ifi_change, which is combined with
IFF_PROMISC by rtnl_dev_combine_flags, causing __dev_change_flags to set
IFF_PROMISC in gflags.
Reported-by: Max Matveev <makc@redhat.com>
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit 42493570100b91ef663c4c6f0c0fdab238f9d3c2 ]
TCP_USER_TIMEOUT is a TCP level socket option that takes an unsigned int. But
patch "tcp: Add TCP_USER_TIMEOUT socket option"(dca43c75) didn't check the negative
values. If a user assign -1 to it, the socket will set successfully and wait
for 4294967295 miliseconds. This patch add a negative value check to avoid
this issue.
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit 8b72ff6484fe303e01498b58621810a114f3cf09 ]
gcc really should warn about these !
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit 89d7ae34cdda4195809a5a987f697a517a2a3177 ]
As reported by Alan Cox, and verified by Lin Ming, when a user
attempts to add a CIPSO option to a socket using the CIPSO_V4_TAG_LOCAL
tag the kernel dies a terrible death when it attempts to follow a NULL
pointer (the skb argument to cipso_v4_validate() is NULL when called via
the setsockopt() syscall).
This patch fixes this by first checking to ensure that the skb is
non-NULL before using it to find the incoming network interface. In
the unlikely case where the skb is NULL and the user attempts to add
a CIPSO option with the _TAG_LOCAL tag we return an error as this is
not something we want to allow.
A simple reproducer, kindly supplied by Lin Ming, although you must
have the CIPSO DOI #3 configure on the system first or you will be
caught early in cipso_v4_validate():
#include <sys/types.h>
#include <sys/socket.h>
#include <linux/ip.h>
#include <linux/in.h>
#include <string.h>
struct local_tag {
char type;
char length;
char info[4];
};
struct cipso {
char type;
char length;
char doi[4];
struct local_tag local;
};
int main(int argc, char **argv)
{
int sockfd;
struct cipso cipso = {
.type = IPOPT_CIPSO,
.length = sizeof(struct cipso),
.local = {
.type = 128,
.length = sizeof(struct local_tag),
},
};
memset(cipso.doi, 0, 4);
cipso.doi[3] = 3;
sockfd = socket(AF_INET, SOCK_DGRAM, 0);
#define SOL_IP 0
setsockopt(sockfd, SOL_IP, IP_OPTIONS,
&cipso, sizeof(struct cipso));
return 0;
}
CC: Lin Ming <mlin@ss.pku.edu.cn>
Reported-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Paul Moore <pmoore@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit 96f80d123eff05c3cd4701463786b87952a6c3ac ]
unregister_netdevice_notifier() must be called before
unregister_pernet_subsys() to avoid accessing already freed
pernet memory. This fixes the following oops when doing rmmod:
Call Trace:
[<ffffffffa0f802bd>] caif_device_notify+0x4d/0x5a0 [caif]
[<ffffffff81552ba9>] unregister_netdevice_notifier+0xb9/0x100
[<ffffffffa0f86dcc>] caif_device_exit+0x1c/0x250 [caif]
[<ffffffff810e7734>] sys_delete_module+0x1a4/0x300
[<ffffffff810da82d>] ? trace_hardirqs_on_caller+0x15d/0x1e0
[<ffffffff813517de>] ? trace_hardirqs_on_thunk+0x3a/0x3
[<ffffffff81696bad>] system_call_fastpath+0x1a/0x1f
RIP
[<ffffffffa0f7f561>] caif_get+0x51/0xb0 [caif]
Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit 2eebc1e188e9e45886ee00662519849339884d6d ]
A few days ago Dave Jones reported this oops:
[22766.294255] general protection fault: 0000 [#1] PREEMPT SMP
[22766.295376] CPU 0
[22766.295384] Modules linked in:
[22766.387137] ffffffffa169f292 6b6b6b6b6b6b6b6b ffff880147c03a90
ffff880147c03a74
[22766.387135] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 00000000000
[22766.387136] Process trinity-watchdo (pid: 10896, threadinfo ffff88013e7d2000,
[22766.387137] Stack:
[22766.387140] ffff880147c03a10
[22766.387140] ffffffffa169f2b6
[22766.387140] ffff88013ed95728
[22766.387143] 0000000000000002
[22766.387143] 0000000000000000
[22766.387143] ffff880003fad062
[22766.387144] ffff88013c120000
[22766.387144]
[22766.387145] Call Trace:
[22766.387145] <IRQ>
[22766.387150] [<ffffffffa169f292>] ? __sctp_lookup_association+0x62/0xd0
[sctp]
[22766.387154] [<ffffffffa169f2b6>] __sctp_lookup_association+0x86/0xd0 [sctp]
[22766.387157] [<ffffffffa169f597>] sctp_rcv+0x207/0xbb0 [sctp]
[22766.387161] [<ffffffff810d4da8>] ? trace_hardirqs_off_caller+0x28/0xd0
[22766.387163] [<ffffffff815827e3>] ? nf_hook_slow+0x133/0x210
[22766.387166] [<ffffffff815902fc>] ? ip_local_deliver_finish+0x4c/0x4c0
[22766.387168] [<ffffffff8159043d>] ip_local_deliver_finish+0x18d/0x4c0
[22766.387169] [<ffffffff815902fc>] ? ip_local_deliver_finish+0x4c/0x4c0
[22766.387171] [<ffffffff81590a07>] ip_local_deliver+0x47/0x80
[22766.387172] [<ffffffff8158fd80>] ip_rcv_finish+0x150/0x680
[22766.387174] [<ffffffff81590c54>] ip_rcv+0x214/0x320
[22766.387176] [<ffffffff81558c07>] __netif_receive_skb+0x7b7/0x910
[22766.387178] [<ffffffff8155856c>] ? __netif_receive_skb+0x11c/0x910
[22766.387180] [<ffffffff810d423e>] ? put_lock_stats.isra.25+0xe/0x40
[22766.387182] [<ffffffff81558f83>] netif_receive_skb+0x23/0x1f0
[22766.387183] [<ffffffff815596a9>] ? dev_gro_receive+0x139/0x440
[22766.387185] [<ffffffff81559280>] napi_skb_finish+0x70/0xa0
[22766.387187] [<ffffffff81559cb5>] napi_gro_receive+0xf5/0x130
[22766.387218] [<ffffffffa01c4679>] e1000_receive_skb+0x59/0x70 [e1000e]
[22766.387242] [<ffffffffa01c5aab>] e1000_clean_rx_irq+0x28b/0x460 [e1000e]
[22766.387266] [<ffffffffa01c9c18>] e1000e_poll+0x78/0x430 [e1000e]
[22766.387268] [<ffffffff81559fea>] net_rx_action+0x1aa/0x3d0
[22766.387270] [<ffffffff810a495f>] ? account_system_vtime+0x10f/0x130
[22766.387273] [<ffffffff810734d0>] __do_softirq+0xe0/0x420
[22766.387275] [<ffffffff8169826c>] call_softirq+0x1c/0x30
[22766.387278] [<ffffffff8101db15>] do_softirq+0xd5/0x110
[22766.387279] [<ffffffff81073bc5>] irq_exit+0xd5/0xe0
[22766.387281] [<ffffffff81698b03>] do_IRQ+0x63/0xd0
[22766.387283] [<ffffffff8168ee2f>] common_interrupt+0x6f/0x6f
[22766.387283] <EOI>
[22766.387284]
[22766.387285] [<ffffffff8168eed9>] ? retint_swapgs+0x13/0x1b
[22766.387285] Code: c0 90 5d c3 66 0f 1f 44 00 00 4c 89 c8 5d c3 0f 1f 00 55 48
89 e5 48 83
ec 20 48 89 5d e8 4c 89 65 f0 4c 89 6d f8 66 66 66 66 90 <0f> b7 87 98 00 00 00
48 89 fb
49 89 f5 66 c1 c0 08 66 39 46 02
[22766.387307]
[22766.387307] RIP
[22766.387311] [<ffffffffa168a2c9>] sctp_assoc_is_match+0x19/0x90 [sctp]
[22766.387311] RSP <ffff880147c039b0>
[22766.387142] ffffffffa16ab120
[22766.599537] ---[ end trace 3f6dae82e37b17f5 ]---
[22766.601221] Kernel panic - not syncing: Fatal exception in interrupt
It appears from his analysis and some staring at the code that this is likely
occuring because an association is getting freed while still on the
sctp_assoc_hashtable. As a result, we get a gpf when traversing the hashtable
while a freed node corrupts part of the list.
Nominally I would think that an mibalanced refcount was responsible for this,
but I can't seem to find any obvious imbalance. What I did note however was
that the two places where we create an association using
sctp_primitive_ASSOCIATE (__sctp_connect and sctp_sendmsg), have failure paths
which free a newly created association after calling sctp_primitive_ASSOCIATE.
sctp_primitive_ASSOCIATE brings us into the sctp_sf_do_prm_asoc path, which
issues a SCTP_CMD_NEW_ASOC side effect, which in turn adds a new association to
the aforementioned hash table. the sctp command interpreter that process side
effects has not way to unwind previously processed commands, so freeing the
association from the __sctp_connect or sctp_sendmsg error path would lead to a
freed association remaining on this hash table.
I've fixed this but modifying sctp_[un]hash_established to use hlist_del_init,
which allows us to proerly use hlist_unhashed to check if the node is on a
hashlist safely during a delete. That in turn alows us to safely call
sctp_unhash_established in the __sctp_connect and sctp_sendmsg error paths
before freeing them, regardles of what the associations state is on the hash
list.
I noted, while I was doing this, that the __sctp_unhash_endpoint was using
hlist_unhsashed in a simmilar fashion, but never nullified any removed nodes
pointers to make that function work properly, so I fixed that up in a simmilar
fashion.
I attempted to test this using a virtual guest running the SCTP_RR test from
netperf in a loop while running the trinity fuzzer, both in a loop. I wasn't
able to recreate the problem prior to this fix, nor was I able to trigger the
failure after (neither of which I suppose is suprising). Given the trace above
however, I think its likely that this is what we hit.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Reported-by: davej@redhat.com
CC: davej@redhat.com
CC: "David S. Miller" <davem@davemloft.net>
CC: Vlad Yasevich <vyasevich@gmail.com>
CC: Sridhar Samudrala <sri@us.ibm.com>
CC: linux-sctp@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit 7ac2908e4b2edaec60e9090ddb4d9ceb76c05e7d ]
Resolves-bug: https://bugzilla.kernel.org/show_bug.cgi?id=44461
Signed-off-by: Alan Cox <alan@linux.intel.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit bec4596b4e6770c7037f21f6bd27567b152dc0d6 upstream.
drop_monitor calls several sleeping functions while in atomic context.
BUG: sleeping function called from invalid context at mm/slub.c:943
in_atomic(): 1, irqs_disabled(): 0, pid: 2103, name: kworker/0:2
Pid: 2103, comm: kworker/0:2 Not tainted 3.5.0-rc1+ #55
Call Trace:
[<ffffffff810697ca>] __might_sleep+0xca/0xf0
[<ffffffff811345a3>] kmem_cache_alloc_node+0x1b3/0x1c0
[<ffffffff8105578c>] ? queue_delayed_work_on+0x11c/0x130
[<ffffffff815343fb>] __alloc_skb+0x4b/0x230
[<ffffffffa00b0360>] ? reset_per_cpu_data+0x160/0x160 [drop_monitor]
[<ffffffffa00b022f>] reset_per_cpu_data+0x2f/0x160 [drop_monitor]
[<ffffffffa00b03ab>] send_dm_alert+0x4b/0xb0 [drop_monitor]
[<ffffffff810568e0>] process_one_work+0x130/0x4c0
[<ffffffff81058249>] worker_thread+0x159/0x360
[<ffffffff810580f0>] ? manage_workers.isra.27+0x240/0x240
[<ffffffff8105d403>] kthread+0x93/0xa0
[<ffffffff816be6d4>] kernel_thread_helper+0x4/0x10
[<ffffffff8105d370>] ? kthread_freezable_should_stop+0x80/0x80
[<ffffffff816be6d0>] ? gs_change+0xb/0xb
Rework the logic to call the sleeping functions in right context.
Use standard timer/workqueue api to let system chose any cpu to perform
the allocation and netlink send.
Also avoid a loop if reset_per_cpu_data() cannot allocate memory :
use mod_timer() to wait 1/10 second before next try.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Reviewed-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 4fdcfa12843bca38d0c9deff70c8720e4e8f515f upstream.
I just noticed after some recent updates, that the init path for the drop
monitor protocol has a minor error. drop monitor maintains a per cpu structure,
that gets initalized from a single cpu. Normally this is fine, as the protocol
isn't in use yet, but I recently made a change that causes a failed skb
allocation to reschedule itself . Given the current code, the implication is
that this workqueue reschedule will take place on the wrong cpu. If drop
monitor is used early during the boot process, its possible that two cpus will
access a single per-cpu structure in parallel, possibly leading to data
corruption.
This patch fixes the situation, by storing the cpu number that a given instance
of this per-cpu data should be accessed from. In the case of a need for a
reschedule, the cpu stored in the struct is assigned the rescheule, rather than
the currently executing cpu
Tested successfully by myself.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: David Miller <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 3885ca785a3618593226687ced84f3f336dc3860 upstream.
Eric Dumazet pointed out to me that the drop_monitor protocol has some holes in
its smp protections. Specifically, its possible to replace data->skb while its
being written. This patch corrects that by making data->skb an rcu protected
variable. That will prevent it from being overwritten while a tracepoint is
modifying it.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: David Miller <davem@davemloft.net>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit cde2e9a651b76d8db36ae94cd0febc82b637e5dd upstream.
Eric Dumazet pointed out this warning in the drop_monitor protocol to me:
[ 38.352571] BUG: sleeping function called from invalid context at kernel/mutex.c:85
[ 38.352576] in_atomic(): 1, irqs_disabled(): 0, pid: 4415, name: dropwatch
[ 38.352580] Pid: 4415, comm: dropwatch Not tainted 3.4.0-rc2+ #71
[ 38.352582] Call Trace:
[ 38.352592] [<ffffffff8153aaf0>] ? trace_napi_poll_hit+0xd0/0xd0
[ 38.352599] [<ffffffff81063f2a>] __might_sleep+0xca/0xf0
[ 38.352606] [<ffffffff81655b16>] mutex_lock+0x26/0x50
[ 38.352610] [<ffffffff8153aaf0>] ? trace_napi_poll_hit+0xd0/0xd0
[ 38.352616] [<ffffffff810b72d9>] tracepoint_probe_register+0x29/0x90
[ 38.352621] [<ffffffff8153a585>] set_all_monitor_traces+0x105/0x170
[ 38.352625] [<ffffffff8153a8ca>] net_dm_cmd_trace+0x2a/0x40
[ 38.352630] [<ffffffff8154a81a>] genl_rcv_msg+0x21a/0x2b0
[ 38.352636] [<ffffffff810f8029>] ? zone_statistics+0x99/0xc0
[ 38.352640] [<ffffffff8154a600>] ? genl_rcv+0x30/0x30
[ 38.352645] [<ffffffff8154a059>] netlink_rcv_skb+0xa9/0xd0
[ 38.352649] [<ffffffff8154a5f0>] genl_rcv+0x20/0x30
[ 38.352653] [<ffffffff81549a7e>] netlink_unicast+0x1ae/0x1f0
[ 38.352658] [<ffffffff81549d76>] netlink_sendmsg+0x2b6/0x310
[ 38.352663] [<ffffffff8150824f>] sock_sendmsg+0x10f/0x130
[ 38.352668] [<ffffffff8150abe0>] ? move_addr_to_kernel+0x60/0xb0
[ 38.352673] [<ffffffff81515f04>] ? verify_iovec+0x64/0xe0
[ 38.352677] [< |