aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2014-01-05bnx2x: Correct number of MSI-X vectors for VFsMichal Kalderon
Number of VFs in PCIe configuration space is zero-based. Driver incorrectly sets the number of VFs to be larger by one than what actually is feasible by HW, which might cause later VFs to fail to allocate their MSI-X interrupts. Signed-off-by: Michal Kalderon <michals@broadcom.com> Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com> Signed-off-by: Ariel Elior <ariele@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-05bnx2x: limit number of interrupt vectors for 57711Dmitry Kravkov
Original straightforward division may lead to zeroing number of SB and null-pointer dereference when device is short of MSIX vectors or lacks MSIX capabilities. Reported-by: Vladislav Zolotarov <vladz@cloudius-systems.com> Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com> Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com> Signed-off-by: Ariel Elior <ariele@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-06Merge branch 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-armLinus Torvalds
Pull ARM fixes from Russell King: "Another set of small fixes for ARM, covering various areas. Laura fixed a long standing issue with virt_addr_valid() failing to handle holes in memory. Steve found a problem with dcache flushing for compound pages. I fixed another bug in footbridge stuff causing time to tick slowly, and also a problem with the AES code which can cause linker errors. A patch from Rob which fixes Xen problems induced by a lack of consistency in our naming of ioremap_cache() - which thankfully has very few users" * 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm: ARM: 7933/1: rename ioremap_cached to ioremap_cache ARM: fix "bad mode in ... handler" message for undefined instructions CRYPTO: Fix more AES build errors ARM: 7931/1: Correct virt_addr_valid ARM: 7923/1: mm: fix dcache flush logic for compound high pages ARM: fix footbridge clockevent device
2014-01-05Merge branches 'acpi-ac' and 'acpi-tpm'Rafael J. Wysocki
* acpi-ac: ACPI / AC: change notification handler type to ACPI_ALL_NOTIFY * acpi-tpm: ACPI / TPM: fix memory leak when walking ACPI namespace
2014-01-05ACPI / TPM: fix memory leak when walking ACPI namespaceJiang Liu
In function ppi_callback(), memory allocated by acpi_get_name() will get leaked when current device isn't the desired TPM device, so fix the memory leak. Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Cc: All applicable <stable@vger.kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-01-05ACPI / AC: change notification handler type to ACPI_ALL_NOTIFYAlexander Mezin
With kernel 3.13rc5 there are no AC adapter notifications on my laptop. Commit cc8ef5270734 "ACPI / AC: convert ACPI ac driver to platform bus" changed the driver to listen to device notifications only. However, AML code on my laptop notifies the driver with zero event. This patch changes the driver to listen to all events again. Fixes: cc8ef5270734 (ACPI / AC: convert ACPI ac driver to platform bus) References: https://bugzilla.kernel.org/show_bug.cgi?id=67821 Signed-off-by: Alexander Mezin <mezin.alexander@gmail.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-01-05ARM: 7933/1: rename ioremap_cached to ioremap_cacheRob Herring
ioremap_cache is more aligned with other architectures. There are only 2 users of this in the kernel: pxa2xx-flash and Xen. This fixes Xen build failures on arm64: drivers/tty/hvc/hvc_xen.c:233:2: error: implicit declaration of function 'ioremap_cached' [-Werror=implicit-function-declaration] drivers/xen/grant-table.c:1174:3: error: implicit declaration of function 'ioremap_cached' [-Werror=implicit-function-declaration] drivers/xen/xenbus/xenbus_probe.c:778:4: error: implicit declaration of function 'ioremap_cached' [-Werror=implicit-function-declaration] Signed-off-by: Rob Herring <rob.herring@calxeda.com> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2014-01-05ARM: fix "bad mode in ... handler" message for undefined instructionsRussell King
The array was missing the final entry for the undefined instruction exception handler; this commit adds it. Cc: <stable@vger.kernel.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2014-01-05CRYPTO: Fix more AES build errorsRussell King
Building a multi-arch kernel results in: arch/arm/crypto/built-in.o: In function `aesbs_xts_decrypt': sha1_glue.c:(.text+0x15c8): undefined reference to `bsaes_xts_decrypt' arch/arm/crypto/built-in.o: In function `aesbs_xts_encrypt': sha1_glue.c:(.text+0x1664): undefined reference to `bsaes_xts_encrypt' arch/arm/crypto/built-in.o: In function `aesbs_ctr_encrypt': sha1_glue.c:(.text+0x184c): undefined reference to `bsaes_ctr32_encrypt_blocks' arch/arm/crypto/built-in.o: In function `aesbs_cbc_decrypt': sha1_glue.c:(.text+0x19b4): undefined reference to `bsaes_cbc_encrypt' This code is already runtime-conditional on NEON being supported, so there's no point compiling it out depending on the minimum build architecture. Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2014-01-04Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparcLinus Torvalds
Pull sparc bugfixes from David Miller: 1) Missing include can lead to build failure, from Kirill Tkhai. 2) Use dev_is_pci() where applicable, from Yijing Wang. 3) Enable irqs after we enable preemption in cpu startup path, from Kirill Tkhai. 4) Revert a __copy_{to,from}_user_inatomic change that broke iov_iter_copy_from_user_atomic() and thus several tests in xfstests and LTP. From Dave Kleikamp. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc: Revert "sparc64: Fix __copy_{to,from}_user_inatomic defines." sparc64: smp_callin: Enable irqs after preemption is disabled sparc/PCI: Use dev_is_pci() to identify PCI devices sparc64: Fix build regression
2014-01-04Revert "sparc64: Fix __copy_{to,from}_user_inatomic defines."Dave Kleikamp
This reverts commit 145e1c0023585e0e8f6df22316308ec61c5066b2. This commit broke the behavior of __copy_from_user_inatomic when it is only partially successful. Instead of returning the number of bytes not copied, it now returns 1. This translates to the wrong value being returned by iov_iter_copy_from_user_atomic. xfstests generic/246 and LTP writev01 both fail on btrfs and nfs because of this. Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com> Cc: Hugh Dickins <hughd@google.com> Cc: David S. Miller <davem@davemloft.net> Cc: sparclinux@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-04sparc64: smp_callin: Enable irqs after preemption is disabledKirill Tkhai
Most of other architectures have below suggested order. So lets do the same to fit generic idle loop scheme better. Signed-off-by: Kirill Tkhai <tkhai@yandex.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-04sparc/PCI: Use dev_is_pci() to identify PCI devicesYijing Wang
Use dev_is_pci() instead of checking bus type directly. Signed-off-by: Yijing Wang <wangyijing@huawei.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-04Linux 3.13-rc7v3.13-rc7Linus Torvalds
2014-01-04NFC: Fix target mode p2p link establishmentArron Wang
With commit e29a9e2ae165620d, we set the active_target pointer from nfc_dep_link_is_up() in order to support the case where the target detection and the DEP link setting are done atomically by the driver. That can only happen in initiator mode, so we need to check for that otherwise we fail to bring a p2p link in target mode. Signed-off-by: Arron Wang <arron.wang@intel.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2014-01-03qlcnic: Fix bug in Tx completion pathShahed Shaikh
o Driver is using common tx_clean_lock for all Tx queues. This patch adds per queue tx_clean_lock. o Driver is not updating sw_consumer while processing Tx completion when interface is going down. Fixed in this patch. Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com> Signed-off-by: Manish Chopra <manish.chopra@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-03infiniband: make sure the src net is infiniband when create new linkHangbin Liu
When we create a new infiniband link with uninfiniband device, e.g. `ip link add link em1 type ipoib pkey 0x8001`. We will get a NULL pointer dereference cause other dev like Ethernet don't have struct ib_device. The code path is: rtnl_newlink |-- ipoib_new_child_link |-- __ipoib_vlan_add |-- ipoib_set_dev_features |-- ib_query_device Fix this bug by make sure the src net is infiniband when create new link. Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-03{vxlan, inet6} Mark vxlan_dev flags with VXLAN_F_IPV6 properlyfan.du
Even if user doesn't supply the physical netdev to attach vxlan dev to, and at the same time user want to vxlan sit top of IPv6, mark vxlan_dev flags with VXLAN_F_IPV6 to create IPv6 based socket. Otherwise kernel crashes safely every time spitting below messages, Steps to reproduce: ip link add vxlan0 type vxlan id 42 group ff0e::110 ip link set vxlan0 up [ 62.656266] BUG: unable to handle kernel NULL pointer dereference[ 62.656320] ip (3008) used greatest stack depth: 3912 bytes left at 0000000000000046 [ 62.656423] IP: [<ffffffff816d822d>] ip6_route_output+0xbd/0xe0 [ 62.656525] PGD 2c966067 PUD 2c9a2067 PMD 0 [ 62.656674] Oops: 0000 [#1] SMP [ 62.656781] Modules linked in: vxlan netconsole deflate zlib_deflate af_key [ 62.657083] CPU: 1 PID: 2128 Comm: whoopsie Not tainted 3.12.0+ #182 [ 62.657083] Hardware name: innotek GmbH VirtualBox, BIOS VirtualBox 12/01/2006 [ 62.657083] task: ffff88002e2335d0 ti: ffff88002c94c000 task.ti: ffff88002c94c000 [ 62.657083] RIP: 0010:[<ffffffff816d822d>] [<ffffffff816d822d>] ip6_route_output+0xbd/0xe0 [ 62.657083] RSP: 0000:ffff88002fd038f8 EFLAGS: 00210296 [ 62.657083] RAX: 0000000000000000 RBX: ffff88002fd039e0 RCX: 0000000000000000 [ 62.657083] RDX: ffff88002fd0eb68 RSI: ffff88002fd0d278 RDI: ffff88002fd0d278 [ 62.657083] RBP: ffff88002fd03918 R08: 0000000002000000 R09: 0000000000000000 [ 62.657083] R10: 00000000000001ff R11: 0000000000000000 R12: 0000000000000001 [ 62.657083] R13: ffff88002d96b480 R14: ffffffff81c8e2c0 R15: 0000000000000001 [ 62.657083] FS: 0000000000000000(0000) GS:ffff88002fd00000(0063) knlGS:00000000f693b740 [ 62.657083] CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033 [ 62.657083] CR2: 0000000000000046 CR3: 000000002c9d2000 CR4: 00000000000006e0 [ 62.657083] Stack: [ 62.657083] ffff88002fd03a40 ffffffff81c8e2c0 ffff88002fd039e0 ffff88002d96b480 [ 62.657083] ffff88002fd03958 ffffffff816cac8b ffff880019277cc0 ffff8800192b5d00 [ 62.657083] ffff88002d5bc000 ffff880019277cc0 0000000000001821 0000000000000001 [ 62.657083] Call Trace: [ 62.657083] <IRQ> [ 62.657083] [<ffffffff816cac8b>] ip6_dst_lookup_tail+0xdb/0xf0 [ 62.657083] [<ffffffff816caea0>] ip6_dst_lookup+0x10/0x20 [ 62.657083] [<ffffffffa0020c13>] vxlan_xmit_one+0x193/0x9c0 [vxlan] [ 62.657083] [<ffffffff8137b3b7>] ? account+0xc7/0x1f0 [ 62.657083] [<ffffffffa0021513>] vxlan_xmit+0xd3/0x400 [vxlan] [ 62.657083] [<ffffffff8161390d>] dev_hard_start_xmit+0x49d/0x5e0 [ 62.657083] [<ffffffff81613d29>] dev_queue_xmit+0x2d9/0x480 [ 62.657083] [<ffffffff817cb854>] ? _raw_write_unlock_bh+0x14/0x20 [ 62.657083] [<ffffffff81630565>] ? eth_header+0x35/0xe0 [ 62.657083] [<ffffffff8161bc5e>] neigh_resolve_output+0x11e/0x1e0 [ 62.657083] [<ffffffff816ce0e0>] ? ip6_fragment+0xad0/0xad0 [ 62.657083] [<ffffffff816cb465>] ip6_finish_output2+0x2f5/0x470 [ 62.657083] [<ffffffff816ce166>] ip6_finish_output+0x86/0xc0 [ 62.657083] [<ffffffff816ce218>] ip6_output+0x78/0xb0 [ 62.657083] [<ffffffff816eadd6>] mld_sendpack+0x256/0x2a0 [ 62.657083] [<ffffffff816ebd8c>] mld_ifc_timer_expire+0x17c/0x290 [ 62.657083] [<ffffffff816ebc10>] ? igmp6_timer_handler+0x80/0x80 [ 62.657083] [<ffffffff816ebc10>] ? igmp6_timer_handler+0x80/0x80 [ 62.657083] [<ffffffff81051065>] call_timer_fn+0x45/0x150 [ 62.657083] [<ffffffff816ebc10>] ? igmp6_timer_handler+0x80/0x80 [ 62.657083] [<ffffffff81052353>] run_timer_softirq+0x1f3/0x2a0 [ 62.657083] [<ffffffff8102dfd8>] ? lapic_next_event+0x18/0x20 [ 62.657083] [<ffffffff8109e36f>] ? clockevents_program_event+0x6f/0x110 [ 62.657083] [<ffffffff8104a2f6>] __do_softirq+0xd6/0x2b0 [ 62.657083] [<ffffffff8104a75e>] irq_exit+0x7e/0xa0 [ 62.657083] [<ffffffff8102ea15>] smp_apic_timer_interrupt+0x45/0x60 [ 62.657083] [<ffffffff817d3eca>] apic_timer_interrupt+0x6a/0x70 [ 62.657083] <EOI> [ 62.657083] [<ffffffff817d4a35>] ? sysenter_dispatch+0x7/0x1a [ 62.657083] Code: 4d 8b 85 a8 02 00 00 4c 89 e9 ba 03 04 00 00 48 c7 c6 c0 be 8d 81 48 c7 c7 48 35 a3 81 31 c0 e8 db 68 0e 00 49 8b 85 a8 02 00 00 <0f> b6 40 46 c0 e8 05 0f b6 c0 c1 e0 03 41 09 c4 e9 77 ff ff ff [ 62.657083] RIP [<ffffffff816d822d>] ip6_route_output+0xbd/0xe0 [ 62.657083] RSP <ffff88002fd038f8> [ 62.657083] CR2: 0000000000000046 [ 62.657083] ---[ end trace ba8a9583d7cd1934 ]--- [ 62.657083] Kernel panic - not syncing: Fatal exception in interrupt Signed-off-by: Fan Du <fan.du@windriver.com> Reported-by: Ryan Whelan <rcwhelan@gmail.com> Acked-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-03cxgb4: allow large buffer size to have page sizeThadeu Lima de Souza Cascardo
Since commit 52367a763d8046190754ab43743e42638564a2d1 ("cxgb4/cxgb4vf: Code cleanup to enable T4 Configuration File support"), we have failures like this during cxgb4 probe: cxgb4 0000:01:00.4: bad SGE FL page buffer sizes [65536, 65536] cxgb4: probe of 0000:01:00.4 failed with error -22 This happens whenever software parameters are used, without a configuration file. That happens when the hardware was already initialized (after kexec, or after csiostor is loaded). It happens that these values are acceptable, rendering fl_pg_order equal to 0, which is the case of a hard init when the page size is equal or larger than 65536. Accepting fl_large_pg equal to fl_small_pg solves the issue, and shouldn't cause any trouble besides a possible performance reduction when smaller pages are used. And that can be fixed by a configuration file. Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-03Merge tag 'for-v3.13-fixes' of git://git.infradead.org/battery-2.6Linus Torvalds
Pull battery fixes from Anton Vorontsov: "Two fixes: - fix build error caused by max17042_battery conversion to the regmap API. - fix kernel oops when booting with wakeup_source_activate enabled" * tag 'for-v3.13-fixes' of git://git.infradead.org/battery-2.6: max17042_battery: Fix build errors caused by missing REGMAP_I2C config power_supply: Fix Oops from NULL pointer dereference from wakeup_source_activate
2014-01-03Merge tag 'pm+acpi-3.13-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI and PM fixes and new device IDs from Rafael Wysocki: "These commits, except for one, are regression fixes and the remaining one fixes a divide error leading to a kernel panic. The majority of the regressions fixed here were introduced during the 3.12 cycle, one of them is from this cycle and one is older. Specifics: - VGA switcheroo was broken for some users as a result of the ACPI-based PCI hotplug (ACPIPHP) changes in 3.12, because some previously ignored hotplug events started to be handled. The fix causes them to be ignored again. - There are two more issues related to cpufreq's suspend/resume handling changes from the 3.12 cycle addressed by Viresh Kumar's fixes. - intel_pstate triggers a divide error in a timer function if the P-state information it needs is missing during initialization. This leads to kernel panics on nested KVM clients and is fixed by failing the initialization cleanly in those cases. - PCI initalization code changes during the 3.9 cycle uncovered BIOS issues related to ACPI wakeup notifications (some BIOSes send them for devices that aren't supposed to support ACPI wakeup). Work around them by installing an ACPI wakeup notify handler for all PCI devices with ACPI support. - The Calxeda cpuilde driver's probe function is tagged as __init, which is incorrect and causes a section mismatch to occur during build. Fix from Andre Przywara removes the __init tag from there. - During the 3.12 cycle ACPIPHP started to print warnings about missing _ADR for devices that legitimately don't have it. Fix from Toshi Kani makes it only print the warnings where they make sense" * tag 'pm+acpi-3.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPIPHP / radeon / nouveau: Fix VGA switcheroo problem related to hotplug intel_pstate: Fail initialization if P-state information is missing ARM/cpuidle: remove __init tag from Calxeda cpuidle probe function PCI / ACPI: Install wakeup notify handlers for all PCI devs with ACPI cpufreq: preserve user_policy across suspend/resume cpufreq: Clean up after a failing light-weight initialization ACPI / PCI / hotplug: Avoid warning when _ADR not present
2014-01-02netpoll: Fix missing TXQ unlock and and OOPS.David S. Miller
The VLAN tag handling code in netpoll_send_skb_on_dev() has two problems. 1) It exits without unlocking the TXQ. 2) It then tries to queue a NULL skb to npinfo->txq. Reported-by: Ahmed Tamrawi <atamrawi@iastate.edu> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-02ipv6: fix the use of pcpu_tstats in ip6_vti.cLi RongQing
when read/write the 64bit data, the correct lock should be hold. and we can use the generic vti6_get_stats to return stats, and not define a new one in ip6_vti.c Fixes: 87b6d218f3adb ("tunnel: implement 64 bits statistics") Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-02ipv6: fix the use of pcpu_tstats in ip6_tunnelLi RongQing
when read/write the 64bit data, the correct lock should be hold. Fixes: 87b6d218f3adb ("tunnel: implement 64 bits statistics") Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-02ipv6 addrconf: fix preferred lifetime state-changing behavior while ↵Yasushi Asano
valid_lft is infinity Fixed a problem with setting the lifetime of an IPv6 address. When setting preferred_lft to a value not zero or infinity, while valid_lft is infinity(0xffffffff) preferred lifetime is set to forever and does not update. Therefore preferred lifetime never becomes deprecated. valid lifetime and preferred lifetime should be set independently, even if valid lifetime is infinity, preferred lifetime must expire correctly (meaning it must eventually become deprecated) Signed-off-by: Yasushi Asano <yasushi.asano@jp.fujitsu.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-02net: llc: fix use after free in llc_ui_recvmsgDaniel Borkmann
While commit 30a584d944fb fixes datagram interface in LLC, a use after free bug has been introduced for SOCK_STREAM sockets that do not make use of MSG_PEEK. The flow is as follow ... if (!(flags & MSG_PEEK)) { ... sk_eat_skb(sk, skb, false); ... } ... if (used + offset < skb->len) continue; ... where sk_eat_skb() calls __kfree_skb(). Therefore, cache original length and work on skb_len to check partial reads. Fixes: 30a584d944fb ("[LLX]: SOCK_DGRAM interface fixes") Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-02virtio-net: fix refill races during restoreJason Wang
During restoring, try_fill_recv() was called with neither napi lock nor napi disabled. This can lead two try_fill_recv() was called in the same time. Fix this by refilling before trying to enable napi. Fixes 0741bcb5584f9e2390ae6261573c4de8314999f2 (virtio: net: Add freeze, restore handlers to support S4). Cc: Amit Shah <amit.shah@redhat.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-02ipv4: fix tunneled VM traffic over hw VXLAN/GRE GSO NICWei-Chun Chao
VM to VM GSO traffic is broken if it goes through VXLAN or GRE tunnel and the physical NIC on the host supports hardware VXLAN/GRE GSO offload (e.g. bnx2x and next-gen mlx4). Two issues - (VXLAN) VM traffic has SKB_GSO_DODGY and SKB_GSO_UDP_TUNNEL with SKB_GSO_TCP/UDP set depending on the inner protocol. GSO header integrity check fails in udp4_ufo_fragment if inner protocol is TCP. Also gso_segs is calculated incorrectly using skb->len that includes tunnel header. Fix: robust check should only be applied to the inner packet. (VXLAN & GRE) Once GSO header integrity check passes, NULL segs is returned and the original skb is sent to hardware. However the tunnel header is already pulled. Fix: tunnel header needs to be restored so that hardware can perform GSO properly on the original packet. Signed-off-by: Wei-Chun Chao <weichunc@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-02Merge git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull kvm bugfixes from Marcelo Tosatti. * git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: nVMX: Unconditionally uninit the MMU on nested vmexit KVM: x86: Fix APIC map calculation after re-enabling
2014-01-02Merge branch 'akpm' (incoming from Andrew)Linus Torvalds
Merge patches from Andrew Morton: "Ten fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: epoll: do not take the nested ep->mtx on EPOLL_CTL_DEL sh: add EXPORT_SYMBOL(min_low_pfn) and EXPORT_SYMBOL(max_low_pfn) to sh_ksyms_32.c drivers/dma/ioat/dma.c: check DMA mapping error in ioat_dma_self_test() mm/memory-failure.c: transfer page count from head page to tail page after split thp MAINTAINERS: set up proper record for Xilinx Zynq mm: remove bogus warning in copy_huge_pmd() memcg: fix memcg_size() calculation mm: fix use-after-free in sys_remap_file_pages mm: munlock: fix deadlock in __munlock_pagevec() mm: munlock: fix a bug where THP tail page is encountered
2014-01-02epoll: do not take the nested ep->mtx on EPOLL_CTL_DELJason Baron
The EPOLL_CTL_DEL path of epoll contains a classic, ab-ba deadlock. That is, epoll_ctl(a, EPOLL_CTL_DEL, b, x), will deadlock with epoll_ctl(b, EPOLL_CTL_DEL, a, x). The deadlock was introduced with commmit 67347fe4e632 ("epoll: do not take global 'epmutex' for simple topologies"). The acquistion of the ep->mtx for the destination 'ep' was added such that a concurrent EPOLL_CTL_ADD operation would see the correct state of the ep (Specifically, the check for '!list_empty(&f.file->f_ep_links') However, by simply not acquiring the lock, we do not serialize behind the ep->mtx from the add path, and thus may perform a full path check when if we had waited a little longer it may not have been necessary. However, this is a transient state, and performing the full loop checking in this case is not harmful. The important point is that we wouldn't miss doing the full loop checking when required, since EPOLL_CTL_ADD always locks any 'ep's that its operating upon. The reason we don't need to do lock ordering in the add path, is that we are already are holding the global 'epmutex' whenever we do the double lock. Further, the original posting of this patch, which was tested for the intended performance gains, did not perform this additional locking. Signed-off-by: Jason Baron <jbaron@akamai.com> Cc: Nathan Zimmer <nzimmer@sgi.com> Cc: Eric Wong <normalperson@yhbt.net> Cc: Nelson Elhage <nelhage@nelhage.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Davide Libenzi <davidel@xmailserver.org> Cc: "Paul E. McKenney" <paulmck@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-02sh: add EXPORT_SYMBOL(min_low_pfn) and EXPORT_SYMBOL(max_low_pfn) to ↵Nobuhiro Iwamatsu
sh_ksyms_32.c Min_low_pfn and max_low_pfn were used in pfn_valid macro if defined CONFIG_FLATMEM. When the functions that use the pfn_valid is used in driver module, max_low_pfn and min_low_pfn is to undefined, and fail to build. ERROR: "min_low_pfn" [drivers/block/aoe/aoe.ko] undefined! ERROR: "max_low_pfn" [drivers/block/aoe/aoe.ko] undefined! make[2]: *** [__modpost] Error 1 make[1]: *** [modules] Error 2 This patch fix this problem. Signed-off-by: Nobuhiro Iwamatsu <nobuhiro.iwamatsu.yj@renesas.com> Cc: Kuninori Morimoto <kuninori.morimoto.gx@gmail.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-02drivers/dma/ioat/dma.c: check DMA mapping error in ioat_dma_self_test()Jiang Liu
Check DMA mapping return values in function ioat_dma_self_test() to get rid of following warning message. ------------[ cut here ]------------ WARNING: CPU: 0 PID: 1203 at lib/dma-debug.c:937 check_unmap+0x4c0/0x9a0() ioatdma 0000:00:04.0: DMA-API: device driver failed to check map error[device address=0x000000085191b000] [size=2000 bytes] [mapped as single] Modules linked in: ioatdma(+) mac_hid wmi acpi_pad lp parport hidd_generic usbhid hid ixgbe isci dca libsas ahci ptp libahci scsi_transport_sas meegaraid_sas pps_core mdio CPU: 0 PID: 1203 Comm: systemd-udevd Not tainted 3.13.0-rc4+ #8 Hardware name: Intel Corporation BRICKLAND/BRICKLAND, BIOS BRIVTIIN1.86B.0044.L09.1311181644 11/18/2013 Call Trace: dump_stack+0x4d/0x66 warn_slowpath_common+0x7d/0xa0 warn_slowpath_fmt+0x4c/0x50 check_unmap+0x4c0/0x9a0 debug_dma_unmap_page+0x81/0x90 ioat_dma_self_test+0x3d2/0x680 [ioatdma] ioat3_dma_self_test+0x12/0x30 [ioatdma] ioat_probe+0xf4/0x110 [ioatdma] ioat3_dma_probe+0x268/0x410 [ioatdma] ioat_pci_probe+0x122/0x1b0 [ioatdma] local_pci_probe+0x45/0xa0 pci_device_probe+0xd9/0x130 driver_probe_device+0x171/0x490 __driver_attach+0x93/0xa0 bus_for_each_dev+0x6b/0xb0 driver_attach+0x1e/0x20 bus_add_driver+0x1f8/0x2b0 driver_register+0x81/0x110 __pci_register_driver+0x60/0x70 ioat_init_module+0x89/0x1000 [ioatdma] do_one_initcall+0xe2/0x250 load_module+0x2313/0x2a00 SyS_init_module+0xd9/0x130 system_call_fastpath+0x1a/0x1f ---[ end trace 990c591681d27c31 ]--- Mapped at: debug_dma_map_page+0xbe/0x180 ioat_dma_self_test+0x1ab/0x680 [ioatdma] ioat3_dma_self_test+0x12/0x30 [ioatdma] ioat_probe+0xf4/0x110 [ioatdma] ioat3_dma_probe+0x268/0x410 [ioatdma] Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Cc: Vinod Koul <vinod.koul@intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Cc: Kyungmin Park <kyungmin.park@samsung.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-02mm/memory-failure.c: transfer page count from head page to tail page after ↵Naoya Horiguchi
split thp Memory failures on thp tail pages cause kernel panic like below: mce: [Hardware Error]: Machine check events logged MCE exception done on CPU 7 BUG: unable to handle kernel NULL pointer dereference at 0000000000000058 IP: [<ffffffff811b7cd1>] dequeue_hwpoisoned_huge_page+0x131/0x1e0 PGD bae42067 PUD ba47d067 PMD 0 Oops: 0000 [#1] SMP ... CPU: 7 PID: 128 Comm: kworker/7:2 Tainted: G M O 3.13.0-rc4-131217-1558-00003-g83b7df08e462 #25 ... Call Trace: me_huge_page+0x3e/0x50 memory_failure+0x4bb/0xc20 mce_process_work+0x3e/0x70 process_one_work+0x171/0x420 worker_thread+0x11b/0x3a0 ? manage_workers.isra.25+0x2b0/0x2b0 kthread+0xe4/0x100 ? kthread_create_on_node+0x190/0x190 ret_from_fork+0x7c/0xb0 ? kthread_create_on_node+0x190/0x190 ... RIP dequeue_hwpoisoned_huge_page+0x131/0x1e0 CR2: 0000000000000058 The reasoning of this problem is shown below: - when we have a memory error on a thp tail page, the memory error handler grabs a refcount of the head page to keep the thp under us. - Before unmapping the error page from processes, we split the thp, where page refcounts of both of head/tail pages don't change. - Then we call try_to_unmap() over the error page (which was a tail page before). We didn't pin the error page to handle the memory error, this error page is freed and removed from LRU list. - We never have the error page on LRU list, so the first page state check returns "unknown page," then we move to the second check with the saved page flag. - The saved page flag have PG_tail set, so the second page state check returns "hugepage." - We call me_huge_page() for freed error page, then we hit the above panic. The root cause is that we didn't move refcount from the head page to the tail page after split thp. So this patch suggests to do this. This panic was introduced by commit 524fca1e73 ("HWPOISON: fix misjudgement of page_action() for errors on mlocked pages"). Note that we did have the same refcount problem before this commit, but it was just ignored because we had only first page state check which returned "unknown page." The commit changed the refcount problem from "doesn't work" to "kernel panic." Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Reviewed-by: Wanpeng Li <liwanp@linux.vnet.ibm.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: <stable@vger.kernel.org> [3.9+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-02MAINTAINERS: set up proper record for Xilinx ZynqMichal Simek
Setup correct zynq entry. - Add missing cadence_ttc_timer maintainership - Add zynq wildcard - Add xilinx wildcard Signed-off-by: Michal Simek <michal.simek@xilinx.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-02mm: remove bogus warning in copy_huge_pmd()Mel Gorman
Sasha Levin reported the following warning being triggered WARNING: CPU: 28 PID: 35287 at mm/huge_memory.c:887 copy_huge_pmd+0x145/ 0x3a0() Call Trace: copy_huge_pmd+0x145/0x3a0 copy_page_range+0x3f2/0x560 dup_mmap+0x2c9/0x3d0 dup_mm+0xad/0x150 copy_process+0xa68/0x12e0 do_fork+0x96/0x270 SyS_clone+0x16/0x20 stub_clone+0x69/0x90 This warning was introduced by "mm: numa: Avoid unnecessary disruption of NUMA hinting during migration" for paranoia reasons but the warning is bogus. I was thinking of parallel races between NUMA hinting faults and forks but this warning would also be triggered by a parallel reclaim splitting a THP during a fork. Remote the bogus warning. Signed-off-by: Mel Gorman <mgorman@suse.de> Reported-by: Sasha Levin <sasha.levin@oracle.com> Cc: Alex Thorlton <athorlton@sgi.com> Cc: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-02memcg: fix memcg_size() calculationVladimir Davydov
The mem_cgroup structure contains nr_node_ids pointers to mem_cgroup_per_node objects, not the objects themselves. Signed-off-by: Vladimir Davydov <vdavydov@parallels.com> Acked-by: Michal Hocko <mhocko@suse.cz> Cc: Glauber Costa <glommer@openvz.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Balbir Singh <bsingharora@gmail.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-02mm: fix use-after-free in sys_remap_file_pagesRik van Riel
remap_file_pages calls mmap_region, which may merge the VMA with other existing VMAs, and free "vma". This can lead to a use-after-free bug. Avoid the bug by remembering vm_flags before calling mmap_region, and not trying to dereference vma later. Signed-off-by: Rik van Riel <riel@redhat.com> Reported-by: Dmitry Vyukov <dvyukov@google.com> Cc: PaX Team <pageexec@freemail.hu> Cc: Kees Cook <keescook@chromium.org> Cc: Michel Lespinasse <walken@google.com> Cc: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Hugh Dickins <hughd@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-02mm: munlock: fix deadlock in __munlock_pagevec()Vlastimil Babka
Commit 7225522bb429 ("mm: munlock: batch non-THP page isolation and munlock+putback using pagevec" introduced __munlock_pagevec() to speed up munlock by holding lru_lock over multiple isolated pages. Pages that fail to be isolated are put_page()d immediately, also within the lock. This can lead to deadlock when __munlock_pagevec() becomes the holder of the last page pin and put_page() leads to __page_cache_release() which also locks lru_lock. The deadlock has been observed by Sasha Levin using trinity. This patch avoids the deadlock by deferring put_page() operations until lru_lock is released. Another pagevec (which is also used by later phases of the function is reused to gather the pages for put_page() operation. Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Reported-by: Sasha Levin <sasha.levin@oracle.com> Cc: Michel Lespinasse <walken@google.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Hugh Dickins <hughd@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-02mm: munlock: fix a bug where THP tail page is encounteredVlastimil Babka
Since commit ff6a6da60b89 ("mm: accelerate munlock() treatment of THP pages") munlock skips tail pages of a munlocked THP page. However, when the head page already has PageMlocked unset, it will not skip the tail pages. Commit 7225522bb429 ("mm: munlock: batch non-THP page isolation and munlock+putback using pagevec") has added a PageTransHuge() check which contains VM_BUG_ON(PageTail(page)). Sasha Levin found this triggered using trinity, on the first tail page of a THP page without PageMlocked flag. This patch fixes the issue by skipping tail pages also in the case when PageMlocked flag is unset. There is still a possibility of race with THP page split between clearing PageMlocked and determining how many pages to skip. The race might result in former tail pages not being skipped, which is however no longer a bug, as during the skip the PageTail flags are cleared. However this race also affects correctness of NR_MLOCK accounting, which is to be fixed in a separate patch. Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Reported-by: Sasha Levin <sasha.levin@oracle.com> Cc: Michel Lespinasse <walken@google.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Hugh Dickins <hughd@google.com> Cc: Bob Liu <bob.liu@oracle.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-02sctp: Remove outqueue empty stateVlad Yasevich
The SCTP outqueue structure maintains a data chunks that are pending transmission, the list of chunks that are pending a retransmission and a length of data in flight. It also tries to keep the emtpy state so that it can performe shutdown sequence or notify user. The problem is that the empy state is inconsistently tracked. It is possible to completely drain the queue without sending anything when using PR-SCTP. In this case, the empty state will not be correctly state as report by Jamal Hadi Salim <jhs@mojatatu.com>. This can cause an association to be perminantly stuck in the SHUTDOWN_PENDING state. Additionally, SCTP is incredibly inefficient when setting the empty state. Even though all the data is availaible in the outqueue structure, we ignore it and walk a list of trasnports. In the end, we can completely remove the extra empty state and figure out if the queue is empty by looking at 3 things: length of pending data, length of in-flight data, and exisiting of retransmit data. All of these are already in the strucutre. Reported-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Vlad Yasevich <vyasevich@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Tested-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-02qlcnic: Fix resource allocation for TX queuesManish Chopra
o TX queues allocation was getting distributed equally among all the functions of the port including VFs and PF. Which was leading to failure in PF's multiple TX queues creation. o Instead of dividing queues equally allocate one TX queue for each VF as VF doesn't support multiple TX queues. Signed-off-by: Manish Chopra <manish.chopra@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-02qlcnic: Fix loopback diagnostic testManish Chopra
o Adapter requires that if the port is in loopback mode no traffic should be flowing through that port, so on arrival of Link up AEN, do not advertise Link up to the stack until port is out of loopback mode Signed-off-by: Manish Chopra <manish.chopra@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-02Merge tag 'gfs2-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-3.0-fixes Pull GFS2 fixes from Steven Whitehouse: "Here is a set of small fixes for GFS2. There is a fix to drop s_umount which is copied in from the core vfs, two patches relate to a hard to hit "use after free" and memory leak. Two patches related to using DIO and buffered I/O on the same file to ensure correct operation in relation to glock state changes. The final patch adds an RCU read lock to ensure correct locking on an error path" * tag 'gfs2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-3.0-fixes: GFS2: Fix unsafe dereference in dump_holder() GFS2: Wait for async DIO in glock state changes GFS2: Fix incorrect invalidation for DIO/buffered I/O GFS2: Fix slab memory leak in gfs2_bufdata GFS2: Fix use-after-free race when calling gfs2_remove_from_ail GFS2: don't hold s_umount over blkdev_put
2014-01-02Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fixes from Martin Schwidefsky: "Two small bug fixes and a follow-up to the CONFIG_NR_CPUS change. A kernel compiled with CONFIG_NR_CPUS=256 will waste quite a bit of memory for the per-cpu arrays. Under z/VM the maximum number of CPUs is 64, the code now limits the possible cpu mask to 64 if running under z/VM" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/pci: obtain function handle in hotplug notifier s390/3270: fix allocation of tty3270_screen structure s390/smp: improve setup of possible cpu mask
2014-01-02Merge tag 'renesas-fixes3-for-v3.13' of ↵Olof Johansson
git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas into fixes From Simon Horman: Third Round of Renesas ARM based SoC Fixes for v3.13 * sh7372 (SH-Mobile AP4) based Mackerel board r8a773a0 (SH-Mobile AG5) based kzm9g board r8a7740 (R-Mobile A1) based Armadillo board - Correct coherent DMA mask This resolves regressions introduced by 4dcfa60071b3d23f ("ARM: DMA-API: better handing of DMA masks for coherent allocations") in v3.12-rc1. * tag 'renesas-fixes3-for-v3.13' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas: ARM: shmobile: mackerel: Fix coherent DMA mask ARM: shmobile: kzm9g: Fix coherent DMA mask ARM: shmobile: armadillo: Fix coherent DMA mask Signed-off-by: Olof Johansson <olof@lixom.net>
2014-01-02KVM: nVMX: Unconditionally uninit the MMU on nested vmexitJan Kiszka
Three reasons for doing this: 1. arch.walk_mmu points to arch.mmu anyway in case nested EPT wasn't in use. 2. this aligns VMX with SVM. But 3. is most important: nested_cpu_has_ept(vmcs12) queries the VMCS page, and if one guest VCPU manipulates the page of another VCPU in L2, we may be fooled to skip over the nested_ept_uninit_mmu_context, leaving mmu in nested state. That can crash the host later on if nested_ept_get_cr3 is invoked while L1 already left vmxon and nested.current_vmcs12 became NULL therefore. Cc: stable@kernel.org Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2014-01-02GFS2: Fix unsafe dereference in dump_holder()Tetsuo Handa
GLOCK_BUG_ON() might call this function without RCU read lock. Make sure that RCU read lock is held when using task_struct returned from pid_task(). Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2014-01-01ipv6: fix the use of pcpu_tstats in sitLi RongQing
when read/write the 64bit data, the correct lock should be hold. Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-01net: llc: fix order of evaluation in llc_conn_ac_inc_vr_by_1Daniel Borkmann
Function llc_conn_ac_inc_vr_by_1() evaluates via macro PDU_GET_NEXT_Vr() into ... llc_sk(sk)->vR = ++llc_sk(sk)->vR & 0xffffffffffffff7f ... but the order in which the side effects take place is undefined because there is no intervening sequence point. As llc_sk(sk)->vR is written in llc_sk(sk)->vR (assignment left-hand side) and written in ++llc_sk(sk)->vR & 0xffffffffffffff7f this might possibly yield undefined behavior. The final value of llc_sk(sk)->vR is ambiguous, because, depending on the order of expression evaluation, the increment may occur before, after, or interleaved with the assignment. In C, evaluating such an expression yields undefined behavior. Since we're doing the increment via PDU_GET_NEXT_Vr() macro and the only place it is being used is from llc_conn_ac_inc_vr_by_1(), in order to increment vR by 1 with a follow-up optimized modulo, rewrite the expression into ((vR + 1) & CONST) in order to fix this. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Cc: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>