linux - Linux kernel source tree

Age	Commit message (Collapse)	Author
2014-10-30	arm64: debug: don't re-enable debug exceptions on return from el1_dbg	Will Deacon
	commit 1059c6bf8534acda249e7e65c81e7696fb074dc1 upstream. When returning from a debug exception taken from EL1, we unmask debug exceptions after handling the exception. This is crucial for debug exceptions taken from EL0, so that any kernel work on the ret_to_user path can be debugged by kgdb. However, when returning back to EL1 the only thing left to do is to restore the original register state before the exception return. If single-step has been enabled by the debug exception handler, we will get stuck in an infinite debug exception loop, since we will take the step exception as soon as we unmask debug exceptions. This patch avoids unmasking debug exceptions on the debug exception return path when the exception was taken from EL1. Fixes: 2a2830703a23 (arm64: debug: avoid accessing mdscr_el1 on fault paths where possible) Reported-by: David Long <dave.long@linaro.org> Reported-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-30	x86/intel/quark: Switch off CR4.PGE so TLB flush uses CR3 instead	Bryan O'Donoghue
	commit ee1b5b165c0a2f04d2107e634e51f05d0eb107de upstream. Quark x1000 advertises PGE via the standard CPUID method PGE bits exist in Quark X1000's PTEs. In order to flush an individual PTE it is necessary to reload CR3 irrespective of the PTE.PGE bit. See Quark Core_DevMan_001.pdf section 6.4.11 This bug was fixed in Galileo kernels, unfixed vanilla kernels are expected to crash and burn on this platform. Signed-off-by: Bryan O'Donoghue <pure.logic@nexus-software.ie> Cc: Borislav Petkov <bp@alien8.de> Link: http://lkml.kernel.org/r/1411514784-14885-1-git-send-email-pure.logic@nexus-software.ie Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-30	x86,kvm,vmx: Preserve CR4 across VM entry	Andy Lutomirski
	commit d974baa398f34393db76be45f7d4d04fbdbb4a0a upstream. CR4 isn't constant; at least the TSD and PCE bits can vary. TBH, treating CR0 and CR3 as constant scares me a bit, too, but it looks like it's correct. This adds a branch and a read from cr4 to each vm entry. Because it is extremely likely that consecutive entries into the same vcpu will have the same host cr4 value, this fixes up the vmcs instead of restoring cr4 after the fact. A subsequent patch will add a kernel-wide cr4 shadow, reducing the overhead in the common case to just two memory reads and a branch. Signed-off-by: Andy Lutomirski <luto@amacapital.net> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Cc: Petr Matousek <pmatouse@redhat.com> Cc: Gleb Natapov <gleb@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-30	KVM: s390: unintended fallthrough for external call	Christian Borntraeger
	commit f346026e55f1efd3949a67ddd1dcea7c1b9a615e upstream. We must not fallthrough if the conditions for external call are not met. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-30	KVM: do not bias the generation number in kvm_current_mmio_generation	Paolo Bonzini
	commit 00f034a12fdd81210d58116326d92780aac5c238 upstream. The next patch will give a meaning (a la seqcount) to the low bit of the generation number. Ensure that it matches between kvm->memslots->generation and kvm_current_mmio_generation(). Reviewed-by: David Matlack <dmatlack@google.com> Reviewed-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-30	kvm: fix potentially corrupt mmio cache	David Matlack
	commit ee3d1570b58677885b4552bce8217fda7b226a68 upstream. vcpu exits and memslot mutations can run concurrently as long as the vcpu does not aquire the slots mutex. Thus it is theoretically possible for memslots to change underneath a vcpu that is handling an exit. If we increment the memslot generation number again after synchronize_srcu_expedited(), vcpus can safely cache memslot generation without maintaining a single rcu_dereference through an entire vm exit. And much of the x86/kvm code does not maintain a single rcu_dereference of the current memslots during each exit. We can prevent the following case: vcpu (CPU 0) \| thread (CPU 1) --------------------------------------------+-------------------------- 1 vm exit \| 2 srcu_read_unlock(&kvm->srcu) \| 3 decide to cache something based on \| old memslots \| 4 \| change memslots \| (increments generation) 5 \| synchronize_srcu(&kvm->srcu); 6 retrieve generation # from new memslots \| 7 tag cache with new memslot generation \| 8 srcu_read_unlock(&kvm->srcu) \| ... \| <action based on cache occurs even \| though the caching decision was based \| on the old memslots> \| ... \| <action continues to occur until next \| memslot generation change, which may \| be never> \| \| By incrementing the generation after synchronizing with kvm->srcu readers, we ensure that the generation retrieved in (6) will become invalid soon after (8). Keeping the existing increment is not strictly necessary, but we do keep it and just move it for consistency from update_memslots to install_new_memslots. It invalidates old cached MMIOs immediately, instead of having to wait for the end of synchronize_srcu_expedited, which makes the code more clearly correct in case CPU 1 is preempted right after synchronize_srcu() returns. To avoid halving the generation space in SPTEs, always presume that the low bit of the generation is zero when reconstructing a generation number out of an SPTE. This effectively disables MMIO caching in SPTEs during the call to synchronize_srcu_expedited. Using the low bit this way is somewhat like a seqcount---where the protected thing is a cache, and instead of retrying we can simply punt if we observe the low bit to be 1. Signed-off-by: David Matlack <dmatlack@google.com> Reviewed-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Reviewed-by: David Matlack <dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-30	kvm: x86: fix stale mmio cache bug	David Matlack
	commit 56f17dd3fbc44adcdbc3340fe3988ddb833a47a7 upstream. The following events can lead to an incorrect KVM_EXIT_MMIO bubbling up to userspace: (1) Guest accesses gpa X without a memory slot. The gfn is cached in struct kvm_vcpu_arch (mmio_gfn). On Intel EPT-enabled hosts, KVM sets the SPTE write-execute-noread so that future accesses cause EPT_MISCONFIGs. (2) Host userspace creates a memory slot via KVM_SET_USER_MEMORY_REGION covering the page just accessed. (3) Guest attempts to read or write to gpa X again. On Intel, this generates an EPT_MISCONFIG. The memory slot generation number that was incremented in (2) would normally take care of this but we fast path mmio faults through quickly_check_mmio_pf(), which only checks the per-vcpu mmio cache. Since we hit the cache, KVM passes a KVM_EXIT_MMIO up to userspace. This patch fixes the issue by using the memslot generation number to validate the mmio cache. Signed-off-by: David Matlack <dmatlack@google.com> [xiaoguangrong: adjust the code to make it simpler for stable-tree fix.] Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Reviewed-by: David Matlack <dmatlack@google.com> Reviewed-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Tested-by: David Matlack <dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-09	vgaarb: Don't default exclusively to first video device with mem+io	Bruno Prémont
	commit 86fd887b7fe350819dae5b55e7fef05b511c8656 upstream. Commit 20cde694027e ("x86, ia64: Move EFI_FB vga_default_device() initialization to pci_vga_fixup()") moved boot video device detection from efifb to x86 and ia64 pci/fixup.c. For dual-GPU Apple computers above change represents a regression as code in efifb did forcefully override vga_default_device while the merge did not (vgaarb happens prior to PCI fixup). To improve on initial device selection by vgaarb (it cannot know if PCI device not behind bridges see/decode legacy VGA I/O or not), move the screen_info based check from pci_video_fixup() to vgaarb's init function and use it to refine/override decision taken while adding the individual PCI VGA devices. This way PCI fixup has no reason to adjust vga_default_device anymore but can depend on its value for flagging shadowed VBIOS. This has the nice benefit of removing duplicated code but does introduce a #if defined() block in vgaarb. Not all architectures have screen_info and would cause compile to fail without it. Link: https://bugzilla.kernel.org/show_bug.cgi?id=84461 Reported-and-Tested-By: Andreas Noever <andreas.noever@gmail.com> Signed-off-by: Bruno Prémont <bonbons@linux-vserver.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> CC: Matthew Garrett <matthew.garrett@nebula.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-09	x86, ia64: Move EFI_FB vga_default_device() initialization to pci_vga_fixup()	Bruno Prémont
	commit 20cde694027e7477cc532833e38ab9fcaa83fb64 upstream. Commit b4aa0163056b ("efifb: Implement vga_default_device() (v2)") added efifb vga_default_device() so EFI systems that do not load shadow VBIOS or setup VGA get proper value for boot_vga PCI sysfs attribute on the corresponding PCI device. Xorg doesn't detect devices when boot_vga=0, e.g., on some EFI systems such as MacBookAir2,1. Xorg detects the GPU and finds the DRI device but then bails out with "no devices detected". Note: When vga_default_device() is set boot_vga PCI sysfs attribute reflects its state. When unset this attribute is 1 whenever IORESOURCE_ROM_SHADOW flag is set. With introduction of sysfb/simplefb/simpledrm efifb is getting obsolete while having native drivers for the GPU also makes selecting sysfb/efifb optional. Remove the efifb implementation of vga_default_device() and initialize vgaarb's vga_default_device() with the PCI GPU that matches boot screen_info in pci_fixup_video(). [bhelgaas: remove unused "dev" in efifb_setup()] Fixes: b4aa0163056b ("efifb: Implement vga_default_device() (v2)") Tested-by: Anibal Francisco Martinez Cortina <linuxkid.zeuz@gmail.com> Signed-off-by: Bruno Prémont <bonbons@linux-vserver.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Matthew Garrett <matthew.garrett@nebula.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-05	ARM: DRA7: Add support for soc_is_dra74x() and soc_is_dra72x() variants	Rajendra Nayak
	commit af438fec6cb99fc2e2faf8b16b865af26ce722e6 upstream. Use the corresponding compatibles to identify the devices. Signed-off-by: Rajendra Nayak <rnayak@ti.com> Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com> Acked-by: Nishanth Menon <nm@ti.com> Tested-by: Nishanth Menon <nm@ti.com> Signed-off-by: Paul Walmsley <paul@pwsan.com> Cc: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-05	arm: armv7: perf: fix armv7 ref-cycles error	Zhiqiang Zhang
	ref-cycles event is specially to Intel core, but can still used in arm architecture with the wrong return value with 3.10 stable. this patch fix the bug and make it return NOT SUPPORTED distinctly. In upstream this bug has been fixed by other way, which changes more than one file and more than 1000 lines. the primary commit is 6b7658ec8a100b608e59e3cde353434db51f5be0. besides we can not simply cherry-pick. Signed-off-by: Zhiqiang Zhang <zhangzhiqiang.zhang@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com Cc: Will Deacon <will.deacon@arm.com> Cc: Christopher Covington <cov@codeaurora.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-05	parisc: Only use -mfast-indirect-calls option for 32-bit kernel builds	John David Anglin
	commit d26a7730b5874a5fa6779c62f4ad7c5065a94723 upstream. In spite of what the GCC manual says, the -mfast-indirect-calls has never been supported in the 64-bit parisc compiler. Indirect calls have always been done using function descriptors irrespective of the -mfast-indirect-calls option. Recently, it was noticed that a function descriptor was always requested when the -mfast-indirect-calls option was specified. This caused problems when the option was used in application code and doesn't make any sense because the whole point of the option is to avoid using a function descriptor for indirect calls. Fixing this broke 64-bit kernel builds. I will fix GCC but for now we need the attached change. This results in the same kernel code as before. Signed-off-by: John David Anglin <dave.anglin@bell.net> Signed-off-by: Helge Deller <deller@gmx.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-05	parisc: Implement new LWS CAS supporting 64 bit operations.	Guy Martin
	commit 89206491201cbd1571009b36292af781cef74c1b upstream. The current LWS cas only works correctly for 32bit. The new LWS allows for CAS operations of variable size. Signed-off-by: Guy Martin <gmsoft@tuxicoman.be> Signed-off-by: Helge Deller <deller@gmx.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-05	powerpc: Add smp_mb()s to arch_spin_unlock_wait()	Michael Ellerman
	commit 78e05b1421fa41ae8457701140933baa5e7d9479 upstream. Similar to the previous commit which described why we need to add a barrier to arch_spin_is_locked(), we have a similar problem with spin_unlock_wait(). We need a barrier on entry to ensure any spinlock we have previously taken is visibly locked prior to the load of lock->slock. It's also not clear if spin_unlock_wait() is intended to have ACQUIRE semantics. For now be conservative and add a barrier on exit to give it ACQUIRE semantics. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-05	powerpc: Add smp_mb() to arch_spin_is_locked()	Michael Ellerman
	commit 51d7d5205d3389a32859f9939f1093f267409929 upstream. The kernel defines the function spin_is_locked(), which can be used to check if a spinlock is currently locked. Using spin_is_locked() on a lock you don't hold is obviously racy. That is, even though you may observe that the lock is unlocked, it may become locked at any time. There is (at least) one exception to that, which is if two locks are used as a pair, and the holder of each checks the status of the other before doing any update. Assuming A and B are two locks, and COUNTER is a shared non-atomic value: The first CPU does: spin_lock(A) if spin_is_locked(B) # nothing else smp_mb() LOAD r = COUNTER r++ STORE COUNTER = r spin_unlock(A) And the second CPU does: spin_lock(B) if spin_is_locked(A) # nothing else smp_mb() LOAD r = COUNTER r++ STORE COUNTER = r spin_unlock(B) Although this is a strange locking construct, it should work. It seems to be understood, but not documented, that spin_is_locked() is not a memory barrier, so in the examples above and below the caller inserts its own memory barrier before acting on the result of spin_is_locked(). For now we assume spin_is_locked() is implemented as below, and we break it out in our examples: bool spin_is_locked(LOCK) { LOAD l = LOCK return l.locked } Our intuition is that there should be no problem even if the two code sequences run simultaneously such as: CPU 0 CPU 1 ================================================== spin_lock(A) spin_lock(B) LOAD b = B LOAD a = A if b.locked # true if a.locked # true # nothing # nothing spin_unlock(A) spin_unlock(B) If one CPU gets the lock before the other then it will do the update and the other CPU will back off: CPU 0 CPU 1 ================================================== spin_lock(A) LOAD b = B spin_lock(B) if b.locked # false LOAD a = A else if a.locked # true smp_mb() # nothing LOAD r1 = COUNTER spin_unlock(B) r1++ STORE COUNTER = r1 spin_unlock(A) However in reality spin_lock() itself is not indivisible. On powerpc we implement it as a load-and-reserve and store-conditional. Ignoring the retry logic for the lost reservation case, it boils down to: spin_lock(LOCK) { LOAD l = LOCK l.locked = true STORE LOCK = l ACQUIRE_BARRIER } The ACQUIRE_BARRIER is required to give spin_lock() ACQUIRE semantics as defined in memory-barriers.txt: This acts as a one-way permeable barrier. It guarantees that all memory operations after the ACQUIRE operation will appear to happen after the ACQUIRE operation with respect to the other components of the system. On modern powerpc systems we use lwsync for ACQUIRE_BARRIER. lwsync is also know as "lightweight sync", or "sync 1". As described in Power ISA v2.07 section B.2.1.1, in this scenario the lwsync is not the barrier itself. It instead causes the LOAD of LOCK to act as the barrier, preventing any loads or stores in the locked region from occurring prior to the load of LOCK. Whether this behaviour is in accordance with the definition of ACQUIRE semantics in memory-barriers.txt is open to discussion, we may switch to a different barrier in future. What this means in practice is that the following can occur: CPU 0 CPU 1 ================================================== LOAD a = A LOAD b = B a.locked = true b.locked = true LOAD b = B LOAD a = A STORE A = a STORE B = b if b.locked # false if a.locked # false else else smp_mb() smp_mb() LOAD r1 = COUNTER LOAD r2 = COUNTER r1++ r2++ STORE COUNTER = r1 STORE COUNTER = r2 # Lost update spin_unlock(A) spin_unlock(B) That is, the load of B can occur prior to the store that makes A visibly locked. And similarly for CPU 1. The result is both CPUs hold their lock and believe the other lock is unlocked. The easiest fix for this is to add a full memory barrier to the start of spin_is_locked(), so adding to our previous definition would give us: bool spin_is_locked(LOCK) { smp_mb() LOAD l = LOCK return l.locked } The new barrier orders the store to the lock we are locking vs the load of the other lock: CPU 0 CPU 1 ================================================== LOAD a = A LOAD b = B a.locked = true b.locked = true STORE A = a STORE B = b smp_mb() smp_mb() LOAD b = B LOAD a = A if b.locked # true if a.locked # true # nothing # nothing spin_unlock(A) spin_unlock(B) Although the above example is theoretical, there is code similar to this example in sem_lock() in ipc/sem.c. This commit in addition to the next commit appears to be a fix for crashes we are seeing in that code where we believe this race happens in practice. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-05	powerpc/perf: Fix ABIv2 kernel backtraces	Anton Blanchard
	commit 85101af13bb854a6572fa540df7c7201958624b9 upstream. ABIv2 kernels are failing to backtrace through the kernel. An example: 39.30% readseek2_proce [kernel.kallsyms] [k] find_get_entry \| --- find_get_entry __GI___libc_read The problem is in valid_next_sp() where we check that the new stack pointer is at least STACK_FRAME_OVERHEAD below the previous one. ABIv1 has a minimum stack frame size of 112 bytes consisting of 48 bytes and 64 bytes of parameter save area. ABIv2 changes that to 32 bytes with no paramter save area. STACK_FRAME_OVERHEAD is in theory the minimum stack frame size, but we over 240 uses of it, some of which assume that it includes space for the parameter area. We need to work through all our stack defines and rationalise them but let's fix perf now by creating STACK_FRAME_MIN_SIZE and using in valid_next_sp(). This fixes the issue: 30.64% readseek2_proce [kernel.kallsyms] [k] find_get_entry \| --- find_get_entry pagecache_get_page generic_file_read_iter new_sync_read vfs_read sys_read syscall_exit __GI___libc_read Reported-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-05	sched: Fix unreleased llc_shared_mask bit during CPU hotplug	Wanpeng Li
	commit 03bd4e1f7265548832a76e7919a81f3137c44fd1 upstream. The following bug can be triggered by hot adding and removing a large number of xen domain0's vcpus repeatedly: BUG: unable to handle kernel NULL pointer dereference at 0000000000000004 IP: [..] find_busiest_group PGD 5a9d5067 PUD 13067 PMD 0 Oops: 0000 [#3] SMP [...] Call Trace: load_balance ? _raw_spin_unlock_irqrestore idle_balance __schedule schedule schedule_timeout ? lock_timer_base schedule_timeout_uninterruptible msleep lock_device_hotplug_sysfs online_store dev_attr_store sysfs_write_file vfs_write SyS_write system_call_fastpath Last level cache shared mask is built during CPU up and the build_sched_domain() routine takes advantage of it to setup the sched domain CPU topology. However, llc_shared_mask is not released during CPU disable, which leads to an invalid sched domainCPU topology. This patch fix it by releasing the llc_shared_mask correctly during CPU disable. Yasuaki also reported that this can happen on real hardware: https://lkml.org/lkml/2014/7/22/1018 His case is here: == Here is an example on my system. My system has 4 sockets and each socket has 15 cores and HT is enabled. In this case, each core of sockes is numbered as follows: \| CPU# Socket#0 \| 0-14 , 60-74 Socket#1 \| 15-29, 75-89 Socket#2 \| 30-44, 90-104 Socket#3 \| 45-59, 105-119 Then llc_shared_mask of CPU#30 has 0x3fff80000001fffc0000000. It means that last level cache of Socket#2 is shared with CPU#30-44 and 90-104. When hot-removing socket#2 and #3, each core of sockets is numbered as follows: \| CPU# Socket#0 \| 0-14 , 60-74 Socket#1 \| 15-29, 75-89 But llc_shared_mask is not cleared. So llc_shared_mask of CPU#30 remains having 0x3fff80000001fffc0000000. After that, when hot-adding socket#2 and #3, each core of sockets is numbered as follows: \| CPU# Socket#0 \| 0-14 , 60-74 Socket#1 \| 15-29, 75-89 Socket#2 \| 30-59 Socket#3 \| 90-119 Then llc_shared_mask of CPU#30 becomes 0x3fff8000fffffffc0000000. It means that last level cache of Socket#2 is shared with CPU#30-59 and 90-104. So the mask has the wrong value. Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com> Tested-by: Linn Crosetto <linn@hp.com> Reviewed-by: Borislav Petkov <bp@suse.de> Reviewed-by: Toshi Kani <toshi.kani@hp.com> Reviewed-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Cc: David Rientjes <rientjes@google.com> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Steven Rostedt <srostedt@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1411547885-48165-1-git-send-email-wanpeng.li@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-05	x86/kaslr: Avoid the setup_data area when picking location	Kees Cook
	commit 0cacbfbeb5077b63d5d3cf6df88b14ac12ad584b upstream. The KASLR location-choosing logic needs to avoid the setup_data list memory areas as well. Without this, it would be possible to have the ASLR position stomp on the memory, ultimately causing the boot to fail. Signed-off-by: Kees Cook <keescook@chromium.org> Tested-by: Baoquan He <bhe@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Cc: Pavel Machek <pavel@ucw.cz> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/20140911161931.GA12001@www.outflux.net Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-05	x86 early_ioremap: Increase FIX_BTMAPS_SLOTS to 8	Dave Young
	commit 3eddc69ffeba092d288c386646bfa5ec0fce25fd upstream. 3.16 kernel boot fail with earlyprintk=efi, it keeps scrolling at the bottom line of screen. Bisected, the first bad commit is below: commit 86dfc6f339886559d80ee0d4bd20fe5ee90450f0 Author: Lv Zheng <lv.zheng@intel.com> Date: Fri Apr 4 12:38:57 2014 +0800 ACPICA: Tables: Fix table checksums verification before installation. I did some debugging by enabling both serial and efi earlyprintk, below is some debug dmesg, seems early_ioremap fails in scroll up function due to no free slot, see below dmesg output: WARNING: CPU: 0 PID: 0 at mm/early_ioremap.c:116 __early_ioremap+0x90/0x1c4() __early_ioremap(ed00c800, 00000c80) not found slot Modules linked in: CPU: 0 PID: 0 Comm: swapper Not tainted 3.17.0-rc1+ #204 Hardware name: Hewlett-Packard HP Z420 Workstation/1589, BIOS J61 v03.15 05/09/2013 Call Trace: dump_stack+0x4e/0x7a warn_slowpath_common+0x75/0x8e ? __early_ioremap+0x90/0x1c4 warn_slowpath_fmt+0x47/0x49 __early_ioremap+0x90/0x1c4 ? sprintf+0x46/0x48 early_ioremap+0x13/0x15 early_efi_map+0x24/0x26 early_efi_scroll_up+0x6d/0xc0 early_efi_write+0x1b0/0x214 call_console_drivers.constprop.21+0x73/0x7e console_unlock+0x151/0x3b2 ? vprintk_emit+0x49f/0x532 vprintk_emit+0x521/0x532 ? console_unlock+0x383/0x3b2 printk+0x4f/0x51 acpi_os_vprintf+0x2b/0x2d acpi_os_printf+0x43/0x45 acpi_info+0x5c/0x63 ? __acpi_map_table+0x13/0x18 ? acpi_os_map_iomem+0x21/0x147 acpi_tb_print_table_header+0x177/0x186 acpi_tb_install_table_with_override+0x4b/0x62 acpi_tb_install_standard_table+0xd9/0x215 ? early_ioremap+0x13/0x15 ? __acpi_map_table+0x13/0x18 acpi_tb_parse_root_table+0x16e/0x1b4 acpi_initialize_tables+0x57/0x59 acpi_table_init+0x50/0xce acpi_boot_table_init+0x1e/0x85 setup_arch+0x9b7/0xcc4 start_kernel+0x94/0x42d ? early_idt_handlers+0x120/0x120 x86_64_start_reservations+0x2a/0x2c x86_64_start_kernel+0xf3/0x100 Quote reply from Lv.zheng about the early ioremap slot usage in this case: """ In early_efi_scroll_up(), 2 mapping entries will be used for the src/dst screen buffer. In drivers/acpi/acpica/tbutils.c, we've improved the early table loading code in acpi_tb_parse_root_table(). We now need 2 mapping entries: 1. One mapping entry is used for RSDT table mapping. Each RSDT entry contains an address for another ACPI table. 2. For each entry in RSDP, we need another mapping entry to map the table to perform necessary check/override before installing it. When acpi_tb_parse_root_table() prints something through EFI earlyprintk console, we'll have 4 mapping entries used. The current 4 slots setting of early_ioremap() seems to be too small for such a use case. """ Thus increase the slot to 8 in this patch to fix this issue. boot-time mappings become 512 page with this patch. Signed-off-by: Dave Young <dyoung@redhat.com> Signed-off-by: Matt Fleming <matt.fleming@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-05	x86/xen: don't copy bogus duplicate entries into kernel page tables	Stefan Bader
	commit 0b5a50635fc916cf46e3de0b819a61fc3f17e7ee upstream. When RANDOMIZE_BASE (KASLR) is enabled; or the sum of all loaded modules exceeds 512 MiB, then loading modules fails with a warning (and hence a vmalloc allocation failure) because the PTEs for the newly-allocated vmalloc address space are not zero. WARNING: CPU: 0 PID: 494 at linux/mm/vmalloc.c:128 vmap_page_range_noflush+0x2a1/0x360() This is caused by xen_setup_kernel_pagetables() copying level2_kernel_pgt into level2_fixmap_pgt, overwriting many non-present entries. Without KASLR, the normal kernel image size only covers the first half of level2_kernel_pgt and module space starts after that. L4[511]->level3_kernel_pgt[510]->level2_kernel_pgt[ 0..255]->kernel [256..511]->module [511]->level2_fixmap_pgt[ 0..505]->module This allows 512 MiB of of module vmalloc space to be used before having to use the corrupted level2_fixmap_pgt entries. With KASLR enabled, the kernel image uses the full PUD range of 1G and module space starts in the level2_fixmap_pgt. So basically: L4[511]->level3_kernel_pgt[510]->level2_kernel_pgt[0..511]->kernel [511]->level2_fixmap_pgt[0..505]->module And now no module vmalloc space can be used without using the corrupt level2_fixmap_pgt entries. Fix this by properly converting the level2_fixmap_pgt entries to MFNs, and setting level1_fixmap_pgt as read-only. A number of comments were also using the the wrong L3 offset for level2_kernel_pgt. These have been corrected. Signed-off-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-05	KVM: s390/mm: Fix guest storage key corruption in ptep_set_access_flags	Christian Borntraeger
	commit 1951497d90d6754201af3e65241a06f9ef6755cd upstream. commit 0944fe3f4a32 ("s390/mm: implement software referenced bits") triggered another paging/storage key corruption. There is an unhandled invalid->valid pte change where we have to set the real storage key from the pgste. When doing paging a guest page might be swapcache or swap and when faulted in it might be read-only and due to a parallel scan old. An do_wp_page will make it writeable and young. Due to software reference tracking this page was invalid and now becomes valid. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-05	KVM: s390/mm: Fix storage key corruption during swapping	Christian Borntraeger
	commit 3e03d4c46daa849880837d802e41c14132a03ef9 upstream. Since 3.12 or more precisely commit 0944fe3f4a32 ("s390/mm: implement software referenced bits") guest storage keys get corrupted during paging. This commit added another valid->invalid translation for page tables - namely ptep_test_and_clear_young. We have to transfer the storage key into the pgste in that case. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-05	KVM: s390/mm: try a cow on read only pages for key ops	Christian Borntraeger
	commit ab3f285f227fec62868037e9b1b1fd18294a83b8 upstream. The PFMF instruction handler blindly wrote the storage key even if the page was mapped R/O in the host. Lets try a COW before continuing and bail out in case of errors. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Dominik Dingel <dingel@linux.vnet.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-05	KVM: s390: Fix user triggerable bug in dead code	Christian Borntraeger
	commit 614a80e474b227cace52fd6e3c790554db8a396e upstream. In the early days, we had some special handling for the KVM_EXIT_S390_SIEIC exit, but this was gone in 2009 with commit d7b0b5eb3000 (KVM: s390: Make psw available on all exits, not just a subset). Now this switch statement is just a sanity check for userspace not messing with the kvm_run structure. Unfortunately, this allows userspace to trigger a kernel BUG. Let's just remove this switch statement. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-05	arm64: Add brackets around user_stack_pointer()	Catalin Marinas
	commit 2520d039728b2a3c5ae7f79fe2a0e9d182855b12 upstream. Commit 5f888a1d33 (ARM64: perf: support dwarf unwinding in compat mode) changes user_stack_pointer() to return the compat SP for 32-bit tasks but without brackets around the whole definition, with possible issues on the call sites (noticed with a subsequent fix for KSTK_ESP). Fixes: 5f888a1d33c4 (ARM64: perf: support dwarf unwinding in compat mode) Reported-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-05	MIPS: mcount: Adjust stack pointer for static trace in MIPS32	Markos Chandras
	commit 8a574cfa2652545eb95595d38ac2a0bb501af0ae upstream. Every mcount() call in the MIPS 32-bit kernel is done as follows: [...] move at, ra jal _mcount addiu sp, sp, -8 [...] but upon returning from the mcount() function, the stack pointer is not adjusted properly. This is explained in details in 58b69401c797 (MIPS: Function tracer: Fix broken function tracing). Commit ad8c396936e3 ("MIPS: Unbreak function tracer for 64-bit kernel.) fixed the stack manipulation for 64-bit but it didn't fix it completely for MIPS32. Signed-off-by: Markos Chandras <markos.chandras@imgtec.com> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/7792/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-05	MIPS: Fix MFC1 & MFHC1 emulation for 64-bit MIPS systems	Paul Burton
	commit c8c0da6bdf0f0d6f59fc23aab6ee373a131df82d upstream. Commit bbd426f542cb "MIPS: Simplify FP context access" modified the SIFROMREG & SIFROMHREG macros such that they return unsigned rather than signed 32b integers. I had believed that to be fine, but inadvertently missed the MFC1 & MFHC1 cases which write to a struct pt_regs regs element. On MIPS32 this is fine, but on 64 bit those saved regs' fields are 64 bit wide. Using unsigned values caused the 32 bit value from the FP register to be zero rather than sign extended as the architecture specifies, causing incorrect emulation of the MFC1 & MFHc1 instructions. Fix by reintroducing the casts to signed integers, and therefore the sign extension. Signed-off-by: Paul Burton <paul.burton@imgtec.com> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/7848/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-05	MIPS: ZBOOT: add missing <linux/string.h> include	Aurelien Jarno
	commit 29593fd5a8149462ed6fad0d522234facdaee6c8 upstream. Commit dc4d7b37 (MIPS: ZBOOT: gather string functions into string.c) moved the string related functions into a separate file, which might cause the following build error, depending on the configuration: \| CC arch/mips/boot/compressed/decompress.o \| In file included from linux/arch/mips/boot/compressed/../../../../lib/decompress_unxz.c:234:0, \| from linux/arch/mips/boot/compressed/decompress.c:67: \| linux/arch/mips/boot/compressed/../../../../lib/xz/xz_dec_stream.c: In function 'fill_temp': \| linux/arch/mips/boot/compressed/../../../../lib/xz/xz_dec_stream.c:162:2: error: implicit declaration of function 'memcpy' [-Werror=implicit-function-declaration] \| cc1: some warnings being treated as errors \| linux/scripts/Makefile.build:308: recipe for target 'arch/mips/boot/compressed/decompress.o' failed \| make[6]: *** [arch/mips/boot/compressed/decompress.o] Error 1 \| linux/arch/mips/Makefile:308: recipe for target 'vmlinuz' failed It does not fail with the standard configuration, as when CONFIG_DYNAMIC_DEBUG is not enabled <linux/string.h> gets included in include/linux/dynamic_debug.h. There might be other ways for it to get indirectly included. We can't add the include directly in xz_dec_stream.c as some architectures might want to use a different version for the boot/ directory (see for example arch/x86/boot/string.h). Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/7420/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-05	ARM: 8178/1: fix set_tls for !CONFIG_KUSER_HELPERS	Nathan Lynch
	commit 9cc6d9e5daaa147a9a3e31557efcb331989e77be upstream. Joachim Eastwood reports that commit fbfb872f5f41 "ARM: 8148/1: flush TLS and thumbee register state during exec" causes a boot-time crash on a Cortex-M4 nommu system: Freeing unused kernel memory: 68K (281e5000 - 281f6000) Unhandled exception: IPSR = 00000005 LR = fffffff1 CPU: 0 PID: 1 Comm: swapper Not tainted 3.17.0-rc6-00313-gd2205fa30aa7 #191 task: 29834000 ti: 29832000 task.ti: 29832000 PC is at flush_thread+0x2e/0x40 LR is at flush_thread+0x21/0x40 pc : [<2800954a>] lr : [<2800953d>] psr: 4100000b sp : 29833d60 ip : 00000000 fp : 00000001 r10: 00003cf8 r9 : 29b1f000 r8 : 00000000 r7 : 29b0bc00 r6 : 29834000 r5 : 29832000 r4 : 29832000 r3 : ffff0ff0 r2 : 29832000 r1 : 00000000 r0 : 282121f0 xPSR: 4100000b CPU: 0 PID: 1 Comm: swapper Not tainted 3.17.0-rc6-00313-gd2205fa30aa7 #191 [<2800afa5>] (unwind_backtrace) from [<2800a327>] (show_stack+0xb/0xc) [<2800a327>] (show_stack) from [<2800a963>] (__invalid_entry+0x4b/0x4c) The problem is that set_tls is attempting to clear the TLS location in the kernel-user helper page, which isn't set up on V7M. Fix this by guarding the write to the kuser helper page with a CONFIG_KUSER_HELPERS ifdef. Fixes: fbfb872f5f41 ARM: 8148/1: flush TLS and thumbee register state during exec Reported-by: Joachim Eastwood <manabian@gmail.com> Tested-by: Joachim Eastwood <manabian@gmail.com> Signed-off-by: Nathan Lynch <nathan_lynch@mentor.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-05	ARM: 8165/1: alignment: don't break misaligned NEON load/store	Robin Murphy
	commit 5ca918e5e3f9df4634077c06585c42bc6a8d699a upstream. The alignment fixup incorrectly decodes faulting ARM VLDn/VSTn instructions (where the optional alignment hint is given but incorrect) as LDR/STR, leading to register corruption. Detect these and correctly treat them as unhandled, so that userspace gets the fault it expects. Reported-by: Simon Hosie <simon.hosie@arm.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-05	ARM: imx: fix .is_enabled() of shared gate clock	Shawn Guo
	commit 9e1ac462b982f496ed3b491f02c417dcc8e40347 upstream. Commit 63288b721a80 ("ARM: imx: fix shared gate clock") attempted to fix an issue with particular enable/disable sequence from two shared gate clocks. But unfortunately, while it partially fixed the issue, it also did something wrong in .is_enabled() function hook. In case of shared gate, the function shouldn't really query the hardware state via share_count, because the function is trying to query the enabling state of the clock in question, not the hardware state which is shared by multiple clocks. Fix the issue by returning the enable_count of the clock itself which is maintained by clock core, in case it's a clock sharing hardware gate with others. As the result, the initialization of share_count per hardware state is not needed now. So remove it. Reported-by: Fabio Estevam <fabio.estevam@freescale.com> Fixes: 63288b721a80 ("ARM: imx: fix shared gate clock") Signed-off-by: Shawn Guo <shawn.guo@freescale.com> Tested-by: Fabio Estevam <fabio.estevam@freescale.com> Signed-off-by: Olof Johansson <olof@lixom.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-05	ARM: DT: imx53: fix lvds channel 1 port	Markus Niebel
	commit 1b134c9c4b555342be667f144ee714af1c3f6a9f upstream. using LVDS channel 1 on an i.MX53 leads to following error: imx-ldb 53fa8008.ldb: unable to set di0 parent clock to ldb_di1 This comes from imx_ldb_set_clock with mux = 0. Mux parameter must be "1" for reparenting di1 clock to ldb_di1. The value of the mux param comes from device tree port settings. On i.MX5, the internal two-input-multiplexer is used. Due to hardware limitations, only one port (port@[0,1]) can be used for each channel (lvds-channel@[0,1], respectively) Documentation update suggested by Philipp Zabel <p.zabel@pengutronix.de> Signed-off-by: Markus Niebel <Markus.Niebel@tq-group.com> Fixes: e05c8c9a790a ("ARM: dts: imx53: Add IPU DI ports and endpoints, move imx-drm node to dtsi") Acked-by: Philipp Zabel <p.zabel@pengutronix.de> Signed-off-by: Shawn Guo <shawn.guo@freescale.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-05	ARM: dts: dra7-evm: Fix NAND GPMC timings	Roger Quadros
	commit 5990047cef0c6a36eefcb166bd32d99a8f95c75b upstream. The nand timings were scaled down by 2 to account for the 2x rate returned by clk_get_rate(gpmc_fclk). As the clock data got fixed by [1], revert back to actual timings (i.e. scale them up by 2). Without this NAND doesn't work on dra7-evm. [1] - commit dd94324b983afe114ba9e7ee3649313b451f63ce ARM: dts: dra7xx-clocks: Fix the l3 and l4 clock rates Fixes: ff66a3c86e00 ("ARM: dts: dra7: add support for parallel NAND flash") Signed-off-by: Roger Quadros <rogerq@ti.com> Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-05	ARM: 8149/1: perf: Don't sleep while atomic when enabling per-cpu interrupts	Stephen Boyd
	commit 505013bc9065391f09a51d51cd3bf0b06dfb570a upstream. Rob Clark reports a sleeping while atomic bug when using perf. BUG: sleeping function called from invalid context at ../kernel/locking/mutex.c:583 in_atomic(): 1, irqs_disabled(): 128, pid: 0, name: swapper/0 ------------[ cut here ]------------ WARNING: CPU: 2 PID: 4828 at ../kernel/locking/mutex.c:479 mutex_lock_nested+0x3a0/0x3e8() DEBUG_LOCKS_WARN_ON(in_interrupt()) Modules linked in: CPU: 2 PID: 4828 Comm: Xorg.bin Tainted: G W 3.17.0-rc3-00234-gd535c45-dirty #819 [<c0216690>] (unwind_backtrace) from [<c0212174>] (show_stack+0x10/0x14) [<c0212174>] (show_stack) from [<c0867cc0>] (dump_stack+0x98/0xb8) [<c0867cc0>] (dump_stack) from [<c02492a4>] (warn_slowpath_common+0x70/0x8c) [<c02492a4>] (warn_slowpath_common) from [<c02492f0>] (warn_slowpath_fmt+0x30/0x40) [<c02492f0>] (warn_slowpath_fmt) from [<c086a3f8>] (mutex_lock_nested+0x3a0/0x3e8) [<c086a3f8>] (mutex_lock_nested) from [<c0294d08>] (irq_find_host+0x20/0x9c) [<c0294d08>] (irq_find_host) from [<c0769d50>] (of_irq_get+0x28/0x48) [<c0769d50>] (of_irq_get) from [<c057d104>] (platform_get_irq+0x1c/0x8c) [<c057d104>] (platform_get_irq) from [<c021a06c>] (cpu_pmu_enable_percpu_irq+0x14/0x38) [<c021a06c>] (cpu_pmu_enable_percpu_irq) from [<c02b1634>] (flush_smp_call_function_queue+0x88/0x178) [<c02b1634>] (flush_smp_call_function_queue) from [<c0214dc0>] (handle_IPI+0x88/0x160) [<c0214dc0>] (handle_IPI) from [<c0208930>] (gic_handle_irq+0x64/0x68) [<c0208930>] (gic_handle_irq) from [<c0212d04>] (__irq_svc+0x44/0x5c) Exception stack(0xe63ddea0 to 0xe63ddee8) dea0: 00000001 00000001 00000000 c2f3b200 c16db380 c032d4a0 e63ddf40 60010013 dec0: 00000000 001fbfd4 00000100 00000000 00000001 e63ddee8 c0284770 c02a2e30 dee0: 20010013 ffffffff [<c0212d04>] (__irq_svc) from [<c02a2e30>] (ktime_get_ts64+0x1c8/0x200) [<c02a2e30>] (ktime_get_ts64) from [<c032d4a0>] (poll_select_set_timeout+0x60/0xa8) [<c032d4a0>] (poll_select_set_timeout) from [<c032df64>] (SyS_select+0xa8/0x118) [<c032df64>] (SyS_select) from [<c020e8e0>] (ret_fast_syscall+0x0/0x48) ---[ end trace 0bb583b46342da6f ]--- INFO: lockdep is turned off. We don't really need to get the platform irq again when we're enabling or disabling the per-cpu irq. Furthermore, we don't really need to set and clear bits in the active_irqs bitmask because that's only used in the non-percpu irq case to figure out when the last CPU PMU has been disabled. Just pass the irq directly to the enable/disable functions to clean all this up. This should be slightly more efficient and also fix the scheduling while atomic bug. Fixes: bbd64559376f "ARM: perf: support percpu irqs for the CPU PMU" Reported-by: Rob Clark <robdclark@gmail.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-05	ARM: 8148/1: flush TLS and thumbee register state during exec	Nathan Lynch
	commit fbfb872f5f417cea48760c535e0ff027c88b507a upstream. The TPIDRURO and TPIDRURW registers need to be flushed during exec; otherwise TLS information is potentially leaked. TPIDRURO in particular needs careful treatment. Since flush_thread basically needs the same code used to set the TLS in arm_syscall, pull that into a common set_tls helper in tls.h and use it in both places. Similarly, TEEHBR needs to be cleared during exec as well. Clearing its save slot in thread_info isn't right as there is no guarantee that a thread switch will occur before the new program runs. Just setting the register directly is sufficient. Signed-off-by: Nathan Lynch <nathan_lynch@mentor.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-05	ARM: 8133/1: use irq_set_affinity with force=false when migrating irqs	Sudeep Holla
	commit a040803a9d6b8c1876d3487a5cb69602ebcbb82c upstream. Since commit 1dbfa187dad ("ARM: irq migration: force migration off CPU going down") the ARM interrupt migration code on cpu offline calls irqchip.irq_set_affinity() with the argument force=true. At the point of this change the argument had no effect because it was not used by any interrupt chip driver and there was no semantics defined. This changed with commit 01f8fa4f01d8 ("genirq: Allow forcing cpu affinity of interrupts") which made the force argument useful to route interrupts to not yet online cpus without checking the target cpu against the cpu online mask. The following commit ffde1de64012 ("irqchip: gic: Support forced affinity setting") implemented this for the GIC interrupt controller. As a consequence the ARM cpu offline irq migration fails if CPU0 is offlined, because CPU0 is still set in the affinity mask and the validataion against cpu online mask is skipped to the force argument being true. The following first_cpu(mask) selection always selects CPU0 as the target. Solve the issue by calling irq_set_affinity() with force=false from the CPU offline irq migration code so the GIC driver validates the affinity mask against CPU online mask and therefore removes CPU0 from the possible target candidates. Tested on TC2 hotpluging CPU0 in and out. Without this patch the system locks up as the IRQs are not migrated away from CPU0. Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-05	ARM: dts: dra7-evm: Fix spi1 mux documentation	Nishanth Menon
	commit 68e4d9e58dbae2fb178e8b74806f521adb16f0d3 upstream. While auditing the various pin ctrl configurations using the following command: grep PIN_ arch/arm/boot/dts/dra7-evm.dts\|(while read line; do v=`echo "$line" \| sed -e "s/\s\s*/\|/g" \| cut -d '\|' -f1 \| cut -d 'x' -f2\|tr [a-z] [A-Z]`; HEX=`echo "obase=16;ibase=16;4A003400+$v"\| bc`; echo "$HEX ===> $line"; done) against DRA75x/74x NDA TRM revision S(SPRUHI2S August 2014), documentation errors were found for spi1 pinctrl. Fix the same. Fixes: 6e58b8f1daaf1af ("ARM: dts: DRA7: Add the dts files for dra7 SoC and dra7-evm board") Signed-off-by: Nishanth Menon <nm@ti.com> Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-05	ARM: edma: Fix configuration parsing for SoCs with multiple eDMA3 CC	Peter Ujfalusi
	commit 929a015b1809a30748d487f9d25b16a41434b61a upstream. The edma_setup_from_hw() should know about the CC number when parsing the CCCFG register - when it reads the register to be precise. The base addresses for CCs stored in an array and we need to provide the correct id to edma_read() in order to read the correct register. Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com> Signed-off-by: Sekhar Nori <nsekhar@ti.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-05	ARM: dts: DRA7: fix interrupt-cells for GPIO	Nishanth Menon
	commit e49d519c456f4fb6f4a0473bc04ba30bb805fce5 upstream. GPIO modules are also interrupt sources. However, they require both the GPIO number and IRQ type to function properly. By declaring that GPIO uses interrupt-cells=<1>, we essentially do not allow users of the nodes to use the interrupt property appropritely. With this change, the following now works: interrupt-parent = <&gpio6>; interrupts = <5 IRQ_TYPE_LEVEL_LOW>; Fixes: 6e58b8f1daaf ('ARM: dts: DRA7: Add the dts files for dra7 SoC and dra7-evm board') Signed-off-by: Nishanth Menon <nm@ti.com> Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-05	ARM: DRA7: hwmod: Add dra74x and dra72x specific ocp interface lists	Rajendra Nayak
	commit f7f7a29bf0cf25af23f37e5b5bf1368b85705286 upstream. To deal with IPs which are specific to dra74x and dra72x, maintain seperate ocp interface lists, while keeping the common list for all common IPs. Move USB OTG SS4 to dra74x only list since its unavailable in dra72x and is giving an abort during boot. The dra72x only list is empty for now and a placeholder for future hwmod additions which are specific to dra72x. Fixes: d904b38df0db13 ("ARM: DRA7: hwmod: Add SYSCONFIG for usb_otg_ss") Reported-by: Keerthy <j-keerthy@ti.com> Signed-off-by: Rajendra Nayak <rnayak@ti.com> Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com> Tested-by: Nishanth Menon <nm@ti.com> [paul@pwsan.com: fixed comment style to conform with CodingStyle] Signed-off-by: Paul Walmsley <paul@pwsan.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-05	ARM: dts: imx53-qsrb: Fix suspend/resume	Fabio Estevam
	commit 090727b880ff3c56e333f267cc24ab076f3bc096 upstream. The following error is seen after a suspend/resume cycle on a mx53qsb with a MC34708 PMIC: root@freescale /$ echo mem > /sys/power/state [ 32.630592] PM: Syncing filesystems ... done. [ 32.643924] Freezing user space processes ... (elapsed 0.001 seconds) done. [ 32.652384] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done. [ 32.679156] PM: suspend of devices complete after 13.113 msecs [ 32.685128] PM: suspend devices took 0.030 seconds [ 32.696109] PM: late suspend of devices complete after 6.133 msecs [ 33.313032] mc13xxx 0-0008: Failed to read IRQ status: -110 [ 33.322009] PM: noirq suspend of devices complete after 619.667 msecs [ 33.328544] Disabling non-boot CPUs ... [ 33.335031] PM: noirq resume of devices complete after 2.352 msecs [ 33.842940] mc13xxx 0-0008: Failed to read IRQ status: -110 [ 33.976095] [sched_delayed] sched: RT throttling activated [ 33.984804] PM: early resume of devices complete after 642.642 msecs [ 34.352954] mc13xxx 0-0008: Failed to read IRQ status: -110 [ 34.862910] mc13xxx 0-0008: Failed to read IRQ status: -110 [ 34.996595] PM: resume of devices complete after 1005.367 msecs [ 35.372925] mc13xxx 0-0008: Failed to read IRQ status: -110 [ 35.882911] mc13xxx 0-0008: Failed to read IRQ status: -110 [ 35.955707] PM: resume devices took 1.970 seconds [ 35.960445] Restarting tasks ... done. [ 35.993386] fec 63fec000.ethernet eth0: Link is Down [ 36.392980] mc13xxx 0-0008: Failed to read IRQ status: -110 [ 36.902908] mc13xxx 0-0008: Failed to read IRQ status: -110 [ 36.953036] ata1: SATA link down (SStatus 0 SControl 300) [ 37.412922] mc13xxx 0-0008: Failed to read IRQ status: -110 [ 37.922906] mc13xxx 0-0008: Failed to read IRQ status: -110 [ 37.993379] fec 63fec000.ethernet eth0: Link is Up - 100Mbps/Full - flow control rx/tx [ 38.432938] mc13xxx 0-0008: Failed to read IRQ status: -110 [ 38.942920] mc13xxx 0-0008: Failed to read IRQ status: -110 [ 39.452933] mc13xxx 0-0008: Failed to read IRQ status: -110 (flood of this error message continues forever) Commit 5169df8be0a432ee ("ARM: dts: i.MX53: add support for MCIMX53-START-R") missed to configure the IOMUX for the PMIC IRQ pin. Configure the PMIC IRQ pin so that the suspend/resume sequence behaves cleanly as expected. Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com> Signed-off-by: Shawn Guo <shawn.guo@freescale.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-05	ARM: 8129/1: errata: work around Cortex-A15 erratum 830321 using dummy strex	Mark Rutland
	commit 2c32c65e3726c773760038910be30cce1b4d4149 upstream. On revisions of Cortex-A15 prior to r3p3, a CLREX instruction at PL1 may falsely trigger a watchpoint exception, leading to potential data aborts during exception return and/or livelock. This patch resolves the issue in the following ways: - Replacing our uses of CLREX with a dummy STREX sequence instead (as we did for v6 CPUs). - Removing the clrex code from v7_exit_coherency_flush and derivatives, since this only exists as a minor performance improvement when non-cached exclusives are in use (Linux doesn't use these). Benchmarking on a variety of ARM cores revealed no measurable performance difference with this change applied, so the change is performed unconditionally and no new Kconfig entry is added. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-05	ARM: 8128/1: abort: don't clear the exclusive monitors	Mark Rutland
	commit 85868313177700d20644263a782351262d2aff84 upstream. The ARMv6 and ARMv7 early abort handlers clear the exclusive monitors upon entry to the kernel, but this is redundant: - We clear the monitors on every exception return since commit 200b812d0084 ("Clear the exclusive monitor when returning from an exception"), so this is not necessary to ensure the monitors are cleared before returning from a fault handler. - Any dummy STREX will target a temporary scratch area in memory, and may succeed or fail without corrupting useful data. Its status value will not be used. - Any other STREX in the kernel must be preceded by an LDREX, which will initialise the monitors consistently and will not depend on the earlier state of the monitors. Therefore we have no reason to care about the initial state of the exclusive monitors when a data abort is taken, and clearing the monitors prior to exception return (as we already do) is sufficient. This patch removes the redundant clearing of the exclusive monitors from the early abort handlers. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-05	xtensa: fix a6 and a7 handling in fast_syscall_xtensa	Max Filippov
	commit d1b6ba82a50cecf94be540a3a153aa89d97511a0 upstream. Remove restoring a6 on some return paths and instead modify and restore it in a single place, using symbolic name. Correctly restore a7 from PT_AREG7 in case of illegal a6 value. Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-05	xtensa: fix TLBTEMP_BASE_2 region handling in fast_second_level_miss	Max Filippov
	commit 7128039fe2dd3d59da9e4ffa036f3aaa3ba87b9f upstream. Current definition of TLBTEMP_BASE_2 is always 32K above the TLBTEMP_BASE_1, whereas fast_second_level_miss handler for the TLBTEMP region analyzes virtual address bit (PAGE_SHIFT + DCACHE_ALIAS_ORDER) to determine TLBTEMP region where the fault happened. The size of the TLBTEMP region is also checked incorrectly: not 64K, but twice data cache way size (whicht may as well be less than the instruction cache way size). Fix TLBTEMP_BASE_2 to be TLBTEMP_BASE_1 + data cache way size. Provide TLBTEMP_SIZE that is a greater of doubled data cache way size or the instruction cache way size, and use it to determine if the second level TLB miss occured in the TLBTEMP region. Practical occurence of page faults in the TLBTEMP area is extremely rare, this code can be tested by deletion of all w[di]tlb instructions in the tlbtemp_mapping region. Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-05	xtensa: fix access to THREAD_RA/THREAD_SP/THREAD_DS	Max Filippov
	commit 52247123749cc3cbc30168b33ad8c69515c96d23 upstream. With SMP and a lot of debug options enabled task_struct::thread gets out of reach of s32i/l32i instructions with base pointing at task_struct, breaking build with the following messages: arch/xtensa/kernel/entry.S: Assembler messages: arch/xtensa/kernel/entry.S:1002: Error: operand 3 of 'l32i.n' has invalid value '1048' arch/xtensa/kernel/entry.S:1831: Error: operand 3 of 's32i.n' has invalid value '1040' arch/xtensa/kernel/entry.S:1832: Error: operand 3 of 's32i.n' has invalid value '1044' Change base to point to task_struct::thread in such cases. Don't use a10 in _switch_to to save/restore prev pointer as a2 is not clobbered. Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-05	xtensa: fix address checks in dma_{alloc,free}_coherent	Alan Douglas
	commit 1ca49463c44c970b1ab1d71b0f268bfdf8427a7e upstream. Virtual address is translated to the XCHAL_KSEG_CACHED region in the dma_free_coherent, but is checked to be in the 0...XCHAL_KSEG_SIZE range. Change check for end of the range from 'addr >= X' to 'addr > X - 1' to handle the case of X == 0. Replace 'if (C) BUG();' construct with 'BUG_ON(C);'. Signed-off-by: Alan Douglas <adouglas@cadence.com> Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-05	xtensa: replace IOCTL code definitions with constants	Max Filippov
	commit f61bf8e7d19e0a3456a7a9ed97c399e4353698dc upstream. This fixes userspace code that builds on other architectures but fails on xtensa due to references to structures that other architectures don't refer to. E.g. this fixes the following issue with python-2.7.8: python-2.7.8/Modules/termios.c:861:25: error: invalid application of 'sizeof' to incomplete type 'struct serial_multiport_struct' {"TIOCSERGETMULTI", TIOCSERGETMULTI}, python-2.7.8/Modules/termios.c:870:25: error: invalid application of 'sizeof' to incomplete type 'struct serial_multiport_struct' {"TIOCSERSETMULTI", TIOCSERSETMULTI}, python-2.7.8/Modules/termios.c:900:24: error: invalid application of 'sizeof' to incomplete type 'struct tty_struct' {"TIOCTTYGSTRUCT", TIOCTTYGSTRUCT}, Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-05	arm64: ptrace: fix compat hardware watchpoint reporting	Will Deacon
	commit 27d7ff273c2aad37b28f6ff0cab2cfa35b51e648 upstream. I'm not sure what I was on when I wrote this, but when iterating over the hardware watchpoint array (hbp_watch_array), our index is off by ARM_MAX_BRP, so we walk off the end of our thread_struct... ... except, a dodgy condition in the loop means that it never executes at all (bp cannot be NULL). This patch fixes the code so that we remove the bp check and use the correct index for accessing the watchpoint structures. Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-05	ARM/ARM64: KVM: Nuke Hyp-mode tlbs before enabling MMU	Pranavkumar Sawargaonkar
	commit f6edbbf36da3a27b298b66c7955fc84e1dcca305 upstream. X-Gene u-boot runs in EL2 mode with MMU enabled hence we might have stale EL2 tlb enteris when we enable EL2 MMU on each host CPU. This can happen on any ARM/ARM64 board running bootloader in Hyp-mode (or EL2-mode) with MMU enabled. This patch ensures that we flush all Hyp-mode (or EL2-mode) TLBs on each host CPU before enabling Hyp-mode (or EL2-mode) MMU. Tested-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org> Signed-off-by: Anup Patel <anup.patel@linaro.org> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>