aboutsummaryrefslogtreecommitdiff
path: root/arch/x86
AgeCommit message (Collapse)Author
2009-10-05Fix NULL ptr regression in powernow-k8Kurt Roeckx
commit f0adb134d8dc9993a9998dc50845ec4f6ff4fadc upstream. Fixes bugzilla #13780 From: Kurt Roeckx <kurt@roeckx.be> Signed-off-by: Dave Jones <davej@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-10-05Revert "KVM: x86: check for cr3 validity in ioctl_set_sregs"Marcelo Tosatti
(cherry picked from commit dc7e795e3dd2a763e5ceaa1615f307e808cf3932) This reverts commit 6c20e1442bb1c62914bb85b7f4a38973d2a423ba. To my understanding, it became obsolete with the advent of the more robust check in mmu_alloc_roots (89da4ff17f). Moreover, it prevents the conceptually safe pattern 1. set sregs 2. register mem-slots 3. run vcpu by setting a sticky triple fault during step 1. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-10-05KVM: fix cpuid E2BIG handling for extended request typesMark McLoughlin
(cherry picked from commit cb007648de83cf226d69ec76e1c01848b4e8e49f) If we run out of cpuid entries for extended request types we should return -E2BIG, just like we do for the standard request types. Signed-off-by: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-10-05KVM guest: fix bogus wallclock physical address calculationGlauber Costa
(cherry picked from commit a20316d2aa41a8f4fd171648bad8f044f6060826) The use of __pa() to calculate the address of a C-visible symbol is wrong, and can lead to unpredictable results. See arch/x86/include/asm/page.h for details. It should be replaced with __pa_symbol(), that does the correct math here, by taking relocations into account. This ensures the correct wallclock data structure physical address is passed to the hypervisor. Signed-off-by: Glauber Costa <glommer@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-10-05KVM: limit lapic periodic timer frequencyMarcelo Tosatti
(cherry picked from commit 1444885a045fe3b1905a14ea1b52540bf556578b) Otherwise its possible to starve the host by programming lapic timer with a very high frequency. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-10-05KVM: MMU: fix bogus alloc_mmu_pages assignmentMarcelo Tosatti
(cherry picked from commit b90c062c65cc8839edfac39778a37a55ca9bda36) Remove the bogus n_free_mmu_pages assignment from alloc_mmu_pages. It breaks accounting of mmu pages, since n_free_mmu_pages is modified but the real number of pages remains the same. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-10-05KVM: MMU: fix missing locking in alloc_mmu_pagesMarcelo Tosatti
(cherry picked from commit 6a1ac77110ee3e8d8dfdef8442f3b30b3d83e6a2) n_requested_mmu_pages/n_free_mmu_pages are used by kvm_mmu_change_mmu_pages to calculate the number of pages to zap. alloc_mmu_pages, called from the vcpu initialization path, modifies this variables without proper locking, which can result in a negative value in kvm_mmu_change_mmu_pages (say, with cpu hotplug). Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-10-05KVM: x86: Disallow hypercalls for guest callers in rings > 0Jan Kiszka
(cherry picked from commit 07708c4af1346ab1521b26a202f438366b7bcffd) So far unprivileged guest callers running in ring 3 can issue, e.g., MMU hypercalls. Normally, such callers cannot provide any hand-crafted MMU command structure as it has to be passed by its physical address, but they can still crash the guest kernel by passing random addresses. To close the hole, this patch considers hypercalls valid only if issued from guest ring 0. This may still be relaxed on a per-hypercall base in the future once required. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-10-05KVM: MMU: make __kvm_mmu_free_some_pages handle empty listIzik Eidus
(cherry picked from commit 3b80fffe2b31fb716d3ebe729c54464ee7856723) First check if the list is empty before attempting to look at list entries. Signed-off-by: Izik Eidus <ieidus@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-10-05KVM: VMX: Fix cr8 exiting control clobbering by EPTGleb Natapov
(cherry picked from commit 5fff7d270bd6a4759b6d663741b729cdee370257) Don't call adjust_vmx_controls() two times for the same control. It restores options that were dropped earlier. This loses us the cr8 exit control, which causes a massive performance regression Windows x64. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-10-05KVM: VMX: Check cpl before emulating debug register accessAvi Kivity
(cherry picked from commit 0a79b009525b160081d75cef5dbf45817956acf2) Debug registers may only be accessed from cpl 0. Unfortunately, vmx will code to emulate the instruction even though it was issued from guest userspace, possibly leading to an unexpected trap later. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-09-24x86, pat: Fix cacheflush address in change_page_attr_set_clr()Jack Steiner
commit fa526d0d641b5365676a1fb821ce359e217c9b85 upstream. Fix address passed to cpa_flush_range() when changing page attributes from WB to UC. The address (*addr) is modified by __change_page_attr_set_clr(). The result is that the pages being flushed start at the _end_ of the changed range instead of the beginning. This should be considered for 2.6.30-stable and 2.6.31-stable. Signed-off-by: Jack Steiner <steiner@sgi.com> Acked-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-09-24x86/i386: Make sure stack-protector segment base is cache alignedJeremy Fitzhardinge
commit 1ea0d14e480c245683927eecc03a70faf06e80c8 upstream. The Intel Optimization Reference Guide says: In Intel Atom microarchitecture, the address generation unit assumes that the segment base will be 0 by default. Non-zero segment base will cause load and store operations to experience a delay. - If the segment base isn't aligned to a cache line boundary, the max throughput of memory operations is reduced to one [e]very 9 cycles. [...] Assembly/Compiler Coding Rule 15. (H impact, ML generality) For Intel Atom processors, use segments with base set to 0 whenever possible; avoid non-zero segment base address that is not aligned to cache line boundary at all cost. We can't avoid having a non-zero base for the stack-protector segment, but we can make it cache-aligned. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> LKML-Reference: <4AA01893.6000507@goop.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-09-24x86: Fix x86_model test in es7000_apic_is_cluster()Roel Kluin
commit 005155b1f626d2b2d7932e4afdf4fead168c6888 upstream. For the x86_model to be greater than 6 or less than 12 is logically always true. Signed-off-by: Roel Kluin <roel.kluin@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-09-08x86, amd: Don't probe for extended APIC ID if APICs are disabledJeremy Fitzhardinge
commit 2cb078603abb612e3bcd428fb8122c3d39e08832 upstream. If we've logically disabled apics, don't probe the PCI space for the AMD extended APIC ID. [ Impact: prevent boot crash under Xen. ] Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Reported-by: Bastian Blank <bastian@waldi.eu.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: Andreas Herrmann <andreas.herrmann3@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-09-08KVM: Fix KVM_GET_MSR_INDEX_LISTJan Kiszka
commit e125e7b6944898831b56739a5448e705578bf7e2 upstream. So far, KVM copied the emulated_msrs (only MSR_IA32_MISC_ENABLE) to a wrong address in user space due to broken pointer arithmetic. This caused subtle corruption up there (missing MSR_IA32_MISC_ENABLE had probably no practical relevance). Moreover, the size check for the user-provided kvm_msr_list forgot about emulated MSRs. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-09-08KVM: MMU: limit rmap chain lengthMarcelo Tosatti
(cherry picked from commit 53a27b39ff4d2492f84b1fdc2f0047175f0b0b93) Otherwise the host can spend too long traversing an rmap chain, which happens under a spinlock. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-09-08KVM: MMU: handle n_free_mmu_pages > n_alloc_mmu_pages in ↵Marcelo Tosatti
kvm_mmu_change_mmu_pages (cherry picked from commit 025dbbf36a7680bffe54d9dcbf0a8bc01a7cbd10) kvm_mmu_change_mmu_pages mishandles the case where n_alloc_mmu_pages is smaller then n_free_mmu_pages, by not checking if the result of the subtraction is negative. Its a valid condition which can happen if a large number of pages has been recently freed. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-09-08KVM: SVM: force new asid on vcpu migrationMarcelo Tosatti
(cherry picked from commit 4b656b1202498184a0ecef86b3b89ff613b9c6ab) If a migrated vcpu matches the asid_generation value of the target pcpu, there will be no TLB flush via TLB_CONTROL_FLUSH_ALL_ASID. The check for vcpu.cpu in pre_svm_run is meaningless since svm_vcpu_load already updated it on schedule in. Such vcpu will VMRUN with stale TLB entries. Based on original patch from Joerg Roedel (http://patchwork.kernel.org/patch/10021/) Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Acked-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-09-08KVM: x86: verify MTRR/PAT validityMarcelo Tosatti
(cherry picked from commit d6289b9365c3f622a8cfe62c4fb054bb70b5061a) Do not allow invalid memory types in MTRR/PAT (generating a #GP otherwise). Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-09-08KVM: Fix cpuid feature misreportingAvi Kivity
(cherry picked from commit 8d753f369bd28fff1706ffe9fb9fea4fd88cf85b) MTRR, PAT, MCE, and MCA are all supported (to some extent) but not reported. Vista requires these features, so if userspace relies on kernel cpuid reporting, it loses support for Vista. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-09-08KVM: Ignore reads to K7 EVNTSEL MSRsAmit Shah
(cherry picked from commit 9e6996240afcbe61682eab8eeaeb65c34333164d) In commit 7fe29e0faacb650d31b9e9f538203a157bec821d we ignored the reads to the P6 EVNTSEL MSRs. That fixed crashes on Intel machines. Ignore the reads to K7 EVNTSEL MSRs as well to fix this on AMD hosts. This fixes Kaspersky antivirus crashing Windows guests on AMD hosts. Signed-off-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-09-08KVM: x86: Ignore reads to EVNTSEL MSRsAmit Shah
(cherry picked from commit 7fe29e0faacb650d31b9e9f538203a157bec821d) We ignore writes to the performance counters and performance event selector registers already. Kaspersky antivirus reads the eventsel MSR causing it to crash with the current behaviour. Return 0 as data when the eventsel registers are read to stop the crash. Signed-off-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-09-08KVM: MMU: Use different shadows when EFER.NXE changesAvi Kivity
(cherry picked from commit 9645bb56b31a1b70ab9e470387b5264cafc04aa9) A pte that is shadowed when the guest EFER.NXE=1 is not valid when EFER.NXE=0; if bit 63 is set, the pte should cause a fault, and since the shadow EFER always has NX enabled, this won't happen. Fix by using a different shadow page table for different EFER.NXE bits. This allows vcpus to run correctly with different values of EFER.NXE, and for transitions on this bit to be handled correctly without requiring a full flush. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-09-08KVM: Deal with interrupt shadow state for emulated instructionsGlauber Costa
(cherry picked from commit 310b5d306c1aee7ebe32f702c0e33e7988d50646) We currently unblock shadow interrupt state when we skip an instruction, but failing to do so when we actually emulate one. This blocks interrupts in key instruction blocks, in particular sti; hlt; sequences If the instruction emulated is an sti, we have to block shadow interrupts. The same goes for mov ss. pop ss also needs it, but we don't currently emulate it. Without this patch, I cannot boot gpxe option roms at vmx machines. This is described at https://bugzilla.redhat.com/show_bug.cgi?id=494469 Signed-off-by: Glauber Costa <glommer@redhat.com> CC: H. Peter Anvin <hpa@zytor.com> CC: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-09-08KVM: Introduce {set/get}_interrupt_shadow()Glauber Costa
This patch introduces set/get_interrupt_shadow(), that does exactly what the name suggests. It also replaces open code that explicitly does it with the now existent functions. It differs slightly from upstream, because upstream merged it after gleb's interrupt rework, that we don't ship. Just for reference, upstream changelog is (2809f5d2c4cfad171167b131bb2a21ab65eba40f): This patch replaces drop_interrupt_shadow with the more general set_interrupt_shadow, that can either drop or raise it, depending on its parameter. It also adds ->get_interrupt_shadow() for future use. Signed-off-by: Glauber Costa <glommer@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-09-08KVM: MMU: do not free active mmu pages in free_mmu_pages()Gleb Natapov
(cherry picked from commit f00be0cae4e6ad0a8c7be381c6d9be3586800b3e) free_mmu_pages() should only undo what alloc_mmu_pages() does. Free mmu pages from the generic VM destruction function, kvm_destroy_vm(). Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-09-08KVM: MMU: protect kvm_mmu_change_mmu_pages with mmu_lockMarcelo Tosatti
(cherry picked from commit 7c8a83b75a38a807d37f5a4398eca2a42c8cf513) kvm_handle_hva, called by MMU notifiers, manipulates mmu data only with the protection of mmu_lock. Update kvm_mmu_change_mmu_pages callers to take mmu_lock, thus protecting against kvm_handle_hva. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-09-08KVM: x86: check for cr3 validity in mmu_alloc_rootsMarcelo Tosatti
(cherry picked from commit 8986ecc0ef58c96eec48d8502c048f3ab67fd8e2) Verify the cr3 address stored in vcpu->arch.cr3 points to an existant memslot. If not, inject a triple fault. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-09-08x86: don't call '->send_IPI_mask()' with an empty maskLinus Torvalds
commit b04e6373d694e977c95ae0ae000e2c1e2cf92d73 upstream. As noted in 83d349f35e1ae72268c5104dbf9ab2ae635425d4 ("x86: don't send an IPI to the empty set of CPU's"), some APIC's will be very unhappy with an empty destination mask. That commit added a WARN_ON() for that case, and avoided the resulting problem, but didn't fix the underlying reason for why those empty mask cases happened. This fixes that, by checking the result of 'cpumask_andnot()' of the current CPU actually has any other CPU's left in the set of CPU's to be sent a TLB flush, and not calling down to the IPI code if the mask is empty. The reason this started happening at all is that we started passing just the CPU mask pointers around in commit 4595f9620 ("x86: change flush_tlb_others to take a const struct cpumask"), and when we did that, the cpumask was no longer thread-local. Before that commit, flush_tlb_mm() used to create it's own copy of 'mm->cpu_vm_mask' and pass that copy down to the low-level flush routines after having tested that it was not empty. But after changing it to just pass down the CPU mask pointer, the lower level TLB flush routines would now get a pointer to that 'mm->cpu_vm_mask', and that could still change - and become empty - after the test due to other CPU's having flushed their own TLB's. See http://bugzilla.kernel.org/show_bug.cgi?id=13933 for details. Tested-by: Thomas Björnell <thomas.bjornell@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-09-08x86: don't send an IPI to the empty set of CPU'sLinus Torvalds
commit 83d349f35e1ae72268c5104dbf9ab2ae635425d4 upstream. The default_send_IPI_mask_logical() function uses the "flat" APIC mode to send an IPI to a set of CPU's at once, but if that set happens to be empty, some older local APIC's will apparently be rather unhappy. So just warn if a caller gives us an empty mask, and ignore it. This fixes a regression in 2.6.30.x, due to commit 4595f9620 ("x86: change flush_tlb_others to take a const struct cpumask"), documented here: http://bugzilla.kernel.org/show_bug.cgi?id=13933 which causes a silent lock-up. It only seems to happen on PPro, P2, P3 and Athlon XP cores. Most developers sadly (or not so sadly, if you're a developer..) have more modern CPU's. Also, on x86-64 we don't use the flat APIC mode, so it would never trigger there even if the APIC didn't like sending an empty IPI mask. Reported-by: Pavel Vilim <wylda@volny.cz> Reported-and-tested-by: Thomas Björnell <thomas.bjornell@gmail.com> Reported-and-tested-by: Martin Rogge <marogge@onlinehome.de> Cc: Mike Travis <travis@sgi.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-08-16x86: Fix VMI && stack protectorAlok Kataria
commit 7d5b005652bc5ae3e1e0efc53fd0e25a643ec506 upstream. With CONFIG_STACK_PROTECTOR turned on, VMI doesn't boot with more than one processor. The problem is with the gs value not being initialized correctly when registering the secondary processor for VMI's case. The patch below initializes the gs value for the AP to __KERNEL_STACK_CANARY. Without this the secondary processor keeps on taking a GP on every gs access. Signed-off-by: Alok N Kataria <akataria@vmware.com> LKML-Reference: <1249425262.18955.40.camel@ank32.eng.vmware.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-08-16x86, pat: Fix set_memory_wc related corruptionPallipadi, Venkatesh
commit bdc6340f4eb68295b1e7c0ade2356b56dca93d93 upstream. Changeset 3869c4aa18835c8c61b44bd0f3ace36e9d3b5bd0 that went in after 2.6.30-rc1 was a seemingly small change to _set_memory_wc() to make it complaint with SDM requirements. But, introduced a nasty bug, which can result in crash and/or strange corruptions when set_memory_wc is used. One such crash reported here http://lkml.org/lkml/2009/7/30/94 Actually, that changeset introduced two bugs. * change_page_attr_set() takes &addr as first argument and can the addr value might have changed on return, even for single page change_page_attr_set() call. That will make the second change_page_attr_set() in this routine operate on unrelated addr, that can eventually cause strange corruptions and bad page state crash. * The second change_page_attr_set() call, before setting _PAGE_CACHE_WC, should clear the earlier _PAGE_CACHE_UC_MINUS, as otherwise cache attribute will not be WC (will be UC instead). The patch below fixes both these problems. Sending a single patch to fix both the problems, as the change is to the same line of code. The change to have a addr_copy is not very clean. But, it is simpler than making more changes through various routines in pageattr.c. A huge thanks to Jerome for reporting this problem and providing a simple test case that helped us root cause the problem. Reported-by: Jerome Glisse <glisse@freedesktop.org> Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> LKML-Reference: <20090730214319.GA1889@linux-os.sc.intel.com> Acked-by: Dave Airlie <airlied@redhat.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-08-16x86: fix assembly constraints in native_save_fl()H. Peter Anvin
commit f1f029c7bfbf4ee1918b90a431ab823bed812504 upstream. From Gabe Black in bugzilla 13888: native_save_fl is implemented as follows: 11static inline unsigned long native_save_fl(void) 12{ 13 unsigned long flags; 14 15 asm volatile("# __raw_save_flags\n\t" 16 "pushf ; pop %0" 17 : "=g" (flags) 18 : /* no input */ 19 : "memory"); 20 21 return flags; 22} If gcc chooses to put flags on the stack, for instance because this is inlined into a larger function with more register pressure, the offset of the flags variable from the stack pointer will change when the pushf is performed. gcc doesn't attempt to understand that fact, and address used for pop will still be the same. It will write to somewhere near flags on the stack but not actually into it and overwrite some other value. I saw this happen in the ide_device_add_all function when running in a simulator I work on. I'm assuming that some quirk of how the simulated hardware is set up caused the code path this is on to be executed when it normally wouldn't. A simple fix might be to change "=g" to "=r". Reported-by: Gabe Black <spamforgabe@umich.edu> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-08-16x86: Fix CPA memtype reserving in the set_pages_array*() casesThomas Hellstrom
commit 8523acfe40efc1a8d3da8f473ca67cb195b06f0c upstream. The code was incorrectly reserving memtypes using the page virtual address instead of the physical address. Furthermore, the code was not ignoring highmem pages as it ought to. ( upstream does not pass in highmem pages yet - but upcoming graphics code will do it and there's no reason to not handle this properly in the CPA APIs.) Fixes: http://bugzilla.kernel.org/show_bug.cgi?id=13884 Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Acked-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: dri-devel@lists.sourceforge.net Cc: venkatesh.pallipadi@intel.com LKML-Reference: <1249284345-7654-1-git-send-email-thellstrom@vmware.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-07-30x86: don't use 'access_ok()' as a range check in get_user_pages_fast()Linus Torvalds
[ Upstream commit 7f8189068726492950bf1a2dcfd9b51314560abf - modified for stable to not use the sloppy __VIRTUAL_MASK_SHIFT ] It's really not right to use 'access_ok()', since that is meant for the normal "get_user()" and "copy_from/to_user()" accesses, which are done through the TLB, rather than through the page tables. Why? access_ok() does both too few, and too many checks. Too many, because it is meant for regular kernel accesses that will not honor the 'user' bit in the page tables, and because it honors the USER_DS vs KERNEL_DS distinction that we shouldn't care about in GUP. And too few, because it doesn't do the 'canonical' check on the address on x86-64, since the TLB will do that for us. So instead of using a function that isn't meant for this, and does something else and much more complicated, just do the real rules: we don't want the range to overflow, and on x86-64, we want it to be a canonical low address (on 32-bit, all addresses are canonical). Acked-by: Ingo Molnar <mingo@elte.hu> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-07-30x86, setup (2.6.30-stable) fix 80x34 and 80x60 console modesMarc Aurele La France
Note: this is not in upstream since upstream is not affected due to the new "BIOS glovebox" subsystem. As coded, most INT10 calls in video-vga.c allow the compiler to assume EAX remains unchanged across them, which is not always the case. This triggers an optimisation issue that causes vga_set_vertical_end() to be called with an incorrect number of scanlines. Fix this by beefing up the asm constraints on these calls. Reported-by: Marc Aurele La France <tsi@xfree86.org> Signed-off-by: Marc Aurele La France <tsi@xfree86.org> Acked-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-07-30x86-64: Fix bad_srat() to clear all stateAndi Kleen
commit 429b2b319af3987e808c18f6b81313104caf782c upstream. Need to clear both nodes and nodes_add state for start/end. Signed-off-by: Andi Kleen <ak@linux.intel.com> LKML-Reference: <20090718065657.GA2898@basil.fritz.box> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-07-30x86: Add quirk for Intel DG45ID board to avoid low memory corruptionAlexey Fisher
commit 6aa542a694dc9ea4344a8a590d2628c33d1b9431 upstream. AMI BIOS with low memory corruption was found on Intel DG45ID board (Bug 13710). Add this board to the blacklist - in the (somewhat optimistic) hope of future boards/BIOSes from Intel not having this bug. Also see: http://bugzilla.kernel.org/show_bug.cgi?id=13736 Signed-off-by: Alexey Fisher <bug-track@fisher-privat.net> Cc: ykzhao <yakui.zhao@intel.com> Cc: alan@lxorguk.ukuu.org.uk Cc: <stable@kernel.org> LKML-Reference: <1247660169-4503-1-git-send-email-bug-track@fisher-privat.net> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-07-30x86: Fix movq immediate operand constraints in uaccess.hH. Peter Anvin
commit ebe119cd0929df4878f758ebf880cb435e4dcaaf upstream. The movq instruction, generated by __put_user_asm() when used for 64-bit data, takes a sign-extended immediate ("e") not a zero-extended immediate ("Z"). Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-07-30x86: Fix movq immediate operand constraints in uaccess_64.hUros Bizjak
commit 155b73529583c38f30fd394d692b15a893960782 upstream. arch/x86/include/asm/uaccess_64.h uses wrong asm operand constraint ("ir") for movq insn. Since movq sign-extends its immediate operand, "er" constraint should be used instead. Attached patch changes all uses of __put_user_asm in uaccess_64.h to use "er" when "q" insn suffix is involved. Patch was compile tested on x86_64 with defconfig. Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-07-30x86: geode: Mark mfgpt irq IRQF_TIMER to prevent resume failureThomas Gleixner
commit d6c585a4342a2ff627a29f9aea77c5ed4cd76023 upstream. Timer interrupts are excluded from being disabled during suspend. The clock events code manages the disabling of clock events on its own because the timer interrupt needs to be functional before the resume code reenables the device interrupts. The mfgpt timer request its interrupt without setting the IRQF_TIMER flag so suspend_device_irqs() disables it as well which results in a fatal resume failure. Adding IRQF_TIMER to the interupt flags when requesting the mrgpt timer interrupt solves the problem. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> LKML-Reference: <new-submission> Cc: Andres Salomon <dilinger@debian.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-07-30x86/pci: insert ioapic resource before assigning unassigned resourcesYinghai Lu
commit 857fdc53a0a90c3ba7fcf5b1fb4c7a62ae03cf82 upstream. Stephen reported that his DL585 G2 needed noapic after 2.6.22 (?) Dann bisected it down to: commit 30a18d6c3f1e774de656ebd8ff219d53e2ba4029 Date: Tue Feb 19 03:21:20 2008 -0800 x86: multi pci root bus with different io resource range, on 64-bit It turns out that: 1. that AMD-based systems have two HT chains. 2. BIOS doesn't allocate resources for BAR 6 of devices under 8132 etc 3. that multi-peer-root patch will try to split root resources to peer root resources according to PCI conf of NB 4. PCI core assigns unassigned resources, but they overlap with BARs that are used by ioapic addr of io4 and 8132. The reason: at that point ioapic address are not inserted yet. Solution is to insert ioapic resources into the tree a bit earlier. Reported-by: Stephen Frost <sfrost@snowman.net> Reported-and-Tested-by: dann frazier <dannf@hp.com> Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-07-30x86: Fix fixmap page order for FIX_TEXT_POKE0,1Mathieu Desnoyers
commit 12b9d7ccb841805e347fec8f733f368f43ddba40 upstream. Masami reported: > Since the fixmap pages are assigned higher address to lower, > text_poke() has to use it with inverted order (FIX_TEXT_POKE1 > to FIX_TEXT_POKE0). I prefer to just invert the order of the fixmap declaration. It's simpler and more straightforward. Backward fixmaps seems to be used by both x86 32 and 64. It's really rare but a nasty bug, because it only hurts when instructions to patch are crossing a page boundary. If this happens, the fixmap write accesses will spill on the following fixmap, which may very well crash the system. And this does not crash the system, it could leave illegal instructions in place. Thanks Masami for finding this. It seems to have crept into the 2.6.30-rc series, so this calls for a -stable inclusion. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Acked-by: Masami Hiramatsu <mhiramat@redhat.com> LKML-Reference: <20090701213722.GH19926@Krystal> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-07-30x86: Fix fixmap orderingJan Beulich
commit 789d03f584484af85dbdc64935270c8e45f36ef7 upstream. The merge of the 32- and 64-bit fixmap headers made a latent bug on x86-64 a real one: with the right config settings it is possible for FIX_OHCI1394_BASE to overlap the FIX_BTMAP_* range. Signed-off-by: Jan Beulich <jbeulich@novell.com> LKML-Reference: <4A4A0A8702000078000082E8@vpn.id2.novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-07-19Fix pci_unmap_addr() et al on i386.David Woodhouse
commit 788d84bba47ea3eb377f7a3ae4fd1ee84b84877b upstream. We can run a 32-bit kernel on boxes with an IOMMU, so we need pci_unmap_addr() etc. to work -- without it, drivers will leak mappings. To be honest, this whole thing looks like it's more pain than it's worth; I'm half inclined to remove the no-op #else case altogether. But this is the minimal fix, which just does the right thing if CONFIG_DMAR is set. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-07-02KVM: x86: silence preempt warning on kvm_write_guest_timeMatt T. Yourst
commit 2dea4c84bc936731668b5a7a9fba5b436a422668 upstream. This issue just appeared in kvm-84 when running on 2.6.28.7 (x86-64) with PREEMPT enabled. We're getting syslog warnings like this many (but not all) times qemu tells KVM to run the VCPU: BUG: using smp_processor_id() in preemptible [00000000] code: qemu-system-x86/28938 caller is kvm_arch_vcpu_ioctl_run+0x5d1/0xc70 [kvm] Pid: 28938, comm: qemu-system-x86 2.6.28.7-mtyrel-64bit Call Trace: debug_smp_processor_id+0xf7/0x100 kvm_arch_vcpu_ioctl_run+0x5d1/0xc70 [kvm] ? __wake_up+0x4e/0x70 ? wake_futex+0x27/0x40 kvm_vcpu_ioctl+0x2e9/0x5a0 [kvm] enqueue_hrtimer+0x8a/0x110 _spin_unlock_irqrestore+0x27/0x50 vfs_ioctl+0x31/0xa0 do_vfs_ioctl+0x74/0x480 sys_futex+0xb4/0x140 sys_ioctl+0x99/0xa0 system_call_fastpath+0x16/0x1b As it turns out, the call trace is messed up due to gcc's inlining, but I isolated the problem anyway: kvm_write_guest_time() is being used in a non-thread-safe manner on preemptable kernels. Basically kvm_write_guest_time()'s body needs to be surrounded by preempt_disable() and preempt_enable(), since the kernel won't let us query any per-CPU data (indirectly using smp_processor_id()) without preemption disabled. The attached patch fixes this issue by disabling preemption inside kvm_write_guest_time(). [marcelo: surround only __get_cpu_var calls since the warning is harmless] Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-07-02x86: Set cpu_llc_id on AMD CPUsAndreas Herrmann
commit 99bd0c0fc4b04da54cb311953ef9489931c19c63 upstream. This counts when building sched domains in case NUMA information is not available. ( See cpu_coregroup_mask() which uses llc_shared_map which in turn is created based on cpu_llc_id. ) Currently Linux builds domains as follows: (example from a dual socket quad-core system) CPU0 attaching sched-domain: domain 0: span 0-7 level CPU groups: 0 1 2 3 4 5 6 7 ... CPU7 attaching sched-domain: domain 0: span 0-7 level CPU groups: 7 0 1 2 3 4 5 6 Ever since that is borked for multi-core AMD CPU systems. This patch fixes that and now we get a proper: CPU0 attaching sched-domain: domain 0: span 0-3 level MC groups: 0 1 2 3 domain 1: span 0-7 level CPU groups: 0-3 4-7 ... CPU7 attaching sched-domain: domain 0: span 4-7 level MC groups: 7 4 5 6 domain 1: span 0-7 level CPU groups: 4-7 0-3 This allows scheduler to assign tasks to cores on different sockets (i.e. that don't share last level cache) for performance reasons. Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> LKML-Reference: <20090619085909.GJ5218@alberich.amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-07-02x86: Fix non-lazy GS handling in sys_vm86()Lubomir Rintel
commit 3aa6b186f86c5d06d6d92d14311ffed51f091f40 upstream. This fixes a stack corruption panic or null dereference oops due to a bad GS in resume_userspace() when returning from sys_vm86() and calling lockdep_sys_exit(). Only a problem when CONFIG_LOCKDEP and CONFIG_CC_STACKPROTECTOR enabled. Signed-off-by: Lubomir Rintel <lkundrak@v3.sk> Cc: H. Peter Anvin <hpa@zytor.com> LKML-Reference: <1244384628.2323.4.camel@bimbo> Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: Steven Noonan <steven@uplinklabs.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-07-02crypto: aes-ni - Fix cbc mode IV savingHuang Ying
commit e6efaa025384f86a18814a6b9f4e5d54484ab9ff upstream. Original implementation of aesni_cbc_dec do not save IV if input length % 4 == 0. This will make decryption of next block failed. Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>