aboutsummaryrefslogtreecommitdiff
path: root/arch/x86
AgeCommit message (Collapse)Author
2010-12-09x86: Ignore trap bits on single step exceptionsFrederic Weisbecker
commit 6c0aca288e726405b01dacb12cac556454d34b2a upstream. When a single step exception fires, the trap bits, used to signal hardware breakpoints, are in a random state. These trap bits might be set if another exception will follow, like a breakpoint in the next instruction, or a watchpoint in the previous one. Or there can be any junk there. So if we handle these trap bits during the single step exception, we are going to handle an exception twice, or we are going to handle junk. Just ignore them in this case. This fixes https://bugzilla.kernel.org/show_bug.cgi?id=21332 Reported-by: Michael Stefaniuc <mstefani@redhat.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: Maciej Rutecki <maciej.rutecki@gmail.com> Cc: Alexandre Julliard <julliard@winehq.org> Cc: Jason Wessel <jason.wessel@windriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-12-09acpi-cpufreq: fix a memleak when unloading driverZhang Rui
commit dab5fff14df2cd16eb1ad4c02e83915e1063fece upstream. We didn't free per_cpu(acfreq_data, cpu)->freq_table when acpi_freq driver is unloaded. Resulting in the following messages in /sys/kernel/debug/kmemleak: unreferenced object 0xf6450e80 (size 64): comm "modprobe", pid 1066, jiffies 4294677317 (age 19290.453s) hex dump (first 32 bytes): 00 00 00 00 e8 a2 24 00 01 00 00 00 00 9f 24 00 ......$.......$. 02 00 00 00 00 6a 18 00 03 00 00 00 00 35 0c 00 .....j.......5.. backtrace: [<c123ba97>] kmemleak_alloc+0x27/0x50 [<c109f96f>] __kmalloc+0xcf/0x110 [<f9da97ee>] acpi_cpufreq_cpu_init+0x1ee/0x4e4 [acpi_cpufreq] [<c11cd8d2>] cpufreq_add_dev+0x142/0x3a0 [<c11920b7>] sysdev_driver_register+0x97/0x110 [<c11cce56>] cpufreq_register_driver+0x86/0x140 [<f9dad080>] 0xf9dad080 [<c1001130>] do_one_initcall+0x30/0x160 [<c10626e9>] sys_init_module+0x99/0x1e0 [<c1002d97>] sysenter_do_call+0x12/0x26 [<ffffffff>] 0xffffffff https://bugzilla.kernel.org/show_bug.cgi?id=15807#c21 Tested-by: Toralf Forster <toralf.foerster@gmx.de> Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-12-09KVM: VMX: Fix host userspace gsbase corruptionAvi Kivity
commit c8770e7ba63bb5dd8fe5f9d251275a8fa717fb78 upstream. We now use load_gs_index() to load gs safely; unfortunately this also changes MSR_KERNEL_GS_BASE, which we managed separately. This resulted in confusion and breakage running 32-bit host userspace on a 64-bit kernel. Fix by - saving guest MSR_KERNEL_GS_BASE before we we reload the host's gs - doing the host save/load unconditionally, instead of only when in guest long mode Things can be cleaned up further, but this is the minmal fix for now. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-12-09KVM: Correct ordering of ldt reload wrt fs/gs reloadAvi Kivity
commit 0a77fe4c188e25917799f2356d4aa5e6d80c39a2 upstream. If fs or gs refer to the ldt, they must be reloaded after the ldt. Reorder the code to that effect. Userspace code that uses the ldt with kvm is nonexistent, so this doesn't fix a user-visible bug. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-12-09KVM: x86: fix information leak to userlandVasiliy Kulikov
commit 97e69aa62f8b5d338d6cff49be09e37cc1262838 upstream. Structures kvm_vcpu_events, kvm_debugregs, kvm_pit_state2 and kvm_clock_data are copied to userland with some padding and reserved fields unitialized. It leads to leaking of contents of kernel stack memory. We have to initialize them to zero. In patch v1 Jan Kiszka suggested to fill reserved fields with zeros instead of memset'ting the whole struct. It makes sense as these fields are explicitly marked as padding. No more fields need zeroing. Signed-off-by: Vasiliy Kulikov <segooon@gmail.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-12-09KVM: Write protect memory after slot swapMichael S. Tsirkin
commit edde99ce05290e50ce0b3495d209e54e6349ab47 upstream. I have observed the following bug trigger: 1. userspace calls GET_DIRTY_LOG 2. kvm_mmu_slot_remove_write_access is called and makes a page ro 3. page fault happens and makes the page writeable fault is logged in the bitmap appropriately 4. kvm_vm_ioctl_get_dirty_log swaps slot pointers a lot of time passes 5. guest writes into the page 6. userspace calls GET_DIRTY_LOG At point (5), bitmap is clean and page is writeable, thus, guest modification of memory is not logged and GET_DIRTY_LOG returns an empty bitmap. The rule is that all pages are either dirty in the current bitmap, or write-protected, which is violated here. It seems that just moving kvm_mmu_slot_remove_write_access down to after the slot pointer swap should fix this bug. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-12-09xen: don't bother to stop other cpus on shutdown/rebootJeremy Fitzhardinge
commit 31e323cca9d5c8afd372976c35a5d46192f540d1 upstream. Xen will shoot all the VCPUs when we do a shutdown hypercall, so there's no need to do it manually. In any case it will fail because all the IPI irqs have been pulled down by this point, so the cross-CPU calls will simply hang forever. Until change 76fac077db6b34e2c6383a7b4f3f4f7b7d06d8ce the function calls were not synchronously waited for, so this wasn't apparent. However after that change the calls became synchronous leading to a hang on shutdown on multi-VCPU guests. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Alok Kataria <akataria@vmware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-11-22KVM: SVM: Restore correct registers after sel_cr0 intercept emulationJoerg Roedel
commit cda0008299a06f0d7218c6037c3c02d7a865e954 upstream. This patch implements restoring of the correct rip, rsp, and rax after the svm emulation in KVM injected a selective_cr0 write intercept into the guest hypervisor. The problem was that the vmexit is emulated in the instruction emulation which later commits the registers right after the write-cr0 instruction. So the l1 guest will continue to run with the l2 rip, rsp and rax resulting in unpredictable behavior. This patch is not the final word, it is just an easy patch to fix the issue. The real fix will be done when the instruction emulator is made aware of nested virtualization. Until this is done this patch fixes the issue and provides an easy way to fix this in -stable too. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-11-22KVM: X86: Report SVM bit to userspace only when supportedJoerg Roedel
commit 4c62a2dc92518c5adf434df8e5c2283c6762672a upstream. This patch fixes a bug in KVM where it _always_ reports the support of the SVM feature to userspace. But KVM only supports SVM on AMD hardware and only when it is enabled in the kernel module. This patch fixes the wrong reporting. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-11-22x86, vm86: Fix preemption bug for int1 debug and int3 breakpoint handlers.Bart Oldeman
commit 6554287b1de0448f1e02e200d02b43914e997d15 upstream. Impact: fix kernel bug such as: BUG: scheduling while atomic: dosemu.bin/19680/0x00000004 See also Ubuntu bug 455067 at https://bugs.launchpad.net/ubuntu/+source/linux/+bug/455067 Commits 4915a35e35a037254550a2ba9f367a812bc37d40 ("Use preempt_conditional_sti/cli in do_int3, like on x86_64.") and 3d2a71a596bd9c761c8487a2178e95f8a61da083 ("x86, traps: converge do_debug handlers") started disabling preemption in int1 and int3 handlers on i386. The problem with vm86 is that the call to handle_vm86_trap() may jump straight to entry_32.S and never returns so preempt is never enabled again, and there is an imbalance in the preempt count. Commit be716615fe596ee117292dc615e95f707fb67fd1 ("x86, vm86: fix preemption bug"), which was later (accidentally?) reverted by commit 08d68323d1f0c34452e614263b212ca556dae47f ("hw-breakpoints: modifying generic debug exception to use thread-specific debug registers") fixed the problem for debug exceptions but not for breakpoints. There are three solutions to this problem. 1. Reenable preemption before calling handle_vm86_trap(). This was the approach that was later reverted. 2. Do not disable preemption for i386 in breakpoint and debug handlers. This was the situation before October 2008. As far as I understand preemption only needs to be disabled on x86_64 because a seperate stack is used, but it's nice to have things work the same way on i386 and x86_64. 3. Let handle_vm86_trap() return instead of jumping to assembly code. By setting a flag in _TIF_WORK_MASK, either TIF_IRET or TIF_NOTIFY_RESUME, the code in entry_32.S is instructed to return to 32 bit mode from V86 mode. The logic in entry_32.S was already present to handle signals. (I chose TIF_IRET because it's slightly more efficient in do_notify_resume() in signal.c, but in fact TIF_IRET can probably be replaced by TIF_NOTIFY_RESUME everywhere.) I'm submitting approach 3, because I believe it is the most elegant and prevents future confusion. Still, an obvious preempt_conditional_cli(regs); is necessary in traps.c to correct the bug. [ hpa: This is technically a regression, but because: 1. the regression is so old, 2. the patch seems relatively high risk, justifying more testing, and 3. we're late in the 2.6.36-rc cycle, I'm queuing it up for the 2.6.37 merge window. It might, however, justify as a -stable backport at a latter time, hence Cc: stable. ] Signed-off-by: Bart Oldeman <bartoldeman@users.sourceforge.net> LKML-Reference: <alpine.DEB.2.00.1009231312330.4732@localhost.localdomain> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: K.Prasad <prasad@linux.vnet.ibm.com> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: Alexander van Heukelum <heukelum@fastmail.fm> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-11-22x86, kdump: Change copy_oldmem_page() to use cached addressingCliff Wickman
commit 37a2f9f30a360fb03522d15c85c78265ccd80287 upstream. The copy of /proc/vmcore to a user buffer proceeds much faster if the kernel addresses memory as cached. With this patch we have seen an increase in transfer rate from less than 15MB/s to 80-460MB/s, depending on size of the transfer. This makes a big difference in time needed to save a system dump. Signed-off-by: Cliff Wickman <cpw@sgi.com> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Cc: kexec@lists.infradead.org LKML-Reference: <E1OtMLz-0001yp-Ia@eag09.americas.sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-11-22x86, intr-remap: Set redirection hint in the IRTESuresh Siddha
commit 75e3cfbed6f71a8f151dc6e413b6ce3c390030cb upstream. Currently the redirection hint in the interrupt-remapping table entry is set to 0, which means the remapped interrupt is directed to the processors listed in the destination. So in logical flat mode in the presence of intr-remapping, this results in a single interrupt multi-casted to multiple cpu's as specified by the destination bit mask. But what we really want is to send that interrupt to one of the cpus based on the lowest priority delivery mode. Set the redirection hint in the IRTE to '1' to indicate that we want the remapped interrupt to be directed to only one of the processors listed in the destination. This fixes the issue of same interrupt getting delivered to multiple cpu's in the logical flat mode in the presence of interrupt-remapping. While there is no functional issue observed with this behavior, this will impact performance of such configurations (<=8 cpu's using logical flat mode in the presence of interrupt-remapping) Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> LKML-Reference: <20100827181049.013051492@sbsiddha-MOBL3.sc.intel.com> Cc: Weidong Han <weidong.han@intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-11-22x86, mtrr: Assume SYS_CFG[Tom2ForceMemTypeWB] exists on all future AMD CPUsAndreas Herrmann
commit 3fdbf004c1706480a7c7fac3c9d836fa6df20d7d upstream. Instead of adapting the CPU family check in amd_special_default_mtrr() for each new CPU family assume that all new AMD CPUs support the necessary bits in SYS_CFG MSR. Tom2Enabled is architectural (defined in APM Vol.2). Tom2ForceMemTypeWB is defined in all BKDGs starting with K8 NPT. In pre K8-NPT BKDG this bit is reserved (read as zero). W/o this adaption Linux would unnecessarily complain about bad MTRR settings on every new AMD CPU family, e.g. [ 0.000000] WARNING: BIOS bug: CPU MTRRs don't cover all of memory, losing 4863MB of RAM. Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> LKML-Reference: <20100930123235.GB20545@loge.amd.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-11-22x86, olpc: Don't retry EC commands foreverPaul Fox
commit 286e5b97eb22baab9d9a41ca76c6b933a484252c upstream. Avoids a potential infinite loop. It was observed once, during an EC hacking/debugging session - not in regular operation. Signed-off-by: Daniel Drake <dsd@laptop.org> Cc: dilinger@queued.net Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-11-22x86, kexec: Make sure to stop all CPUs before exiting the kernelAlok Kataria
commit 76fac077db6b34e2c6383a7b4f3f4f7b7d06d8ce upstream. x86 smp_ops now has a new op, stop_other_cpus which takes a parameter "wait" this allows the caller to specify if it wants to stop until all the cpus have processed the stop IPI. This is required specifically for the kexec case where we should wait for all the cpus to be stopped before starting the new kernel. We now wait for the cpus to stop in all cases except for panic/kdump where we expect things to be broken and we are doing our best to make things work anyway. This patch fixes a legitimate regression, which was introduced during 2.6.30, by commit id 4ef702c10b5df18ab04921fc252c26421d4d6c75. Signed-off-by: Alok N Kataria <akataria@vmware.com> LKML-Reference: <1286833028.1372.20.camel@ank32.eng.vmware.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-11-22x86, mrst: A function in a header file needs to be marked "inline"H. Peter Anvin
commit 55572b293b3a5929e8c54bc91d14ae6264186bf6 upstream. A function in a header file needs to be explicitly marked "inline", or gcc will complain if it is not used. Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com> LKML-Reference: <1274295685-6774-3-git-send-email-jacob.jun.pan@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-11-22x86, cpu: Fix renamed, not-yet-shipping AMD CPUID feature bitAndre Przywara
commit 7ef8aa72ab176e0288f363d1247079732c5d5792 upstream. The AMD SSE5 feature set as-it has been replaced by some extensions to the AVX instruction set. Thus the bit formerly advertised as SSE5 is re-used for one of these extensions (XOP). Although this changes the /proc/cpuinfo output, it is not user visible, as there are no CPUs (yet) having this feature. To avoid confusion this should be added to the stable series, too. Signed-off-by: Andre Przywara <andre.przywara@amd.com> LKML-Reference: <1283778860-26843-2-git-send-email-andre.przywara@amd.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-11-22mm, x86: Saving vmcore with non-lazy freeing of vmasCliff Wickman
commit 3ee48b6af49cf534ca2f481ecc484b156a41451d upstream. During the reading of /proc/vmcore the kernel is doing ioremap()/iounmap() repeatedly. And the buildup of un-flushed vm_area_struct's is causing a great deal of overhead. (rb_next() is chewing up most of that time). This solution is to provide function set_iounmap_nonlazy(). It causes a subsequent call to iounmap() to immediately purge the vma area (with try_purge_vmap_area_lazy()). With this patch we have seen the time for writing a 250MB compressed dump drop from 71 seconds to 44 seconds. Signed-off-by: Cliff Wickman <cpw@sgi.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: kexec@lists.infradead.org LKML-Reference: <E1OwHZ4-0005WK-Tw@eag09.americas.sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-11-22perf_events: Fix bogus AMD64 generic TLB eventsStephane Eranian
commit ba0cef3d149ce4db293c572bf36ed352b11ce7b9 upstream. PERF_COUNT_HW_CACHE_DTLB:READ:MISS had a bogus umask value of 0 which counts nothing. Needed to be 0x7 (to count all possibilities). PERF_COUNT_HW_CACHE_ITLB:READ:MISS had a bogus umask value of 0 which counts nothing. Needed to be 0x3 (to count all possibilities). Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Robert Richter <robert.richter@amd.com> LKML-Reference: <4cb85478.41e9d80a.44e2.3f00@mx.google.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-10-19KVM: Fix fs/gs reload oops with invalid ldtAvi Kivity
kvm reloads the host's fs and gs blindly, however the underlying segment descriptors may be invalid due to the user modifying the ldt after loading them. Fix by using the safe accessors (loadsegment() and load_gs_index()) instead of home grown unsafe versions. This is CVE-2010-3698. KVM-Stable-Tag. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-10-14Don't dump task struct in a.out core-dumpsLinus Torvalds
akiphie points out that a.out core-dumps have that odd task struct dumping that was never used and was never really a good idea (it goes back into the mists of history, probably the original core-dumping code). Just remove it. Also do the access_ok() check on dump_write(). It probably doesn't matter (since normal filesystems all seem to do it anyway), but he points out that it's normally done by the VFS layer, so ... [ I suspect that we should possibly do "vfs_write()" instead of calling ->write directly. That also does the whole fsnotify and write statistics thing, which may or may not be a good idea. ] And just to be anal, do this all for the x86-64 32-bit a.out emulation code too, even though it's not enabled (and won't currently even compile) Reported-by: akiphie <akiphie@lavabit.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-13Merge branch 'x86-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, numa: For each node, register the memory blocks actually used x86, AMD, MCE thresholding: Fix the MCi_MISCj iteration order x86, mce, therm_throt.c: Fix missing curly braces in error handling logic
2010-10-11x86, numa: For each node, register the memory blocks actually usedYinghai Lu
Russ reported SGI UV is broken recently. He said: | The SRAT table shows that memory range is spread over two nodes. | | SRAT: Node 0 PXM 0 100000000-800000000 | SRAT: Node 1 PXM 1 800000000-1000000000 | SRAT: Node 0 PXM 0 1000000000-1080000000 | |Previously, the kernel early_node_map[] would show three entries |with the proper node. | |[ 0.000000] 0: 0x00100000 -> 0x00800000 |[ 0.000000] 1: 0x00800000 -> 0x01000000 |[ 0.000000] 0: 0x01000000 -> 0x01080000 | |The problem is recent community kernel early_node_map[] shows |only two entries with the node 0 entry overlapping the node 1 |entry. | | 0: 0x00100000 -> 0x01080000 | 1: 0x00800000 -> 0x01000000 After looking at the changelog, Found out that it has been broken for a while by following commit |commit 8716273caef7f55f39fe4fc6c69c5f9f197f41f1 |Author: David Rientjes <rientjes@google.com> |Date: Fri Sep 25 15:20:04 2009 -0700 | | x86: Export srat physical topology Before that commit, register_active_regions() is called for every SRAT memory entry right away. Use nodememblk_range[] instead of nodes[] in order to make sure we capture the actual memory blocks registered with each node. nodes[] contains an extended range which spans all memory regions associated with a node, but that does not mean that all the memory in between are included. Reported-by: Russ Anderson <rja@sgi.com> Tested-by: Russ Anderson <rja@sgi.com> Signed-off-by: Yinghai Lu <yinghai@kernel.org> LKML-Reference: <4CB27BDF.5000800@kernel.org> Acked-by: David Rientjes <rientjes@google.com> Cc: <stable@kernel.org> 2.6.33 .34 .35 .36 Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2010-10-11KVM: x86: Move TSC reset out of vmcb_initZachary Amsden
The VMCB is reset whenever we receive a startup IPI, so Linux is setting TSC back to zero happens very late in the boot process and destabilizing the TSC. Instead, just set TSC to zero once at VCPU creation time. Why the separate patch? So git-bisect is your friend. Signed-off-by: Zachary Amsden <zamsden@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-10-11KVM: x86: Fix SVM VMCB resetZachary Amsden
On reset, VMCB TSC should be set to zero. Instead, code was setting tsc_offset to zero, which passes through the underlying TSC. Signed-off-by: Zachary Amsden <zamsden@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-10-11x86, AMD, MCE thresholding: Fix the MCi_MISCj iteration orderBorislav Petkov
This fixes possible cases of not collecting valid error info in the MCE error thresholding groups on F10h hardware. The current code contains a subtle problem of checking only the Valid bit of MSR0000_0413 (which is MC4_MISC0 - DRAM thresholding group) in its first iteration and breaking out if the bit is cleared. But (!), this MSR contains an offset value, BlkPtr[31:24], which points to the remaining MSRs in this thresholding group which might contain valid information too. But if we bail out only after we checked the valid bit in the first MSR and not the block pointer too, we miss that other information. The thing is, MC4_MISC0[BlkPtr] is not predicated on MCi_STATUS[MiscV] or MC4_MISC0[Valid] and should be checked prior to iterating over the MCI_MISCj thresholding group, irrespective of the MC4_MISC0[Valid] setting. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Cc: <stable@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-10-08x86, mce, therm_throt.c: Fix missing curly braces in error handling logicJin Dongming
When the feature PTS is not supported by CPU, the sysfile package_power_limit_count for package should not be generated. This patch is used for fixing missing { and }. The patch is not complete as there are other error handling problems in this function - but that can wait until the merge window. Signed-off-by: Jin Dongming <jin.dongming@np.css.fujitsu.com> Reviewed-by: Fenghua Yu <fenghua.yu@initel.com> Acked-by: Jean Delvare <khali@linux-fr.org> Cc: Brown Len <len.brown@intel.com> Cc: Guenter Roeck <guenter.roeck@ericsson.com> Cc: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Cc: lm-sensors@lm-sensors.org <lm-sensors@lm-sensors.org> LKML-Reference: <4C7625D1.4060201@np.css.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-10-06Merge branch 'v2.6.36-rc6-urgent-fixes' of ↵Linus Torvalds
git://xenbits.xen.org/people/sstabellini/linux-pvhvm * 'v2.6.36-rc6-urgent-fixes' of git://xenbits.xen.org/people/sstabellini/linux-pvhvm: xen: do not initialize PV timers on HVM if !xen_have_vector_callback xen: do not set xenstored_ready before xenbus_probe on hvm
2010-10-05Merge branch 'perf-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: perf trace scripting: Fix extern struct definitions perf ui hist browser: Fix segfault on 'a' for annotate perf tools: Fix build breakage perf, x86: Handle in flight NMIs on P4 platform oprofile, ARM: Release resources on failure oprofile: Add Support for Intel CPU Family 6 / Model 29
2010-10-05modules: Fix module_bug_list list corruption raceLinus Torvalds
With all the recent module loading cleanups, we've minimized the code that sits under module_mutex, fixing various deadlocks and making it possible to do most of the module loading in parallel. However, that whole conversion totally missed the rather obscure code that adds a new module to the list for BUG() handling. That code was doubly obscure because (a) the code itself lives in lib/bugs.c (for dubious reasons) and (b) it gets called from the architecture-specific "module_finalize()" rather than from generic code. Calling it from arch-specific code makes no sense what-so-ever to begin with, and is now actively wrong since that code isn't protected by the module loading lock any more. So this commit moves the "module_bug_{finalize,cleanup}()" calls away from the arch-specific code, and into the generic code - and in the process protects it with the module_mutex so that the list operations are now safe. Future fixups: - move the module list handling code into kernel/module.c where it belongs. - get rid of 'module_bug_list' and just use the regular list of modules (called 'modules' - imagine that) that we already create and maintain for other reasons. Reported-and-tested-by: Thomas Gleixner <tglx@linutronix.de> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Adrian Bunk <bunk@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-05xen: do not initialize PV timers on HVM if !xen_have_vector_callbackStefano Stabellini
if !xen_have_vector_callback do not initialize PV timer unconditionally because we still don't know how many cpus are available and if there is more than one we won't be able to receive the timer interrupts on cpu > 0. This patch fixes an hang at boot when Xen does not support vector callbacks and the guest has multiple vcpus. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Jeremy Fitzhardinge <jeremy@goop.org>
2010-10-04Merge branch 'fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq: [CPUFREQ] Fix memory leaks in pcc_cpufreq_do_osc [CPUFREQ] acpi-cpufreq: add missing __percpu markup
2010-10-01Merge branch 'x86-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, hpet: Fix bogus error check in hpet_assign_irq() x86, irq: Plug memory leak in sparse irq x86, cpu: After uncapping CPUID, re-run CPU feature detection
2010-09-30x86, hpet: Fix bogus error check in hpet_assign_irq()Thomas Gleixner
create_irq() returns -1 if the interrupt allocation failed, but the code checks for irq == 0. Use create_irq_nr() instead. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Venkatesh Pallipadi <venki@google.com> LKML-Reference: <alpine.LFD.2.00.1009282310360.2416@localhost6.localdomain6> Cc: stable@kernel.org Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2010-09-30x86, irq: Plug memory leak in sparse irqThomas Gleixner
free_irq_cfg() is not freeing the cpumask_vars in irq_cfg. Fixing this triggers a use after free caused by the fact that copying struct irq_cfg is done with memcpy, which copies the pointer not the cpumask. Fix both places. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Yinghai Lu <yhlu.kernel@gmail.com> LKML-Reference: <alpine.LFD.2.00.1009282052570.2416@localhost6.localdomain6> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@kernel.org Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2010-09-30[CPUFREQ] Fix memory leaks in pcc_cpufreq_do_oscPekka Enberg
If acpi_evaluate_object() function call doesn't fail, we must kfree() output.buffer before returning from pcc_cpufreq_do_osc(). Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi> Acked-by: David Rientjes <rientjes@google.com> Signed-off-by: Dave Jones <davej@redhat.com>
2010-09-30[CPUFREQ] acpi-cpufreq: add missing __percpu markupNamhyung Kim
acpi_perf_data is a percpu pointer but was missing __percpu markup. Add it. Signed-off-by: Namhyung Kim <namhyung@gmail.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Dave Jones <davej@redhat.com>
2010-09-30perf, x86: Handle in flight NMIs on P4 platformCyrill Gorcunov
Stephane reported we've forgot to guard the P4 platform against spurious in-flight performance IRQs. Fix it. This fixes potential spurious 'dazed and confused' NMI messages. Reported-by: Stephane Eranian <eranian@google.com> Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: Don Zickus <dzickus@redhat.com> Cc: fweisbec@gmail.com Cc: peterz@infradead.org Cc: Robert Richter <robert.richter@amd.com> Cc: Lin Ming <ming.m.lin@intel.com> LKML-Reference: <1285815698-4298-1-git-send-email-dzickus@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-09-28ACPI: add missing __percpu markup in arch/x86/kernel/acpi/cstate.cNamhyung Kim
cpu_cstate_entry is a percpu pointer but was missing __percpu markup. Signed-off-by: Namhyung Kim <namhyung@gmail.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Len Brown <len.brown@intel.com>
2010-09-28x86, cpu: After uncapping CPUID, re-run CPU feature detectionH. Peter Anvin
After uncapping the CPUID level, we need to also re-run the CPU feature detection code. This resolves kernel bugzilla 16322. Reported-by: boris64 <bugzilla.kernel.org@boris64.net> Cc: <stable@kernel.org> v2.6.29..2.6.35 LKML-Reference: <tip-@git.kernel.org> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2010-09-27Merge branch 'x86/urgent' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: Avoid 'constant_test_bit()' misoptimization due to cast to non-volatile
2010-09-27Merge branch 'x86-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86/amd-iommu: Fix rounding-bug in __unmap_single x86/amd-iommu: Work around S3 BIOS bug x86/amd-iommu: Set iommu configuration flags in enable-loop x86, setup: Fix earlyprintk=serial,0x3f8,115200 x86, setup: Fix earlyprintk=serial,ttyS0,115200
2010-09-27Merge branch 'perf-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: perf, x86: Catch spurious interrupts after disabling counters tracing/x86: Don't use mcount in kvmclock.c tracing/x86: Don't use mcount in pvclock.c
2010-09-27Merge branch 'urgent' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/rric/oprofile into perf/urgent
2010-09-26x86: Avoid 'constant_test_bit()' misoptimization due to cast to non-volatileAlexander Chumachenko
While debugging bit_spin_lock() hang, it was tracked down to gcc-4.4 misoptimization of non-inlined constant_test_bit() due to non-volatile addr when 'const volatile unsigned long *addr' cast to 'unsigned long *' with subsequent unconditional jump to pause (and not to the test) leading to hang. Compiling with gcc-4.3 or disabling CONFIG_OPTIMIZE_INLINING yields inlined constant_test_bit() and correct jump, thus working around the kernel bug. Other arches than asm-x86 may implement this slightly differently; 2.6.29 mitigates the misoptimization by changing the function prototype (commit c4295fbb6048d85f0b41c5ced5cbf63f6811c46c) but probably fixing the issue itself is better. Signed-off-by: Alexander Chumachenko <ledest@gmail.com> Signed-off-by: Michael Shigorin <mike@osdn.org.ua> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-09-24x86/hwmon: fix initialization of coretempJan Beulich
Using cpuid_eax() to determine feature availability on other than the current CPU is invalid. And feature availability should also be checked in the hotplug code path. Signed-off-by: Jan Beulich <jbeulich@novell.com> Cc: Rudolf Marek <r.marek@assembler.cz> Cc: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Guenter Roeck <guenter.roeck@ericsson.com>
2010-09-24perf, x86: Catch spurious interrupts after disabling countersRobert Richter
Some cpus still deliver spurious interrupts after disabling a counter. This caused 'undelivered NMI' messages. This patch fixes this. Introduced by: 4177c42: perf, x86: Try to handle unknown nmis with an enabled PMU Reported-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Robert Richter <robert.richter@amd.com> Cc: Don Zickus <dzickus@redhat.com> Cc: gorcunov@gmail.com <gorcunov@gmail.com> Cc: fweisbec@gmail.com <fweisbec@gmail.com> Cc: ying.huang@intel.com <ying.huang@intel.com> Cc: ming.m.lin@intel.com <ming.m.lin@intel.com> Cc: yinghai@kernel.org <yinghai@kernel.org> Cc: andi@firstfloor.org <andi@firstfloor.org> Cc: eranian@google.com <eranian@google.com> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <20100915162034.GO13563@erda.amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-09-24Merge branch 'amd-iommu/2.6.36' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/joro/linux-2.6-iommu into x86/urgent
2010-09-23x86/amd-iommu: Fix rounding-bug in __unmap_singleJoerg Roedel
In the __unmap_single function the dma_addr is rounded down to a page boundary before the dma pages are unmapped. The address is later also used to flush the TLB entries for that mapping. But without the offset into the dma page the amount of pages to flush might be miscalculated in the TLB flushing path. This patch fixes this bug by using the original address to flush the TLB. Cc: stable@kernel.org Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2010-09-23x86/amd-iommu: Work around S3 BIOS bugJoerg Roedel
This patch adds a workaround for an IOMMU BIOS problem to the AMD IOMMU driver. The result of the bug is that the IOMMU does not execute commands anymore when the system comes out of the S3 state resulting in system failure. The bug in the BIOS is that is does not restore certain hardware specific registers correctly. This workaround reads out the contents of these registers at boot time and restores them on resume from S3. The workaround is limited to the specific IOMMU chipset where this problem occurs. Cc: stable@kernel.org Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>