Age | Commit message (Collapse) | Author |
|
This reverts the following patch, which shouldn't have been applied
to the .32 stable tree as it causes problems.
commit 5fd4514bb351b5ecb0da3692fff70741e5ed200c upstream.
Set the PCI CLS early in the boot process to prevent
device failures. In pcibios_set_master use the new
pci_cache_line_size instead of a hard-coded value.
Signed-off-by: Carlos O'Donell <carlos@codesourcery.com>
Reviewed-by: Grant Grundler <grundler@google.com>
Signed-off-by: Kyle McMartin <kyle@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit f01487119dda3d9f58c9729c7361ecc50a61c188 upstream.
If host CPU is exposed to a guest the OSVW MSRs are not guaranteed
to be present and a GP fault occurs. Thus checking the feature flag is
essential.
Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
LKML-Reference: <20100427101348.GC4489@alberich.amd.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
environments
commit 7f284d3cc96e02468a42e045f77af11e5ff8b095 upstream.
When running a quest kernel on xen we get:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000038
IP: [<ffffffff8142f2fb>] cpuid4_cache_lookup_regs+0x2ca/0x3df
PGD 0
Oops: 0000 [#1] SMP
last sysfs file:
CPU 0
Modules linked in:
Pid: 0, comm: swapper Tainted: G W 2.6.34-rc3 #1 /HVM domU
RIP: 0010:[<ffffffff8142f2fb>] [<ffffffff8142f2fb>] cpuid4_cache_lookup_regs+0x
2ca/0x3df
RSP: 0018:ffff880002203e08 EFLAGS: 00010046
RAX: 0000000000000000 RBX: 0000000000000003 RCX: 0000000000000060
RDX: 0000000000000000 RSI: 0000000000000040 RDI: 0000000000000000
RBP: ffff880002203ed8 R08: 00000000000017c0 R09: ffff880002203e38
R10: ffff8800023d5d40 R11: ffffffff81a01e28 R12: ffff880187e6f5c0
R13: ffff880002203e34 R14: ffff880002203e58 R15: ffff880002203e68
FS: 0000000000000000(0000) GS:ffff880002200000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000038 CR3: 0000000001a3c000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process swapper (pid: 0, threadinfo ffffffff81a00000, task ffffffff81a44020)
Stack:
ffffffff810d7ecb ffff880002203e20 ffffffff81059140 ffff880002203e30
<0> ffffffff810d7ec9 0000000002203e40 000000000050d140 ffff880002203e70
<0> 0000000002008140 0000000000000086 ffff880040020140 ffffffff81068b8b
Call Trace:
<IRQ>
[<ffffffff810d7ecb>] ? sync_supers_timer_fn+0x0/0x1c
[<ffffffff81059140>] ? mod_timer+0x23/0x25
[<ffffffff810d7ec9>] ? arm_supers_timer+0x34/0x36
[<ffffffff81068b8b>] ? hrtimer_get_next_event+0xa7/0xc3
[<ffffffff81058e85>] ? get_next_timer_interrupt+0x19a/0x20d
[<ffffffff8142fa23>] get_cpu_leaves+0x5c/0x232
[<ffffffff8106a7b1>] ? sched_clock_local+0x1c/0x82
[<ffffffff8106a9a0>] ? sched_clock_tick+0x75/0x7a
[<ffffffff8107748c>] generic_smp_call_function_single_interrupt+0xae/0xd0
[<ffffffff8101f6ef>] smp_call_function_single_interrupt+0x18/0x27
[<ffffffff8100a773>] call_function_single_interrupt+0x13/0x20
<EOI>
[<ffffffff8143c468>] ? notifier_call_chain+0x14/0x63
[<ffffffff810295c6>] ? native_safe_halt+0xc/0xd
[<ffffffff810114eb>] ? default_idle+0x36/0x53
[<ffffffff81008c22>] cpu_idle+0xaa/0xe4
[<ffffffff81423a9a>] rest_init+0x7e/0x80
[<ffffffff81b10dd2>] start_kernel+0x40e/0x419
[<ffffffff81b102c8>] x86_64_start_reservations+0xb3/0xb7
[<ffffffff81b103c4>] x86_64_start_kernel+0xf8/0x107
Code: 14 d5 40 ff ae 81 8b 14 02 31 c0 3b 15 47 1c 8b 00 7d 0e 48 8b 05 36 1c 8b
00 48 63 d2 48 8b 04 d0 c7 85 5c ff ff ff 00 00 00 00 <8b> 70 38 48 8d 8d 5c ff
ff ff 48 8b 78 10 ba c4 01 00 00 e8 eb
RIP [<ffffffff8142f2fb>] cpuid4_cache_lookup_regs+0x2ca/0x3df
RSP <ffff880002203e08>
CR2: 0000000000000038
---[ end trace a7919e7f17c0a726 ]---
The L3 cache index disable feature of AMD CPUs has to be disabled if the
kernel is running as guest on top of a hypervisor because northbridge
devices are not available to the guest. Currently, this fixes a boot
crash on top of Xen. In the future this will become an issue on KVM as
well.
Check if northbridge devices are present and do not enable the feature
if there are none.
[ hpa: backported to 2.6.34 ]
Signed-off-by: Frank Arnold <frank.arnold@amd.com>
LKML-Reference: <1271945222-5283-3-git-send-email-bp@amd64.org>
Acked-by: Borislav Petkov <borislav.petkov@amd.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit ade029e2aaacc8965a548b0b0f80c5bee97ffc68 upstream.
K8_NB depends on PCI and when the last is disabled (allnoconfig) we fail
at the final linking stage due to missing exported num_k8_northbridges.
Add a header stub for that.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
LKML-Reference: <20100503183036.GJ26107@aftab>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit 0fe1ac48bef018bed896307cd12f6ca9b5e704ab upstream.
Anton Blanchard found that large POWER systems would occasionally
crash in the exception exit path when profiling with perf_events.
The symptom was that an interrupt would occur late in the exit path
when the MSR[RI] (recoverable interrupt) bit was clear. Interrupts
should be hard-disabled at this point but they were enabled. Because
the interrupt was not recoverable the system panicked.
The reason is that the exception exit path was calling
perf_event_do_pending after hard-disabling interrupts, and
perf_event_do_pending will re-enable interrupts.
The simplest and cleanest fix for this is to use the same mechanism
that 32-bit powerpc does, namely to cause a self-IPI by setting the
decrementer to 1. This means we can remove the tests in the exception
exit path and raw_local_irq_restore.
This also makes sure that the call to perf_event_do_pending from
timer_interrupt() happens within irq_enter/irq_exit. (Note that
calling perf_event_do_pending from timer_interrupt does not mean that
there is a possible 1/HZ latency; setting the decrementer to 1 ensures
that the timer interrupt will happen immediately, i.e. within one
timebase tick, which is a few nanoseconds or 10s of nanoseconds.)
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit 545c174d1f093a462b4bb9131b23d5ea72a600e1 upstream.
strace may change the system call number, so regs->gprs[2] must not
be read before tracehook_report_syscall_entry(). This fixes a bug
where "strace -f" will hang after a vfork().
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
(cherry picked from commit e65c7f33d75e977350ca350573d93c517ec02776)
Previously it was unconditionally used on all Sibyte family SOCs. The
M3 bug has to be handled in the TLB exception handler which is extremly
performance sensitive, so this modification is expected to deliver around
2-3% performance improvment. This is important as required changes to the
M3 workaround will make it more costly.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit ccb8d8d070b8f25f0163da5c9ceacf63a5169540 upstream.
The use of mfp_cfg_t causes build errors without including <mach/mfp.h>.
CC: Daniel Mack <daniel@caiaq.de>
Signed-off-by: Jakob Viketoft <jakob.viketoft@bitsim.com>
Signed-off-by: Eric Miao <eric.y.miao@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit 5fd4514bb351b5ecb0da3692fff70741e5ed200c upstream.
Set the PCI CLS early in the boot process to prevent
device failures. In pcibios_set_master use the new
pci_cache_line_size instead of a hard-coded value.
Signed-off-by: Carlos O'Donell <carlos@codesourcery.com>
Reviewed-by: Grant Grundler <grundler@google.com>
Signed-off-by: Kyle McMartin <kyle@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit 56151e753468e34aeb322af4b0309ab727c97d2e upstream.
The bypassing of this test is a leftover from 2.4 vintage
kernels, and is no longer appropriate, or even used by KGDB.
Currently KGDB uses probe_kernel_write() for all access to
memory via the KGDB core, so it can simply be deleted.
This fixes CVE-2010-1446.
CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: Paul Mackerras <paulus@samba.org>
CC: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Wufei <fei.wu@windriver.com>
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
Commit 78ce64a384 in v2.6.32.12 introduced a warning due to unused
load_segment_descriptor_to_kvm_desct helper, which has been opencoded by
this commit.
On upstream, the helper was removed as part of a different commit.
Remove the now unused function.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit d7f0eea9e431e1b8b0742a74db1a9490730b2a25 upstream.
Introduce kernel parameter acpi_sleep=sci_force_enable
some laptop requires SCI_EN being set directly on resume,
or else they hung somewhere in the resume code path.
We already have a blacklist for these laptops but we still need
this option, especially when debugging some suspend/resume problems,
in case there are systems that need this workaround and are not yet
in the blacklist.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit ebb682f522411abbe358059a256a8672ec0bd55b upstream.
The per_cpu cpuid4_info shared_map can contain stale data when CPUs are added
and removed.
The stale data can lead to a NULL pointer derefernce panic on a remove of a
CPU that has had siblings previously removed.
This patch resolves the panic by verifying a cpu is actually online before
adding it to the shared_cpu_map, only examining cpus that are part of
the same lower level cache, and by updating other siblings lowest level cache
maps when a cpu is added.
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
LKML-Reference: <20091209183336.17855.98708.sendpatchset@prarit.bos.redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
systems
commit 0e152cd7c16832bd5cadee0c2e41d9959bc9b6f9 upstream.
de957628ce7c84764ff41331111036b3ae5bad0f changed setting of the
x86_init.iommu.iommu_init function ptr only when GART IOMMU is
found.
One side effect of it is that num_k8_northbridges
is not initialized anymore if not explicitly
called. This resulted in uninitialized pointers in
<arch/x86/kernel/cpu/intel_cacheinfo.c:amd_calc_l3_indices()>,
for example, which uses the num_k8_northbridges thing through
node_to_k8_nb_misc().
Fix that through an initcall that runs right after the PCI
subsystem and does all the scanning. Then, remove initialization
in gart_iommu_init() which is a rootfs_initcall and we're
running before that.
What is more, since num_k8_northbridges is being used in other
places beside GART IOMMU, include it whenever we add AMD CPU
support. The previous dependency chain in kconfig contained
K8_NB depends on AGP_AMD64|GART_IOMMU
which was clearly incorrect. The more natural way in terms of
hardware dependency should be
AGP_AMD64|GART_IOMMU depends on K8_NB depends on CPU_SUP_AMD &&
PCI. Make it so Number One!
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Joerg Roedel <joerg.roedel@amd.com>
LKML-Reference: <20100312144303.GA29262@aftab>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Tested-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit 7a0fc404ae663776e96db43879a0fa24fec1fa3a upstream.
Atom erratum AAE44/AAF40/AAG38/AAH41:
"If software clears the PS (page size) bit in a present PDE (page
directory entry), that will cause linear addresses mapped through this
PDE to use 4-KByte pages instead of using a large page after old TLB
entries are invalidated. Due to this erratum, if a code fetch uses
this PDE before the TLB entry for the large page is invalidated then
it may fetch from a different physical address than specified by
either the old large page translation or the new 4-KByte page
translation. This erratum may also cause speculative code fetches from
incorrect addresses."
[http://download.intel.com/design/processor/specupdt/319536.pdf]
Where as commit 211b3d03c7400f48a781977a50104c9d12f4e229 seems to
workaround errata AAH41 (mixed 4K TLBs) it reduces the window of
opportunity for the bug to occur and does not totally remove it. This
patch disables mixed 4K/4MB page tables totally avoiding the page
splitting and not tripping this processor issue.
This is based on an original patch by Colin King.
Originally-by: Colin Ian King <colin.king@canonical.com>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
LKML-Reference: <1269271251-19775-1-git-send-email-colin.king@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit 7ce5a2b9bb2e92902230e3121d8c3047fab9cb47 upstream.
When we do a thread switch, we clear the outgoing FS/GS base if the
corresponding selector is nonzero. This is taken by __switch_to() as
an entry invariant; it does not verify that it is true on entry.
However, copy_thread() doesn't enforce this constraint, which can
result in inconsistent results after fork().
Make copy_thread() match the behavior of __switch_to().
Reported-and-tested-by: Samuel Thibault <samuel.thibault@inria.fr>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
LKML-Reference: <4BD1E061.8030605@zytor.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit b810e94c9d8e3fff6741b66cd5a6f099a7887871 upstream.
With F10, model 10, all valid frequencies are in the ACPI _PST table.
Signed-off-by: Mark Langsdorf <mark.langsdorf@amd.com>
LKML-Reference: <1270065406-1814-6-git-send-email-bp@amd64.org>
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
Reviewed-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit a29815a333c6c6e677294bbe5958e771d0aad3fd upstream.
The list macros use LIST_POISON1 and LIST_POISON2 as undereferencable
pointers in order to trap erronous use of freed list_heads. Unfortunately
userspace can arrange for those pointers to actually be dereferencable,
potentially turning an oops to an expolit.
To avoid this allow architectures (currently x86_64 only) to override
the default values for these pointers with truly-undereferencable values.
This is easy on x86_64 as the virtual address space is large and contains
areas that cannot be mapped.
Other 64-bit architectures will likely find similar unmapped ranges.
[ingo: switch to 0xdead000000000000 as the unmapped area]
[ingo: add comments, cleanup]
[jaswinder: eliminate sparse warnings]
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit 4b83873d3da0704987cb116833818ed96214ee29 upstream.
If we boot into a crash-kernel the gart might still be
enabled and its caches might be dirty. This can result in
undefined behavior later. Fix it by explicitly disabling the
gart hardware before initialization and flushing the caches
after enablement.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
(Cherry-picked from commit e8861cfe2c75bdce36655b64d7ce02c2b31b604d)
A 16-bit TSS is only 44 bytes long. So make sure to test for the correct
size on task switch.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
(Cherry-picked from commit 87bf6e7de1134f48681fd2ce4b7c1ec45458cb6d)
Int is not long enough to store the size of a dirty bitmap.
This patch fixes this problem with the introduction of a wrapper
function to calculate the sizes of dirty bitmaps.
Note: in mark_page_dirty(), we have to consider the fact that
__set_bit() takes the offset as int, not long.
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
(Cherry-picked from commit 77662e0028c7c63e34257fda03ff9625c59d939d)
This patch fix:
- calculate zapped page number properly in mmu_zap_unsync_children()
- calculate freeed page number properly kvm_mmu_change_mmu_pages()
- if zapped children page it shoud restart hlist walking
KVM-Stable-Tag.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
(Cherry-picked from commit 78ac8b47c566dd6177a3b9b291b756ccb70670b7)
Currently we set eflags.vm unconditionally when entering real mode emulation
through virtual-8086 mode, and clear it unconditionally when we enter protected
mode. The means that the following sequence
KVM_SET_REGS (rflags.vm=1)
KVM_SET_SREGS (cr0.pe=1)
Ends up with rflags.vm clear due to KVM_SET_SREGS triggering enter_pmode().
Fix by shadowing rflags.vm (and rflags.iopl) correctly while in real mode:
reads and writes to those bits access a shadow register instead of the actual
register.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
(Cherry-picked from commit 114be429c8cd44e57f312af2bbd6734e5a185b0d)
There is a quirk for AMD K8 CPUs in many Linux kernels (see
arch/x86/kernel/cpu/mcheck/mce.c:__mcheck_cpu_apply_quirks()) that
clears bit 10 in that MCE related MSR. KVM can only cope with all
zeros or all ones, so it will inject a #GP into the guest, which
will let it panic.
So lets add a quirk to the quirk and ignore this single cleared bit.
This fixes -cpu kvm64 on all machines and -cpu host on K8 machines
with some guest Linux kernels.
Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
(Cherry-picked from commit d6a23895aa82353788a1cc5a1d9a1c963465463e)
These are guest-triggerable.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
(Cherry-picked from commit b7af40433870aa0636932ad39b0c48a0cb319057)
svm_create_vcpu() does not free the pages allocated during the creation
when it fails to complete the allocations. This patch fixes it.
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
original patch commit ids: 452a339a976e7f782c786eb3f73080401e2fa3a6 and
134fbadf028a5977a1b06b0253d3ee33e6f0c642
perf_events, x86: Implement Intel Westmere support
The new Intel documentation includes Westmere arch specific
event maps that are significantly different from the Nehalem
ones. Add support for this generation.
Found the CPUID model numbers on wikipedia.
Also ammend some Nehalem constraints, spotted those when looking
for the differences between Nehalem and Westmere.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Stephane Eranian <eranian@google.com>
LKML-Reference: <20100127221122.151865645@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
perf, x86: Enable Nehalem-EX support
According to Intel Software Devel Manual Volume 3B, the
Nehalem-EX PMU is just like regular Nehalem (except for the
uncore support, which is completely different).
Signed-off-by: Vince Weaver <vweaver1@eecs.utk.edu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Lin Ming <ming.m.lin@intel.com>
LKML-Reference: <alpine.DEB.2.00.1004060956580.1417@cl320.eecs.utk.edu>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Youquan Song <youquan.song@linux.intel.com>
|
|
commit 93da6202264ce1256b04db8008a43882ae62d060 upstream.
This patch adds the Intel Cougar Point (PCH) LPC and SMBus Controller DeviceIDs.
Signed-off-by: Seth Heasley <seth.heasley@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: maximilian attems <max@stro.at>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit 0d1622d7f526311d87d7da2ee7dd14b73e45d3fc upstream.
The Intel Architecture Optimization Reference Manual states that a short
load that follows a long store to the same object will suffer a store
forwading penalty, particularly if the two accesses use different addresses.
Trivially, a long load that follows a short store will also suffer a penalty.
__downgrade_write() in rwsem incurs both penalties: the increment operation
will not be able to reuse a recently-loaded rwsem value, and its result will
not be reused by any recently-following rwsem operation.
A comment in the code states that this is because 64-bit immediates are
special and expensive; but while they are slightly special (only a single
instruction allows them), they aren't expensive: a test shows that two loops,
one loading a 32-bit immediate and one loading a 64-bit immediate, both take
1.5 cycles per iteration.
Fix this by changing __downgrade_write to use the same add instruction on
i386 and on x86_64, so that it uses the same operand size as all the other
rwsem functions.
Signed-off-by: Avi Kivity <avi@redhat.com>
LKML-Reference: <1266049992-17419-1-git-send-email-avi@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit 4126faf0ab7417fbc6eb99fb0fd407e01e9e9dfe upstream.
The patches 5d0b7235d83eefdafda300656e97d368afcafc9a and
bafaecd11df15ad5b1e598adc7736afcd38ee13d broke the UML build:
On Sun, 17 Jan 2010, Ingo Molnar wrote:
>
> FYI, -tip testing found that these changes break the UML build:
>
> kernel/built-in.o: In function `__up_read':
> /home/mingo/tip/arch/x86/include/asm/rwsem.h:192: undefined reference to `call_rwsem_wake'
> kernel/built-in.o: In function `__up_write':
> /home/mingo/tip/arch/x86/include/asm/rwsem.h:210: undefined reference to `call_rwsem_wake'
> kernel/built-in.o: In function `__downgrade_write':
> /home/mingo/tip/arch/x86/include/asm/rwsem.h:228: undefined reference to `call_rwsem_downgrade_wake'
> kernel/built-in.o: In function `__down_read':
> /home/mingo/tip/arch/x86/include/asm/rwsem.h:112: undefined reference to `call_rwsem_down_read_failed'
> kernel/built-in.o: In function `__down_write_nested':
> /home/mingo/tip/arch/x86/include/asm/rwsem.h:154: undefined reference to `call_rwsem_down_write_failed'
> collect2: ld returned 1 exit status
Add lib/rwsem_64.o to the UML subarch objects to fix.
LKML-Reference: <alpine.LFD.2.00.1001171023440.13231@localhost.localdomain>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit bafaecd11df15ad5b1e598adc7736afcd38ee13d upstream.
This one is much faster than the spinlock based fallback rwsem code,
with certain artifical benchmarks having shown 300%+ improvement on
threaded page faults etc.
Again, note the 32767-thread limit here. So this really does need that
whole "make rwsem_count_t be 64-bit and fix the BIAS values to match"
extension on top of it, but that is conceptually a totally independent
issue.
NOT TESTED! The original patch that this all was based on were tested by
KAMEZAWA Hiroyuki, but maybe I screwed up something when I created the
cleaned-up series, so caveat emptor..
Also note that it _may_ be a good idea to mark some more registers
clobbered on x86-64 in the inline asms instead of saving/restoring them.
They are inline functions, but they are only used in places where there
are not a lot of live registers _anyway_, so doing for example the
clobbers of %r8-%r11 in the asm wouldn't make the fast-path code any
worse, and would make the slow-path code smaller.
(Not that the slow-path really matters to that degree. Saving a few
unnecessary registers is the _least_ of our problems when we hit the slow
path. The instruction/cycle counting really only matters in the fast
path).
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <alpine.LFD.2.00.1001121810410.17145@localhost.localdomain>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit 1838ef1d782f7527e6defe87e180598622d2d071 upstream.
For x86-64, 32767 threads really is not enough. Change rwsem_count_t
to a signed long, so that it is 64 bits on x86-64.
This required the following changes to the assembly code:
a) %z0 doesn't work on all versions of gcc! At least gcc 4.4.2 as
shipped with Fedora 12 emits "ll" not "q" for 64 bits, even for
integer operands. Newer gccs apparently do this correctly, but
avoid this problem by using the _ASM_ macros instead of %z.
b) 64 bits immediates are only allowed in "movq $imm,%reg"
constructs... no others. Change some of the constraints to "e",
and fix the one case where we would have had to use an invalid
immediate -- in that case, we only care about the upper half
anyway, so just access the upper half.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <tip-bafaecd11df15ad5b1e598adc7736afcd38ee13d@git.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit 5d0b7235d83eefdafda300656e97d368afcafc9a upstream.
The fast version of the rwsems (the code that uses xadd) has
traditionally only worked on x86-32, and as a result it mixes different
kinds of types wildly - they just all happen to be 32-bit. We have
"long", we have "__s32", and we have "int".
To make it work on x86-64, the types suddenly matter a lot more. It can
be either a 32-bit or 64-bit signed type, and both work (with the caveat
that a 32-bit counter will only have 15 bits of effective write
counters, so it's limited to 32767 users). But whatever type you
choose, it needs to be used consistently.
This makes a new 'rwsem_counter_t', that is a 32-bit signed type. For a
64-bit type, you'd need to also update the BIAS values.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <alpine.LFD.2.00.1001121755220.17145@localhost.localdomain>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit 59c33fa7791e9948ba467c2b83e307a0d087ab49 upstream.
This makes gcc use the right register names and instruction operand sizes
automatically for the rwsem inline asm statements.
So instead of using "(%%eax)" to specify the memory address that is the
semaphore, we use "(%1)" or similar. And instead of forcing the operation
to always be 32-bit, we use "%z0", taking the size from the actual
semaphore data structure itself.
This doesn't actually matter on x86-32, but if we want to use the same
inline asm for x86-64, we'll need to have the compiler generate the proper
64-bit names for the registers (%rax instead of %eax), and if we want to
use a 64-bit counter too (in order to avoid the 15-bit limit on the
write counter that limits concurrent users to 32767 threads), we'll need
to be able to generate instructions with "q" accesses rather than "l".
Since this header currently isn't enabled on x86-64, none of that matters,
but we do want to use the xadd version of the semaphores rather than have
to take spinlocks to do a rwsem. The mm->mmap_sem can be heavily contended
when you have lots of threads all taking page faults, and the fallback
rwsem code that uses a spinlock performs abysmally badly in that case.
[ hpa: modified the patch to skip size suffixes entirely when they are
redundant due to register operands. ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <alpine.LFD.2.00.1001121613560.17145@localhost.localdomain>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit cb19060abfdecac0d1eb2d2f0e7d6b7a3f8bc4f4 upstream.
Final stage linking can fail with
arch/x86/built-in.o: In function `store_cache_disable':
intel_cacheinfo.c:(.text+0xc509): undefined reference to `amd_get_nb_id'
arch/x86/built-in.o: In function `show_cache_disable':
intel_cacheinfo.c:(.text+0xc7d3): undefined reference to `amd_get_nb_id'
when CONFIG_CPU_SUP_AMD is not enabled because the amd_get_nb_id
helper is defined in AMD-specific code but also used in generic code
(intel_cacheinfo.c). Reorganize the L3 cache index disable code under
CONFIG_CPU_SUP_AMD since it is AMD-only anyway.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
LKML-Reference: <20100218184210.GF20473@aftab>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit f619b3d8427eb57f0134dab75b0d217325c72411 upstream.
The show/store_cache_disable routines depend unnecessarily on NUMA's
cpu_to_node and the disabling of cache indices broke when !CONFIG_NUMA.
Remove that dependency by using a helper which is always correct.
While at it, enable L3 Cache Index disable on rev D1 Istanbuls which
sport the feature too.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
LKML-Reference: <20100218184339.GG20473@aftab>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit 048a8774ca43488d78605031f11cc206d7a2682a upstream.
We need to know the valid L3 indices interval when disabling them over
/sysfs. Do that when the core is brought online and add boundary checks
to the sysfs .store attribute.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
LKML-Reference: <1264172467-25155-6-git-send-email-bp@amd64.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit 897de50e08937663912c86fb12ad7f708af2386c upstream.
The cache_disable_[01] attribute in
/sys/devices/system/cpu/cpu?/cache/index[0-3]/
is enabled on all cache levels although only L3 supports it. Add it only
to the cache level that actually supports it.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
LKML-Reference: <1264172467-25155-5-git-send-email-bp@amd64.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit dcf39daf3d6d97f8741e82f0b9fb7554704ed2d1 upstream.
* Correct the masks used for writing the cache index disable indices.
* Do not turn off L3 scrubber - it is not necessary.
* Make sure wbinvd is executed on the same node where the L3 is.
* Check for out-of-bounds values written to the registers.
* Make show_cache_disable hex values unambiguous
* Check for Erratum #388
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
LKML-Reference: <1264172467-25155-4-git-send-email-bp@amd64.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit a7b480e7f30b3813353ec009f10f2ac7a6669f3b upstream.
Add wbinvd_on_cpu and wbinvd_on_all_cpus stubs for executing wbinvd on a
particular CPU.
[ hpa: renamed lib/smp.c to lib/cache-smp.c ]
[ hpa: wbinvd_on_all_cpus() returns int, but wbinvd() returns
void. Thus, the former cannot be a macro for the latter,
replace with an inline function. ]
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
LKML-Reference: <1264172467-25155-2-git-send-email-bp@amd64.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit 8f9f55e83e939724490d7cde3833c4883c6d1310 upstream.
This effectively reverts commit 61d047be99757fd9b0af900d7abce9a13a337488.
Disabling the IOMMU can potetially allow DMA transactions to
complete without being translated. Leave it enabled, and allow
crash kernel to do the IOMMU reinitialization properly.
Cc: Joerg Roedel <joerg.roedel@amd.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit 75f66533bc883f761a7adcab3281fe3323efbc90 upstream.
Hit another kdump problem as reported by Neil Horman. When initializaing
the IOMMU, we attach devices to their domains before the IOMMU is
fully (re)initialized. Attaching a device will issue some important
invalidations. In the context of the newly kexec'd kdump kernel, the
IOMMU may have stale cached data from the original kernel. Because we
do the attach too early, the invalidation commands are placed in the new
command buffer before the IOMMU is updated w/ that buffer. This leaves
the stale entries in the kdump context and can renders device unusable.
Simply enable the IOMMU before we do the attach.
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit 8b408fe4f853dcfa18d133aa4cf1d7546b4c3870 upstream.
In the amd_iommu_domain_destroy the protection_domain_free
function is partly reimplemented. The 'partly' is the bug
here because the domain is not deleted from the domain list.
This results in use-after-free errors and data-corruption.
Fix it by just using protection_domain_free instead.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit 472a474c6630efd195d3738339fd1bdc8aa3b1aa upstream.
Jan Grossmann reported kernel boot panic while booting SMP
kernel on his system with a single core cpu. SMP kernels call
enable_IR_x2apic() from native_smp_prepare_cpus() and on
platforms where the kernel doesn't find SMP configuration we
ended up again calling enable_IR_x2apic() from the
APIC_init_uniprocessor() call in the smp_sanity_check(). Thus
leading to kernel panic.
Don't call enable_IR_x2apic() and default_setup_apic_routing()
from APIC_init_uniprocessor() in CONFIG_SMP case.
NOTE: this kind of non-idempotent and assymetric initialization
sequence is rather fragile and unclean, we'll clean that up
in v2.6.35. This is the minimal fix for v2.6.34.
Reported-by: Jan.Grossmann@kielnet.net
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: <jbarnes@virtuousgeek.org>
Cc: <david.woodhouse@intel.com>
Cc: <weidong.han@intel.com>
Cc: <youquan.song@intel.com>
Cc: <Jan.Grossmann@kielnet.net>
LKML-Reference: <1270083887.7835.78.camel@sbs-t61.sc.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit 8da854cb02156c90028233ae1e85ce46a1d3f82c upstream.
On Wed, Feb 24, 2010 at 03:37:04PM -0800, Justin Piszcz wrote:
> Hello,
>
> Again, on the Intel DP55KG board:
>
> # uname -a
> Linux host 2.6.33 #1 SMP Wed Feb 24 18:31:00 EST 2010 x86_64 GNU/Linux
>
> [ 1.237600] ------------[ cut here ]------------
> [ 1.237890] WARNING: at arch/x86/kernel/hpet.c:404 hpet_next_event+0x70/0x80()
> [ 1.238221] Hardware name:
> [ 1.238504] hpet: compare register read back failed.
> [ 1.238793] Modules linked in:
> [ 1.239315] Pid: 0, comm: swapper Not tainted 2.6.33 #1
> [ 1.239605] Call Trace:
> [ 1.239886] <IRQ> [<ffffffff81056c13>] ? warn_slowpath_common+0x73/0xb0
> [ 1.240409] [<ffffffff81079608>] ? tick_dev_program_event+0x38/0xc0
> [ 1.240699] [<ffffffff81056cb0>] ? warn_slowpath_fmt+0x40/0x50
> [ 1.240992] [<ffffffff81079608>] ? tick_dev_program_event+0x38/0xc0
> [ 1.241281] [<ffffffff81041ad0>] ? hpet_next_event+0x70/0x80
> [ 1.241573] [<ffffffff81079608>] ? tick_dev_program_event+0x38/0xc0
> [ 1.241859] [<ffffffff81078e32>] ? tick_handle_oneshot_broadcast+0xe2/0x100
> [ 1.246533] [<ffffffff8102a67a>] ? timer_interrupt+0x1a/0x30
> [ 1.246826] [<ffffffff81085499>] ? handle_IRQ_event+0x39/0xd0
> [ 1.247118] [<ffffffff81087368>] ? handle_edge_irq+0xb8/0x160
> [ 1.247407] [<ffffffff81029f55>] ? handle_irq+0x15/0x20
> [ 1.247689] [<ffffffff810294a2>] ? do_IRQ+0x62/0xe0
> [ 1.247976] [<ffffffff8146be53>] ? ret_from_intr+0x0/0xa
> [ 1.248262] <EOI> [<ffffffff8102f277>] ? mwait_idle+0x57/0x80
> [ 1.248796] [<ffffffff8102645c>] ? cpu_idle+0x5c/0xb0
> [ 1.249080] ---[ end trace db7f668fb6fef4e1 ]---
>
> Is this something Intel has to fix or is it a bug in the kernel?
This is a chipset erratum.
Thomas: You mentioned we can retain this check only for known-buggy and
hpet debug kind of options. But here is the simple workaround patch for
this particular erratum.
Some chipsets have a erratum due to which read immediately following a
write of HPET comparator returns old comparator value instead of most
recently written value.
Erratum 15 in
"Intel I/O Controller Hub 9 (ICH9) Family Specification Update"
(http://www.intel.com/assets/pdf/specupdate/316973.pdf)
Workaround for the errata is to read the comparator twice if the first
one fails.
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
LKML-Reference: <20100225185348.GA9674@linux-os.sc.intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: Venkatesh Pallipadi <venkatesh.pallipadi@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit 18ed61da985c57eea3fe8038b13fa2837c9b3c3f upstream.
Andrew complained rightly that the WARN_ON in hpet_next_event() is
confusing and the code comment not really helpful.
Change it to WARN_ONCE and print the reason in clear text. Change the
comment to explain what kind of hardware wreckage we deal with.
Pointed-out-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Venki Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit 8ae06d223f8203c72104e5c0c4ee49a000aedb42 upstream.
Colin King reported a strange oops in S4 resume code path (see below). The test
system has i5/i7 CPU. The kernel doesn't open PAE, so 4M page table is used.
The oops always happen a virtual address 0xc03ff000, which is mapped to the
last 4k of first 4M memory. Doing a global tlb flush fixes the issue.
EIP: 0060:[<c0493a01>] EFLAGS: 00010086 CPU: 0
EIP is at copy_loop+0xe/0x15
EAX: 36aeb000 EBX: 00000000 ECX: 00000400 EDX: f55ad46c
ESI: 0f800000 EDI: c03ff000 EBP: f67fbec4 ESP: f67fbea8
DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
...
...
CR2: 00000000c03ff000
Tested-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
LKML-Reference: <20100305005932.GA22675@sli10-desk.sh.intel.com>
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit d4d9959c099751158c5cf14813fe378e206339c6 upstream.
98e12b5a6e05413 ("ARM: Fix decompressor's kernel size estimation for
ROM=y") broke the Thumb-2 decompressor because it added an entry in the
LC0 table but didn't adjust the offset the Thumb-2 code uses to load the
SP from that table. Fix it.
Signed-off-by: Rabin Vincent <rabin@rab.in>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit a68a6a7282373bedba8a2ed751b6384edb983a64 upstream
Disable paravirt MMU capability reporting, so that new (or rebooted)
guests switch to native operation.
Paravirt MMU is a burden to maintain and does not bring significant
advantages compared to shadow anymore.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit 046d87103addc117f0d397196e85189722d4d7de upstream
Otherwise would cause VMEntry failure when using ept=0 on unrestricted guest
supported processors.
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|