Age | Commit message (Collapse) | Author |
|
commit f98b7a772ab51b52ca4d2a14362fc0e0c8a2e0f3 upstream.
There was a large performance regression that was bisected to
commit 611ae8e3 ("x86/tlb: enable tlb flush range support for
x86"). This patch simply changes the default balance point
between a local and global flush for IvyBridge.
In the interest of allowing the tests to be reproduced, this
patch was tested using mmtests 0.15 with the following
configurations
configs/config-global-dhp__tlbflush-performance
configs/config-global-dhp__scheduler-performance
configs/config-global-dhp__network-performance
Results are from two machines
Ivybridge 4 threads: Intel(R) Core(TM) i3-3240 CPU @ 3.40GHz
Ivybridge 8 threads: Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz
Page fault microbenchmark showed nothing interesting.
Ebizzy was configured to run multiple iterations and threads.
Thread counts ranged from 1 to NR_CPUS*2. For each thread count,
it ran 100 iterations and each iteration lasted 10 seconds.
Ivybridge 4 threads
3.13.0-rc7 3.13.0-rc7
vanilla altshift-v3
Mean 1 6395.44 ( 0.00%) 6789.09 ( 6.16%)
Mean 2 7012.85 ( 0.00%) 8052.16 ( 14.82%)
Mean 3 6403.04 ( 0.00%) 6973.74 ( 8.91%)
Mean 4 6135.32 ( 0.00%) 6582.33 ( 7.29%)
Mean 5 6095.69 ( 0.00%) 6526.68 ( 7.07%)
Mean 6 6114.33 ( 0.00%) 6416.64 ( 4.94%)
Mean 7 6085.10 ( 0.00%) 6448.51 ( 5.97%)
Mean 8 6120.62 ( 0.00%) 6462.97 ( 5.59%)
Ivybridge 8 threads
3.13.0-rc7 3.13.0-rc7
vanilla altshift-v3
Mean 1 7336.65 ( 0.00%) 7787.02 ( 6.14%)
Mean 2 8218.41 ( 0.00%) 9484.13 ( 15.40%)
Mean 3 7973.62 ( 0.00%) 8922.01 ( 11.89%)
Mean 4 7798.33 ( 0.00%) 8567.03 ( 9.86%)
Mean 5 7158.72 ( 0.00%) 8214.23 ( 14.74%)
Mean 6 6852.27 ( 0.00%) 7952.45 ( 16.06%)
Mean 7 6774.65 ( 0.00%) 7536.35 ( 11.24%)
Mean 8 6510.50 ( 0.00%) 6894.05 ( 5.89%)
Mean 12 6182.90 ( 0.00%) 6661.29 ( 7.74%)
Mean 16 6100.09 ( 0.00%) 6608.69 ( 8.34%)
Ebizzy hits the worst case scenario for TLB range flushing every
time and it shows for these Ivybridge CPUs at least that the
default choice is a poor on. The patch addresses the problem.
Next was a tlbflush microbenchmark written by Alex Shi at
http://marc.info/?l=linux-kernel&m=133727348217113 . It
measures access costs while the TLB is being flushed. The
expectation is that if there are always full TLB flushes that
the benchmark would suffer and it benefits from range flushing
There are 320 iterations of the test per thread count. The
number of entries is randomly selected with a min of 1 and max
of 512. To ensure a reasonably even spread of entries, the full
range is broken up into 8 sections and a random number selected
within that section.
iteration 1, random number between 0-64
iteration 2, random number between 64-128 etc
This is still a very weak methodology. When you do not know
what are typical ranges, random is a reasonable choice but it
can be easily argued that the opimisation was for smaller ranges
and an even spread is not representative of any workload that
matters. To improve this, we'd need to know the probability
distribution of TLB flush range sizes for a set of workloads
that are considered "common", build a synthetic trace and feed
that into this benchmark. Even that is not perfect because it
would not account for the time between flushes but there are
limits of what can be reasonably done and still be doing
something useful. If a representative synthetic trace is
provided then this benchmark could be revisited and the shift values retuned.
Ivybridge 4 threads
3.13.0-rc7 3.13.0-rc7
vanilla altshift-v3
Mean 1 10.50 ( 0.00%) 10.50 ( 0.03%)
Mean 2 17.59 ( 0.00%) 17.18 ( 2.34%)
Mean 3 22.98 ( 0.00%) 21.74 ( 5.41%)
Mean 5 47.13 ( 0.00%) 46.23 ( 1.92%)
Mean 8 43.30 ( 0.00%) 42.56 ( 1.72%)
Ivybridge 8 threads
3.13.0-rc7 3.13.0-rc7
vanilla altshift-v3
Mean 1 9.45 ( 0.00%) 9.36 ( 0.93%)
Mean 2 9.37 ( 0.00%) 9.70 ( -3.54%)
Mean 3 9.36 ( 0.00%) 9.29 ( 0.70%)
Mean 5 14.49 ( 0.00%) 15.04 ( -3.75%)
Mean 8 41.08 ( 0.00%) 38.73 ( 5.71%)
Mean 13 32.04 ( 0.00%) 31.24 ( 2.49%)
Mean 16 40.05 ( 0.00%) 39.04 ( 2.51%)
For both CPUs, average access time is reduced which is good as
this is the benchmark that was used to tune the shift values in
the first place albeit it is now known *how* the benchmark was
used.
The scheduler benchmarks were somewhat inconclusive. They
showed gains and losses and makes me reconsider how stable those
benchmarks really are or if something else might be interfering
with the test results recently.
Network benchmarks were inconclusive. Almost all results were
flat except for netperf-udp tests on the 4 thread machine.
These results were unstable and showed large variations between
reboots. It is unknown if this is a recent problems but I've
noticed before that netperf-udp results tend to vary.
Based on these results, changing the default for Ivybridge seems
like a logical choice.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Tested-by: Davidlohr Bueso <davidlohr@hp.com>
Reviewed-by: Alex Shi <alex.shi@linaro.org>
Reviewed-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/n/tip-cqnadffh1tiqrshthRj3Esge@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 3b56496865f9f7d9bcb2f93b44c63f274f08e3b6 upstream.
This adds the workaround for erratum 793 as a precaution in case not
every BIOS implements it. This addresses CVE-2013-6885.
Erratum text:
[Revision Guide for AMD Family 16h Models 00h-0Fh Processors,
document 51810 Rev. 3.04 November 2013]
793 Specific Combination of Writes to Write Combined Memory Types and
Locked Instructions May Cause Core Hang
Description
Under a highly specific and detailed set of internal timing
conditions, a locked instruction may trigger a timing sequence whereby
the write to a write combined memory type is not flushed, causing the
locked instruction to stall indefinitely.
Potential Effect on System
Processor core hang.
Suggested Workaround
BIOS should set MSR
C001_1020[15] = 1b.
Fix Planned
No fix planned
[ hpa: updated description, fixed typo in MSR name ]
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/20140114230711.GS29865@pd.tnic
Tested-by: Aravind Gopalakrishnan <aravind.gopalakrishnan@amd.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 77f01bdfa5e55dc19d3eb747181d2730a9bb3ca8 upstream.
When Hyper-V hypervisor leaves are present, KVM must relocate
its own leaves at 0x40000100, because Windows does not look for
Hyper-V leaves at indices other than 0x40000000. In this case,
the KVM features are at 0x40000101, but the old code would always
look at 0x40000001.
Fix by using kvm_cpuid_base(). This also requires making the
function non-inline, since kvm_cpuid_base() is static.
Fixes: 1085ba7f552d84aa8ac0ae903fa8d0cc2ff9f79d
Cc: mtosatti@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 1c300a40772dae829b91dad634999a6a522c0829 upstream.
It is unnecessary to go through hypervisor_cpuid_base every time
a leaf is found (which will be every time a feature is requested
after the next patch).
Fixes: 1085ba7f552d84aa8ac0ae903fa8d0cc2ff9f79d
Cc: mtosatti@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 1739f09e33d8f66bf48ddbc3eca615574da6c4f6 upstream.
Function tracing callbacks expect to have the ftrace_ops that registered it
passed to them, not the address of the variable that holds the ftrace_ops
that registered it.
Use a mov instead of a lea to store the ftrace_ops into the parameter
of the function tracing callback.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Link: http://lkml.kernel.org/r/20131113152004.459787f9@gandalf.local.home
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit bee09ed91cacdbffdbcd3b05de8409c77ec9fcd6 upstream.
On AMD family 10h we see following error messages while waking up from
S3 for all non-boot CPUs leading to a failed IBS initialization:
Enabling non-boot CPUs ...
smpboot: Booting Node 0 Processor 1 APIC 0x1
[Firmware Bug]: cpu 1, try to use APIC500 (LVT offset 0) for vector 0x400, but the register is already in use for vector 0xf9 on another cpu
perf: IBS APIC setup failed on cpu #1
process: Switch to broadcast mode on CPU1
CPU1 is up
...
ACPI: Waking up from system sleep state S3
Reason for this is that during suspend the LVT offset for the IBS
vector gets lost and needs to be reinialized while resuming.
The offset is read from the IBSCTL msr. On family 10h the offset needs
to be 1 as offset 0 is used for the MCE threshold interrupt, but
firmware assings it for IBS to 0 too. The kernel needs to reprogram
the vector. The msr is a readonly node msr, but a new value can be
written via pci config space access. The reinitialization is
implemented for family 10h in setup_ibs_ctl() which is forced during
IBS setup.
This patch fixes IBS setup after waking up from S3 by adding
resume/supend hooks for the boot cpu which does the offset
reinitialization.
Marking it as stable to let distros pick up this fix.
Signed-off-by: Robert Richter <rric@kernel.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1389797849-5565-1-git-send-email-rric.net@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 40e2d7f9b5dae048789c64672bf3027fbb663ffa upstream.
Linux 3.10 changed the timing of how thread_info->flags is touched:
x86: Use generic idle loop
(7d1a941731fabf27e5fb6edbebb79fe856edb4e5)
This caused Intel NHM-EX and WSM-EX servers to experience a large number
of immediate MONITOR/MWAIT break wakeups, which caused cpuidle to demote
from deep C-states to shallow C-states, which caused these platforms
to experience a significant increase in idle power.
Note that this issue was already present before the commit above,
however, it wasn't seen often enough to be noticed in power measurements.
Here we extend an errata workaround from the Core2 EX "Dunnington"
to extend to NHM-EX and WSM-EX, to prevent these immediate
returns from MWAIT, reducing idle power on these platforms.
While only acpi_idle ran on Dunnington, intel_idle
may also run on these two newer systems.
As of today, there are no other models that are known
to need this tweak.
Link: http://lkml.kernel.org/r/CAJvTdK=%2BaNN66mYpCGgbHGCHhYQAKx-vB0kJSWjVpsNb_hOAtQ@mail.gmail.com
Signed-off-by: Len Brown <len.brown@intel.com>
Link: http://lkml.kernel.org/r/baff264285f6e585df757d58b17788feabc68918.1387403066.git.len.brown@intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit ab4ead02ec235d706d0611d8741964628291237e upstream.
In commit 8a4d0a687a59 "ftrace: Use breakpoint method to update ftrace
caller", we choose to use breakpoint method to update the ftrace
caller. But we also need to skip over the breakpoint in function
ftrace_int3_handler() for them. Otherwise weird things would happen.
Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
as an error
commit 11f918d3e2d3861b6931e97b3aa778e4984935aa upstream.
Do it the same way as done in microcode_intel.c: use pr_debug()
for missing firmware files.
There seem to be CPUs out there for which no microcode update
has been submitted to kernel-firmware repo yet resulting in
scary sounding error messages in dmesg:
microcode: failed to load file amd-ucode/microcode_amd_fam16h.bin
Signed-off-by: Thomas Renninger <trenn@suse.de>
Acked-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1384274383-43510-1-git-send-email-trenn@suse.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 522e66464467543c0d88d023336eec4df03ad40b upstream.
In reboot and crash path, when we shut down the local APIC, the I/O APIC is
still active. This may cause issues because external interrupts
can still come in and disturb the local APIC during shutdown process.
To quiet external interrupts, disable I/O APIC before shutdown local APIC.
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: http://lkml.kernel.org/r/1382578212-4677-1-git-send-email-fenghua.yu@intel.com
[ I suppose the 'issue' is a hang during shutdown. It's a fine change nevertheless. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit ea8117478918a4734586d35ff530721b682425be upstream.
Mike reported that commit 7d1a9417 ("x86: Use generic idle loop")
regressed several workloads and caused excessive reschedule
interrupts.
The patch in question failed to notice that the x86 code had an
inverted sense of the polling state versus the new generic code (x86:
default polling, generic: default !polling).
Fix the two prominent x86 mwait based idle drivers and introduce a few
new generic polling helpers (fixing the wrong smp_mb__after_clear_bit
usage).
Also switch the idle routines to using tif_need_resched() which is an
immediate TIF_NEED_RESCHED test as opposed to need_resched which will
end up being slightly different.
Reported-by: Mike Galbraith <bitbucket@online.de>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: lenb@kernel.org
Cc: tglx@linutronix.de
Link: http://lkml.kernel.org/n/tip-nc03imb0etuefmzybzj7sprf@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
"Two fixes:
- Fix 'NMI handler took too long to run' false positives
[ Genuine NMI overhead speedups will come for v3.13, this commit
only fixes a measurement bug ]
- Fix perf ring-buffer missed barrier causing (rare) ring-buffer data
corruption on ppc64"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86: Fix NMI measurements
perf: Fix perf ring buffer memory ordering
|
|
The x86 specific kvm init creates a new conflicting
debugfs directory which causes modprobe issues
with kvm_intel and kvm_amd. For example,
sudo modprobe kvm_amd
modprobe: ERROR: could not insert 'kvm_amd': Bad address
The simplest fix is to just rename the directory. The following
KVM config options are set:
CONFIG_KVM_GUEST=y
CONFIG_KVM_DEBUG_FS=y
CONFIG_HAVE_KVM=y
CONFIG_HAVE_KVM_IRQCHIP=y
CONFIG_HAVE_KVM_IRQ_ROUTING=y
CONFIG_HAVE_KVM_EVENTFD=y
CONFIG_KVM_APIC_ARCHITECTURE=y
CONFIG_KVM_MMIO=y
CONFIG_KVM_ASYNC_PF=y
CONFIG_HAVE_KVM_MSI=y
CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT=y
CONFIG_KVM=m
CONFIG_KVM_INTEL=m
CONFIG_KVM_AMD=m
CONFIG_KVM_DEVICE_ASSIGNMENT=y
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Gleb Natapov <gleb@redhat.com>
Cc: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
[Change debugfs directory name. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
OK, so what I'm actually seeing on my WSM is that sched/clock.c is
'broken' for the purpose we're using it for.
What triggered it is that my WSM-EP is broken :-(
[ 0.001000] tsc: Fast TSC calibration using PIT
[ 0.002000] tsc: Detected 2533.715 MHz processor
[ 0.500180] TSC synchronization [CPU#0 -> CPU#6]:
[ 0.505197] Measured 3 cycles TSC warp between CPUs, turning off TSC clock.
[ 0.004000] tsc: Marking TSC unstable due to check_tsc_sync_source failed
For some reason it consistently detects TSC skew, even though NHM+
should have a single clock domain for 'reasonable' systems.
This marks sched_clock_stable=0, which means that we do fancy stuff to
try and get a 'sane' clock. Part of this fancy stuff relies on the tick,
clearly that's gone when NOHZ=y. So for idle cpus time gets stuck, until
it either wakes up or gets kicked by another cpu.
While this is perfectly fine for the scheduler -- it only cares about
actually running stuff, and when we're running stuff we're obviously not
idle. This does somewhat break down for perf which can trigger events
just fine on an otherwise idle cpu.
So I've got NMIs get get 'measured' as taking ~1ms, which actually
don't last nearly that long:
<idle>-0 [013] d.h. 886.311970: rcu_nmi_enter <-do_nmi
...
<idle>-0 [013] d.h. 886.311997: perf_sample_event_took: HERE!!! : 1040990
So ftrace (which uses sched_clock(), not the fancy bits) only sees
~27us, but we measure ~1ms !!
Now since all this measurement stuff lives in x86 code, we can actually
fix it.
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: mingo@kernel.org
Cc: dave.hansen@linux.intel.com
Cc: eranian@google.com
Cc: Don Zickus <dzickus@redhat.com>
Cc: jmario@redhat.com
Cc: acme@infradead.org
Link: http://lkml.kernel.org/r/20131017133350.GG3364@laptop.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
"Two fixlets:
- fix a (rare-config) build bug
- fix a next-gen SGI/UV hw/firmware enumeration bug"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86: Update UV3 hub revision ID
x86/microcode: Correct Kconfig dependencies
|
|
We use jump label to enable pv-spinlock. With the changes in (442e0973e927
Merge branch 'x86/jumplabel'), the jump label behaviour has changed
that would result in eventual hang of the VM since we would end up in a
situation where slow path locks would halt the vcpus but we will not be
able to wakeup the vcpu by lock releaser using unlock kick.
Similar problem in Xen and more detailed description is available in
a945928ea270 (xen: Do not enable spinlocks before jump_label_init()
has executed)
This patch splits kvm_spinlock_init to separate jump label changes with
pvops patching and also make jump label enabling after jump_label_init().
Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
|
|
The UV3 hub revision ID is different than expected. The first
revision was supposed to start at 1 but instead will start at 0.
Signed-off-by: Russ Anderson <rja@sgi.com>
Cc: <stable@kernel.org> # v3.9, v3.10, v3.11
Link: http://lkml.kernel.org/r/20131014161733.GA6274@sgi.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
"A build fix and a reboot quirk"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/reboot: Add reboot quirk for Dell Latitude E5410
x86, build, pci: Fix PCI_MSI build on !SMP
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
"Various fixlets:
On the kernel side:
- fix a race
- fix a bug in the handling of the perf ring-buffer data page
On the tooling side:
- fix the handling of certain corrupted perf.data files
- fix a bug in 'perf probe'
- fix a bug in 'perf record + perf sched'
- fix a bug in 'make install'
- fix a bug in libaudit feature-detection on certain distros"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf session: Fix infinite loop on invalid perf.data file
perf tools: Fix installation of libexec components
perf probe: Fix to find line information for probe list
perf tools: Fix libaudit test
perf stat: Set child_pid after perf_evlist__prepare_workload()
perf tools: Add default handler for mmap2 events
perf/x86: Clean up cap_user_time* setting
perf: Fix perf_pmu_migrate_context
|
|
Dell Latitude E5410 needs reboot=pci to actually reboot.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://lkml.kernel.org/r/1380888964-14517-1-git-send-email-ville.syrjala@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Currently the cap_user_time_zero capability has different tests than
cap_user_time; even though they expose the exact same data.
Switch from CONSTANT && NONSTOP to sched_clock_stable to also deal
with multi cabinet machines and drop the tsc_disabled() check.. non of
this will work sanely without tsc anyway.
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/n/tip-nmgn0j0muo1r4c94vlfh23xy@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
bootup warning
IORESOURCE_BUSY is used to mark temporary driver mem-resources
instead of global regions. This suppresses warnings if regions
overlap with a region marked as BUSY.
This was always the case for VESA/VGA/EFI framebuffer regions so
do the same for simplefb regions. The reason we do this is to
allow device handover to real GPU drivers like
i915/radeon/nouveau which get the same regions via PCI BARs.
Maybe at some point we will be able to unregister platform
devices properly during the handover. In this case the simplefb
region would get removed before the new region is created.
However, this is currently not the case and would require rather
huge changes in remove_conflicting_framebuffers(). Add the BUSY
marker now and try to eventually rewrite the handover for a next release.
Also see kernel/resource.c for more information:
/*
* if a resource is "BUSY", it's not a hardware resource
* but a driver mapping of such a resource; we don't want
* to warn for those; some drivers legitimately map only
* partial hardware resources. (example: vesafb)
*/
This suppresses warnings like:
------------[ cut here ]------------
WARNING: CPU: 2 PID: 199 at arch/x86/mm/ioremap.c:171 __ioremap_caller+0x2e3/0x390()
Info: mapping multiple BARs. Your kernel is fine.
Call Trace:
dump_stack+0x54/0x8d
warn_slowpath_common+0x7d/0xa0
warn_slowpath_fmt+0x4c/0x50
iomem_map_sanity_check+0xac/0xe0
__ioremap_caller+0x2e3/0x390
ioremap_wc+0x32/0x40
i915_driver_load+0x670/0xf50 [i915]
...
Reported-by: Tom Gundersen <teg@jklm.no>
Tested-by: Tom Gundersen <teg@jklm.no>
Tested-by: Pavel Roskin <proski@gnu.org>
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Link: http://lkml.kernel.org/r/1380724864-1757-1-git-send-email-dh.herrmann@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
On my MacBook Air lfb_size is 4M, which makes the bitshit
overflow (to 256GB - larger than 32 bits), meaning we fall
back to efifb unnecessarily.
Cast to u64 to avoid the overflow.
Signed-off-by: Tom Gundersen <teg@jklm.no>
Reviewed-by: David Herrmann <dh.herrmann@gmail.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Stephen Warren <swarren@nvidia.com>
Cc: Stephen Warren <swarren@wwwdotorg.org>
Link: http://lkml.kernel.org/r/1380644320-1026-1-git-send-email-teg@jklm.no
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler, timer and x86 fixes from Ingo Molnar:
- A context tracking ARM build and functional fix
- A handful of ARM clocksource/clockevent driver fixes
- An AMD microcode patch level sysfs reporting fixlet
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
arm: Fix build error with context tracking calls
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
clocksource: em_sti: Set cpu_possible_mask to fix SMP broadcast
clocksource: of: Respect device tree node status
clocksource: exynos_mct: Set IRQ affinity when the CPU goes online
arm: clocksource: mvebu: Use the main timer as clock source from DT
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/microcode/AMD: Fix patch level reporting for family 15h
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
"A couple of tooling fixlets and a PMU detection printout fix"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86: Fix PMU detection printout when no PMU is detected
perf symbols: Demangle cloned functions
perf machine: Fix path unpopulated in machine__create_modules()
perf tools: Explicitly add libdl dependency
perf probe: Fix probing symbols with optimization suffix
perf trace: Add mmap2 handler
perf kmem: Make it work again on non NUMA machines
|
|
Ran into this cryptic PMU bootup log recently:
[ 0.124047] Performance Events:
[ 0.125000] smpboot: ...
Turns out we print this if no PMU is detected. Fall back to
the right condition so that the following is printed:
[ 0.122381] Performance Events: no PMU driver, software events only.
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: http://lkml.kernel.org/n/tip-u2fwaUffakjp0qkpRfqljgsn@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
On AMD family 14h, applying microcode patch on the a core (core0)
would also affect the other core (core1) in the same compute
unit. The driver would skip applying the patch on core1, but it
still need to update kernel structures to reflect the proper
patch level.
The current logic is not updating the struct
ucode_cpu_info.cpu_sig.rev of the skipped core. This causes the
/sys/devices/system/cpu/cpu1/microcode/version to report
incorrect patch level as shown below:
$ grep . cpu?/microcode/version
cpu0/microcode/version:0x600063d
cpu1/microcode/version:0x6000626
cpu2/microcode/version:0x600063d
cpu3/microcode/version:0x6000626
cpu4/microcode/version:0x600063d
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Acked-by: Borislav Petkov <bp@suse.de>
Cc: <bp@alien8.de>
Cc: <jacob.w.shin@gmail.com>
Cc: <herrmann.der.user@googlemail.com>
Link: http://lkml.kernel.org/r/1285806432-1995-1-git-send-email-suravee.suthikulpanit@amd.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
"An EFI fix and two reboot-quirk fixes"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/reboot: Fix apparent cut-n-paste mistake in Dell reboot workaround
x86/reboot: Add quirk to make Dell C6100 use reboot=pci automatically
x86, efi: Don't map Boot Services on i386
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
"Assorted standalone fixes"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86/intel: Add model number for Avoton Silvermont
perf: Fix capabilities bitfield compatibility in 'struct perf_event_mmap_page'
perf/x86/intel/uncore: Don't use smp_processor_id() in validate_group()
perf: Update ABI comment
tools lib lk: Uninclude linux/magic.h in debugfs.c
perf tools: Fix old GCC build error in trace-event-parse.c:parse_proc_kallsyms()
perf probe: Fix finder to find lines of given function
perf session: Check for SIGINT in more loops
perf tools: Fix compile with libelf without get_phdrnum
perf tools: Fix buildid cache handling of kallsyms with kcore
perf annotate: Fix objdump line parsing offset validation
perf tools: Fill in new definitions for madvise()/mmap() flags
perf tools: Sharpen the libaudit dependencies test
|
|
This seems to have been copied from the Optiplex 990 entry
above, but somoene forgot to change the ident text.
Signed-off-by: Dave Jones <davej@fedoraproject.org>
Link: http://lkml.kernel.org/r/20130925001344.GA13554@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Dell PowerEdge C6100 machines fail to completely reboot about 20% of the time.
Signed-off-by: Masoud Sharbiani <msharbiani@twitter.com>
Signed-off-by: Vinson Lee <vlee@twitter.com>
Cc: Robin Holt <holt@sgi.com>
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/1379717947-18042-1-git-send-email-vlee@freedesktop.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Cc: a.p.zijlstra@chello.nl
Cc: eranian@google.com
Cc: ak@linux.intel.com
Link: http://lkml.kernel.org/r/1379837953-17755-1-git-send-email-zheng.z.yan@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Solve the problems around the broken definition of perf_event_mmap_page::
cap_usr_time and cap_usr_rdpmc fields which used to overlap, partially
fixed by:
860f085b74e9 ("perf: Fix broken union in 'struct perf_event_mmap_page'")
The problem with the fix (merged in v3.12-rc1 and not yet released
officially), noticed by Vince Weaver is that the new behavior is
not detectable by new user-space, and that due to the reuse of the
field names it's easy to mis-compile a binary if old headers are used
on a new kernel or new headers are used on an old kernel.
To solve all that make this change explicit, detectable and self-contained,
by iterating the ABI the following way:
- Always clear bit 0, and rename it to usrpage->cap_bit0, to at least not
confuse old user-space binaries. RDPMC will be marked as unavailable
to old binaries but that's within the ABI, this is a capability bit.
- Rename bit 1 to ->cap_bit0_is_deprecated and always set it to 1, so new
libraries can reliably detect that bit 0 is deprecated and perma-zero
without having to check the kernel version.
- Use bits 2, 3, 4 for the newly defined, correct functionality:
cap_user_rdpmc : 1, /* The RDPMC instruction can be used to read counts */
cap_user_time : 1, /* The time_* fields are used */
cap_user_time_zero : 1, /* The time_zero field is used */
- Rename all the bitfield names in perf_event.h to be different from the
old names, to make sure it's not possible to mis-compile it
accidentally with old assumptions.
The 'size' field can then be used in the future to add new fields and it
will act as a natural ABI version indicator as well.
Also adjust tools/perf/ userspace for the new definitions, noticed by
Adrian Hunter.
Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Also-Fixed-by: Adrian Hunter <adrian.hunter@intel.com>
Link: http://lkml.kernel.org/n/tip-zr03yxjrpXesOzzupszqglbv@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
uncore_validate_group() can't call smp_processor_id() because it is
in preemptible context. Pass NUMA_NO_NODE to the allocator instead.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1379400493-11505-1-git-send-email-zheng.z.yan@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
"Misc fixes"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/intel/lpss: Add pin control support to Intel low power subsystem
perf/x86/intel: Mark MEM_LOAD_UOPS_MISS_RETIRED as precise on SNB
x86: Remove now-unused save_rest()
x86/smpboot: Fix announce_cpu() to printk() the last "OK" properly
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
"Two small fixes"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf: Fix UAPI export of PERF_EVENT_IOC_ID
perf/x86/intel: Fix Silvermont offcore masks
|
|
On Intel SNB (SNB, SNB-EP), the event MEM_LOAD_UOPS_MISS_RETIRED
supports PEBS. It was missing for the SNB PEBS event constraint
table thereby preventing any measurement with PEBS for it.
This patch adds the event to the PEBS table for SNB.
WARNING: it should be noted that this event like a few others
are subject to the erratum BT241 for Xeon E5 (SNB-EP). As such,
the event may undercount when used with PEBS unless the
workaround is implemented. But without this patch and just the
workaround, the kernel would not allow precise sampling on this
event. BT241 is documented in:
http://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/xeon-e5-family-spec-update.pdf
Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: peterz@infradead.org
Cc: ak@linux.intel.com
Cc: zheng.z.yan@intel.com
Link: http://lkml.kernel.org/r/20130913201646.GA23981@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
"Various fixes.
The -g perf report lockup you reported is only partially addressed,
patches that fix the excessive runtime are still being worked on"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86: Fix uncore PCI fixed counter handling
uprobes: Fix utask->depth accounting in handle_trampoline()
perf/x86: Add constraint for IVB CYCLE_ACTIVITY:CYCLES_LDM_PENDING
perf: Fix up MMAP2 buffer space reservation
perf tools: Add attr->mmap2 support
perf kvm: Fix sample_type manipulation
perf evlist: Fix id pos in perf_evlist__open()
perf trace: Handle perf.data files with no tracepoints
perf session: Separate progress bar update when processing events
perf trace: Check if MAP_32BIT is defined
perf hists: Fix formatting of long symbol names
perf evlist: Fix parsing with no sample_id_all bit set
perf tools: Add test for parsing with no sample_id_all bit
perf trace: Check control+C more often
|
|
Fengguang Wu reported:
> sparse warnings: (new ones prefixed by >>)
>
> >> arch/x86/kernel/cpu/perf_event_intel.c:901:9: sparse: constant 0x768005ffff is so big it is long
> >> arch/x86/kernel/cpu/perf_event_intel.c:902:9: sparse: constant 0x768005ffff is so big it is long
>
> vim +901 arch/x86/kernel/cpu/perf_event_intel.c
>
> 895 },
> 896 };
> 897
> 898 static struct extra_reg intel_slm_extra_regs[] __read_mostly =
> 899 {
> 900 /* must define OFFCORE_RSP_X first, see intel_fixup_er() */
> > 901 INTEL_UEVENT_EXTRA_REG(0x01b7, MSR_OFFCORE_RSP_0, 0x768005ffff, RSP_0),
> > 902 INTEL_UEVENT_EXTRA_REG(0x02b7, MSR_OFFCORE_RSP_1, 0x768005ffff, RSP_1),
> 903 EVENT_EXTRA_END
> 904 };
> 905
Extend those constants to 64 bits.
Reported-by: fengguang.wu@intel.com
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20130909112636.GQ31370@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
There was a bug in the handling of SNB-EP/IVB-EP uncore PCI
fixed counters, e.g., IMC.
It would cause erratic values to be returned for the IMC
clockticks event. This was due to a bogus hwc->config value
which was then written to PCI config space.
The erratic values can be seen via:
$ perf stat -a -C 0 -e uncore_imc_0/clockticks/ -I 1000 sleep 10
The fixed counter has most fields marked as reserved with
hw reset values of 0. Yet the kernel was defaulting to a
hwc->config = ~0 and that was causing the issues.
This patch sets the hwc->config values for fixed uncore event
to 0. Now, the values of IMC clockticks is correct.
Signed-off-by: Stephane Eranian <eranian@google.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: peterz@infradead.org
Cc: zheng.z.yan@intel.com
Link: http://lkml.kernel.org/r/20130909195350.GA17643@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
The IvyBridge event CYCLE_ACTIVITY:CYCLES_LDM_PENDING can only
be measured on counters 0-3 when HT is off. When HT is on, you
only have counters 0-3.
If you program it on the eight counters for 1s on a 3GHz
IVB laptop running a noploop, you see:
2 747 527 CYCLE_ACTIVITY:CYCLES_LDM_PENDING
2 747 527 CYCLE_ACTIVITY:CYCLES_LDM_PENDING
2 747 527 CYCLE_ACTIVITY:CYCLES_LDM_PENDING
2 747 527 CYCLE_ACTIVITY:CYCLES_LDM_PENDING
3 280 563 608 CYCLE_ACTIVITY:CYCLES_LDM_PENDING
3 280 563 608 CYCLE_ACTIVITY:CYCLES_LDM_PENDING
3 280 563 608 CYCLE_ACTIVITY:CYCLES_LDM_PENDING
3 280 563 608 CYCLE_ACTIVITY:CYCLES_LDM_PENDING
Clearly the last 4 values are bogus.
Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: peterz@infradead.org
Cc: ak@linux.intel.com
Cc: zheng.z.yan@intel.com
Cc: dhsharp@google.com
Link: http://lkml.kernel.org/r/20130911152222.GA28761@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
The previous patch doing vmstats for TLB flushes ("mm: vmstats: tlb flush
counters") effectively missed UP since arch/x86/mm/tlb.c is only compiled
for SMP.
UP systems do not do remote TLB flushes, so compile those counters out on
UP.
arch/x86/kernel/cpu/mtrr/generic.c calls __flush_tlb() directly. This is
probably an optimization since both the mtrr code and __flush_tlb() write
cr4. It would probably be safe to make that a flush_tlb_all() (and then
get these statistics), but the mtrr code is ancient and I'm hesitant to
touch it other than to just stick in the counters.
[akpm@linux-foundation.org: tweak comments]
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Pull drm fixes from Dave Airlie:
"Daniel had some fixes queued up, that were delayed, the stolen memory
ones and vga arbiter ones are quite useful, along with his usual bunch
of stuff, nothing for HSW outputs yet.
The one nouveau fix is for a regression I caused with the poweroff stuff"
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (30 commits)
drm/nouveau: fix oops on runtime suspend/resume
drm/i915: Delay disabling of VGA memory until vgacon->fbcon handoff is done
drm/i915: try not to lose backlight CBLV precision
drm/i915: Confine page flips to BCS on Valleyview
drm/i915: Skip stolen region initialisation if none is reserved
drm/i915: fix gpu hang vs. flip stall deadlocks
drm/i915: Hold an object reference whilst we shrink it
drm/i915: fix i9xx_crtc_clock_get for multiplied pixels
drm/i915: handle sdvo input pixel multiplier correctly again
drm/i915: fix hpd work vs. flush_work in the pageflip code deadlock
drm/i915: fix up the relocate_entry refactoring
drm/i915: Fix pipe config warnings when dealing with LVDS fixed mode
drm/i915: Don't call sg_free_table() if sg_alloc_table() fails
i915: Update VGA arbiter support for newer devices
vgaarb: Fix VGA decodes changes
vgaarb: Don't disable resources that are not owned
drm/i915: Pin pages whilst mapping the dma-buf
drm/i915: enable trickle feed on Haswell
x86: add early quirk for reserving Intel graphics stolen memory v5
drm/i915: split PCI IDs out into i915_drm.h v4
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 jumplabel changes from Peter Anvin:
"One more x86 tree for this merge window. This tree improves the
handling of jump labels, so that most of the time we don't have to do
a massive initial patching run.
Furthermore, we will error out of the jump label is not what is
expected, eg if it has been corrupted or tampered with"
* 'x86/jumplabel' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/jump-label: Show where and what was wrong on errors
x86/jump-label: Add safety checks to jump label conversions
x86/jump-label: Do not bother updating nops if they are correct
x86/jump-label: Use best default nops for inital jump label calls
|
|
Pull device tree core updates from Grant Likely:
"Generally minor changes. A bunch of bug fixes, particularly for
initialization and some refactoring. Most notable change if feeding
the entire flattened tree into the random pool at boot. May not be
significant, but shouldn't hurt either"
Tim Bird questions whether the boot time cost of the random feeding may
be noticeable. And "add_device_randomness()" is definitely not some
speed deamon of a function.
* tag 'devicetree-for-linus' of git://git.secretlab.ca/git/linux:
of/platform: add error reporting to of_amba_device_create()
irq/of: Fix comment typo for irq_of_parse_and_map
of: Feed entire flattened device tree into the random pool
of/fdt: Clean up casting in unflattening path
of/fdt: Remove duplicate memory clearing on FDT unflattening
gpio: implement gpio-ranges binding document fix
of: call __of_parse_phandle_with_args from of_parse_phandle
of: introduce of_parse_phandle_with_fixed_args
of: move of_parse_phandle()
of: move documentation of of_parse_phandle_with_args
of: Fix missing memory initialization on FDT unflattening
of: consolidate definition of early_init_dt_alloc_memory_arch()
of: Make of_get_phy_mode() return int i.s.o. const int
include: dt-binding: input: create a DT header defining key codes.
of/platform: Staticize of_platform_device_create_pdata()
of: Specify initrd location using 64-bit
dt: Typo fix
OF: make of_property_for_each_{u32|string}() use parameters if OF is not enabled
|
|
b3af11afe06a ("x86: get rid of pt_regs argument of iopl(2)")
dropped PTREGSCALL which was also the last user of save_rest.
Drop that now-unused function too.
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Link: http://lkml.kernel.org/r/1378546750-19727-1-git-send-email-bp@suse.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing updates from Steven Rostedt:
"Not much changes for the 3.12 merge window. The major tracing changes
are still in flux, and will have to wait for 3.13.
The changes for 3.12 are mostly clean ups and minor fixes.
H Peter Anvin added a check to x86_32 static function tracing that
helps a small segment of the kernel community.
Oleg Nesterov had a few changes from 3.11, but were mostly clean ups
and not worth pushing in the -rc time frame.
Li Zefan had small clean up with annotating a raw_init with __init.
I fixed a slight race in updating function callbacks, but the race is
so small and the bug that happens when it occurs is so minor it's not
even worth pushing to stable.
The only real enhancement is from Alexander Z Lam that made the
tracing_cpumask work for trace buffer instances, instead of them all
sharing a global cpumask"
* tag 'trace-3.12' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
ftrace/rcu: Do not trace debug_lockdep_rcu_enabled()
x86-32, ftrace: Fix static ftrace when early microcode is enabled
ftrace: Fix a slight race in modifying what function callback gets traced
tracing: Make tracing_cpumask available for all instances
tracing: Kill the !CONFIG_MODULES code i |