path: root/arch/i386
Age | Commit message | Author
2007-04-08 | [PATCH] i386: irqbalance_disable() section fix | Andrew Morton
WARNING: arch/i386/kernel/built-in.o - Section mismatch: reference to .init.text:irqbalance_disable from .text between 'quirk_intel_irqbalance' (at offset 0x80a5) and 'i8237A_suspend' Cc: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-04-08 | [PATCH] Proper fix for highmem kmap_atomic functions for VMI for 2.6.21 | Zachary Amsden
Since lazy MMU batching mode still allows interrupts to enter, it is possible for interrupt handlers to try to use kmap_atomic, which fails when lazy mode is active, since the PTE update to highmem will be delayed. The best workaround is to issue an explicit flush in kmap_atomic_functions case; this is the only way nested PTE updates can happen in the interrupt handler. Thanks to Jeremy Fitzhardinge for noting the bug and suggestions on a fix. This patch gets reverted again when we start 2.6.22 and the bug gets fixed differently. Signed-off-by: Zachary Amsden <zach@vmware.com> Cc: Andi Kleen <ak@muc.de> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-04-02 | Merge branch 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6 | Linus Torvalds
* 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6:
  [PATCH] x86: Don't probe for DDC on VBE1.2
  [PATCH] x86-64: Increase NMI watchdog probing timeout
  [PATCH] x86-64: Let oprofile reserve MSR on all CPUs
  [PATCH] x86-64: Disable local APIC timer use on AMD systems with C1E
2007-04-02 | [PATCH] i386: fix file_read_actor() and pipe_read() for original i386 systems | Thomas Gleixner
The __copy_to_user_inatomic() calls in file_read_actor() and pipe_read() are broken on original i386 machines, where WP-works-ok == false, as __copy_to_user_inatomic() on such systems calls functions which might sleep and/or contain cond_resched() calls inside of a kmap_atomic() region. The original check for WP-works-ok was in access_ok(), but got moved during the 2.5 series to fix a race vs. swap. Return the number of bytes to copy in the case where we are in an atomic region, so the non-atomic code paths in file_read_actor() and pipe_read() are taken. This could be optimized to avoid the kmap_atomic by moving the check for WP-works-ok into fault_in_pages_writeable(), but this is more intrusive and can be done later. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-04-02 | [PATCH] Fix microcode-related suspend problem | Rafael J. Wysocki
Fix the regression resulting from the recent change of suspend code ordering that causes systems based on Intel x86 CPUs using the microcode driver to hang during the resume. The problem occurs since the microcode driver uses request_firmware() in its CPU hotplug notifier, which is called after tasks have been frozen and hangs. It can be fixed by telling the microcode driver to use the microcode stored in memory during the resume instead of trying to load it from disk. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Adrian Bunk <bunk@stusta.de> Cc: Tigran Aivazian <tigran@aivazian.fsnet.co.uk> Cc: Pavel Machek <pavel@ucw.cz> Cc: Maxim <maximlevitsky@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-04-02 | [PATCH] x86: Don't probe for DDC on VBE1.2 | Zwane Mwaikambo
VBE1.2 doesn't support function 15h (DDC) resulting in a 'hang' whilst uncompressing kernel with some video cards. Make sure we check VBE version before fiddling around with DDC. http://bugzilla.kernel.org/show_bug.cgi?id=1458 Opened: 2003-10-30 09:12 Last update: 2007-02-13 22:03 Much thanks to Tobias Hain for help in testing and investigating the bug. Tested on; i386, Chips & Technologies 65548 VESA VBE 1.2 CONFIG_VIDEO_SELECT=Y CONFIG_FIRMWARE_EDID=Y Untested on x86_64. Signed-off-by: Zwane Mwaikambo <zwane@infradead.org> Signed-off-by: Andi Kleen <ak@suse.de>
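The version gate described above might look roughly like the following C sketch. It is not the patch itself (the real check lives in the early boot video code); the function name is hypothetical, and the 2.0 threshold is an assumption, since the message only states that VBE 1.2 lacks function 15h. The VBE version word is BCD-style, so 1.2 is 0x0102 and 2.0 is 0x0200.

```c
#include <stdint.h>

/* Assumed minimum VBE version for DDC (function 15h); illustrative only. */
#define SKETCH_VBE_MIN_DDC_VERSION 0x0200

/*
 * Return nonzero if it looks safe to issue the DDC call.
 * vbe_version is the version word reported by the VBE controller info
 * (major in the high byte, minor in the low byte).
 */
static int sketch_can_probe_ddc(uint16_t vbe_version)
{
    return vbe_version >= SKETCH_VBE_MIN_DDC_VERSION;
}
```

A caller would skip the EDID read entirely when this returns zero, rather than issuing the call and waiting on a card that never answers.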
2007-04-02 | [PATCH] x86-64: Increase NMI watchdog probing timeout | Andi Kleen
A 4 core Opteron needs longer than 10 ticks for this. Signed-off-by: Andi Kleen <ak@suse.de>
2007-04-02 | [PATCH] x86-64: Let oprofile reserve MSR on all CPUs | Andi Kleen
The MSR reservation is per CPU and oprofile would only allocate them on the CPU it was initialized on. Change this to handle all CPUs. This also fixes a warning about unprotected use of smp_processor_id() in preemptible kernels. Signed-off-by: Andi Kleen <ak@suse.de>
2007-04-02 | [PATCH] x86-64: Disable local APIC timer use on AMD systems with C1E | Andi Kleen
AMD dual core laptops with C1E do not run the APIC timer correctly when they go idle. Previously the code assumed this only happened on C2 or deeper. But not all of these systems report support for C2. Use an AMD supplied snippet to detect C1E being enabled and then disable local apic timer use. This supersedes an earlier workaround using DMI detection of specific systems. Thanks to Mark Langsdorf for the detection snippet. Signed-off-by: Andi Kleen <ak@suse.de>
2007-03-29 | [PATCH] Add suspend/resume for HPET | Maxim Levitsky
This adds support of suspend/resume on i386 for HPET, which fixes a number of timer-related failures around STR. Signed-off-by: Maxim Levitsky <maximlevitsky@gmail.com> Acked-by: Michael S. Tsirkin <mst@dev.mellanox.co.il> Acked-by: Jeff Chua <jeff.chua.linux@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-28 | [PATCH] MSI-X: fix resume crash | Eric W. Biederman
So I think the right solution is to simply make pci_enable_device just flip enable bits and move the rest of the work someplace else. However a thorough cleanup is a little extreme for this point in the release cycle, so I think a quick hack that makes the code not stomp the irq when msi irq's are enabled should be the first fix. Then we can later make the code not change the irqs at all. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-27 | [PATCH] i386: Fix bogus return value in hpet_next_event() | Thomas Gleixner
The clockevents / tick management code expects an error value, when the event is already expired. hpet_next_event() returns 1 in that case. Fix it to return the proper -ETIME error code. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
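The return-value contract can be illustrated with a small userspace C sketch. This is not the kernel's hpet_next_event(); the counter is a plain variable standing in for the hardware register, and every name here is hypothetical. The point is only that an already-expired event is reported as -ETIME rather than as a positive value.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for the free-running HPET main counter. */
static uint32_t sketch_counter;

/*
 * Arm an event `delta` ticks ahead. `elapsed` simulates counter ticks that
 * pass while the comparator is being programmed.
 * Returns 0 if the event still lies in the future, -ETIME if it was missed.
 */
static int sketch_next_event(uint32_t delta, uint32_t elapsed)
{
    uint32_t target = sketch_counter + delta;

    sketch_counter += elapsed;

    /* The signed difference keeps the comparison correct across wraparound. */
    return ((int32_t)(sketch_counter - target) > 0) ? -ETIME : 0;
}
```

The caller can then treat any negative return as "reprogram with a later expiry", which is what an error-code convention allows and a bare 1 does not.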
2007-03-26 | pci: set pci=bfsort for PowerEdge R900 | Matt Domsch
This patch automatically enables pci=bfsort for the Dell PowerEdge R900. This is necessary to ensure the onboard NICs enumerate in the proper order, similar to the other systems already on the list. Signed-off-by: Matt Domsch <Matt_Domsch@dell.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-03-24 | [PATCH] i386: Prevent early access to TSC to avoid crash on TSCless systems | Thomas Gleixner
commit f9690982b8c2f9a2c65acdc113e758ec356676a3 removed the check for cpu_khz from sched_clock(), which prevented early access to the TSC by non obvious magic. This is harmless as long as the CPU has a TSC. On TSCless systems this results in an illegal instruction trap. Replace tsc_disabled and tsc_unstable by tsc_enabled, which is only set when the tsc is available and not unstable. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
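A minimal C sketch of the single-flag guard follows, under the assumption that the fallback is a jiffies-based value. The variables are userspace stand-ins and the tick rate is illustrative; none of this is the kernel's actual sched_clock() body.

```c
#include <stdint.h>

#define SKETCH_HZ 250  /* illustrative tick rate */

/* Set only once the TSC is known to be present and not unstable. */
static int tsc_enabled;

/* Hypothetical stand-ins for the jiffies counter and a TSC-derived time. */
static uint64_t sketch_jiffies;
static uint64_t sketch_tsc_ns;

/* Return a nanosecond timestamp without ever touching the TSC too early. */
static uint64_t sketch_sched_clock(void)
{
    if (!tsc_enabled)
        return sketch_jiffies * (1000000000ULL / SKETCH_HZ);

    return sketch_tsc_ns;
}
```

Using one positive flag instead of two negative ones means the TSC path is unreachable by default, so a TSCless CPU never executes the read instruction.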
2007-03-23 | [PATCH] i386: add command line option "local_apic_timer_c2_ok" | Thomas Gleixner
It turned out that it is almost impossible to trust ACPI, BIOS & Co. regarding the C states. This was the reason to switch the local apic timer off in C2 state already. OTOH there are sane and well behaving systems, which get punished by that decision. Allow the user to confirm that the local apic timer is trustworthy in C2 state. This keeps the default behaviour on the safe side. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
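As a rough illustration of an opt-in boot flag, the userspace C sketch below scans a command line string for an exact option token. The kernel registers such options through its own setup machinery, so the helper name and the tokenizing approach here are assumptions made for the example.

```c
#include <string.h>

/*
 * Return 1 if `opt` appears in `cmdline` as a whole space-separated token,
 * 0 otherwise. Prefix or suffix matches do not count.
 */
static int sketch_has_option(const char *cmdline, const char *opt)
{
    size_t len = strlen(opt);
    const char *p = cmdline;

    while (*p) {
        const char *end;

        while (*p == ' ')
            p++;
        end = p;
        while (*end && *end != ' ')
            end++;
        if ((size_t)(end - p) == len && strncmp(p, opt, len) == 0)
            return 1;
        p = end;
    }
    return 0;
}
```

With the flag absent the function returns 0, which matches the safe default described above: the local APIC timer is trusted in C2 only when the user explicitly says so.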
2007-03-22 | [PATCH] setup_boot_APIC_clock() irq-enable fix | Ingo Molnar
latest -git triggers an irqtrace/lockdep warning of a leaked irqs-off condition: BUG: at kernel/fork.c:1033 copy_process() after some debugging it turns out that commit ca1b940c accidentally left interrupts disabled - which trickled down all the way to the first time we fork a kernel thread and triggered the warning. the fix is to re-enable interrupts in the 'else' branch of setup_boot_APIC_clock()'s pmtimers calibration path. Reported-by: Michal Piotrowski <michal.k.k.piotrowski@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Acked-by: Thomas Gleixner <tglx@brown.paperbag.linutronix.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-22 | [PATCH] i386: disable local apic timer via command line or dmi quirk | Thomas Gleixner
The local APIC timer stops working in deeper C-States. This is handled by the ACPI code and a broadcast mechanism in the clockevents / tick management code. Some systems do not expose the deeper C-States to the kernel, but switch into deeper C-States behind the kernel's back. This delays the local apic timer interrupts forever and makes the systems unusable. Add a command line option to disable the local apic timer and a dmi quirk for known broken systems. Andi sayeth: While not wrong by itself i think it is still better to use some heuristic -- like "has battery in ACPI" With the DMI table if the problem is more wide spread we will just continue extending it. But anyways should be ok now for .21 although I'm not really happy with it. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: john stultz <johnstul@us.ibm.com> Grudgingly-acked-by: Andi Kleen <ak@suse.de> Cc: Len Brown <lenb@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-22 | [PATCH] i386: clockevents fix breakage on Geode/Cyrix PIT implementations | Thomas Gleixner
The PIT has no dedicated mode for shut down. The only way to disable PIT is to put it into one shot mode. AMD implementations of PIT on Geode (also observed on Cyrix) are confused by an "empty" transition from CLOCK_EVT_MODE_UNUSED to CLOCK_EVT_MODE_SHUTDOWN, which puts the PIT into one shot mode momentarily. I realized after staring helpless at the bug report http://bugzilla.kernel.org/show_bug.cgi?id=8027 for quite a while, that the only change, which might influence the bogomips calibration, is the above transition during the PIT initialization. Avoiding the unnecessary switch to oneshot and later to periodic mode fixes the weird bogomips value and also the resulting slowness. The fix is confirmed on OLPC and another Geode based box. Note: this is unrelated to the Dual Core problem discussed here: http://lkml.org/lkml/2007/3/17/48 Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-18 | [PATCH] i386: trust the PM-Timer calibration of the local APIC timer | Thomas Gleixner
When PM-Timer is available for local APIC timer calibration we can skip the verification of the calibrated time value. The resulting error is quite small on a bunch of evaluated platforms and is less harmful than the observed false positives. We need to keep the verification on systems which have no PM-Timer to avoid bogus local APIC timer calibrations in the range of factor 2-10, which can be observed when switching off the PM-timer support in the kernel configuration. The wrong calibration values are probably caused by SMM code trying to emulate a PS/2 keyboard from a (maybe connected or not) USB keyboard. This prohibits the accurate delivery of PIT interrupts, which are used to calibrate the local APIC timer. Unfortunately we have no way to disable this BIOS misfeature in the early boot process. Add also the dropped cpu_relax() back to the wait loops. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Ingo Molnar <mingo@elte.hu> Acked-by: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-16 | [PATCH] x86: Export _proxy_pda for gcc 4.2 | Andi Kleen
The symbol is not actually used, but the compiler unfortunately generates an (unused) reference to it. This can happen even in modules. So export it. Signed-off-by: Andi Kleen <ak@suse.de>
2007-03-16 | [PATCH] i386: Don't use the TSC in sched_clock if unstable | Guillaume Chazarain
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=f9690982b8c2f9a2c65acdc113e758ec356676a3 caused a regression by letting sched_clock use the TSC even when cpufreq disabled it. This caused scheduling weirdnesses. Signed-off-by: Guillaume Chazarain <guichaz@yahoo.fr> Signed-off-by: Andi Kleen <ak@suse.de>
2007-03-16 | [PATCH] i386: Enforce GPLness of VMI ROM | Andi Kleen
VMI ROMs are pretty intimate to the kernel, so enforce their GPLness. No \0 tricks checking for now. This rules out BSD/MIT modules for now, sorry -- the trouble is those could come without source. Acked-by: Zachary Amsden <zach@vmware.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andi Kleen <ak@suse.de>
2007-03-16 | [PATCH] i386: Update defconfig | Andi Kleen
Signed-off-by: Andi Kleen <ak@suse.de>
2007-03-14 | Disable NMI watchdog by default properly | Linus Torvalds
This reverts commit 6ebf622b2577c50b1f496bd6a5e8739e55ae7b1c and replaces it with one that actually works. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-12 | [PATCH] Fix VMI and COMPAT_VDSO for 2.6.21 | Zachary Amsden
VMI is broken under COMPAT_VDSO, as Xen and other non hardware assisted hypervisors will be. I have been working on a fix for this which works for older glibcs that panic when the new relocatable VDSO is used. However, I believe at this time that the fix is going to be too radical to consider at this stage in the release of 2.6.21. We don't expect this config option to be turned on by vendors for new distributions, so at this point we are willing to drop support for it when VMI is compiled in, and work on a patch for 2.6.22 which more fully addresses the problem. Signed-off-by: Zachary Amsden <zach@vmware.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-09 | Pull bugzilla-5966 into release branch | Len Brown
2007-03-08 | [PATCH] build fix for i386 earlyquirk.c | Dave Jones
missing close bracket. Signed-off-by: Dave Jones <davej@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-08 | [PATCH] ACPI: repair nvidia early quirk breakage on x86_64 | Len Brown
x86_64 nvidia_bugs() broke when we bailed out on not finding the HPET. However, the quirk works by checking for _not_ finding the HPET... Delete the nvidia_hpet_detected flag and simply test for not finding the HPET, which is simple to do now that acpi_table_parse returns 1 on failure. Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-08 | ACPI: fix Thinkpad 600/600E/600X interrupts | Len Brown
The root cause of this bug shows that this machine could not possibly run an ACPI-aware OS without a model specific workaround. http://bugzilla.kernel.org/show_bug.cgi?id=5966 Signed-off-by: Len Brown <len.brown@intel.com>
2007-03-07 | [PATCH] CPU hotplug: call check_tsc_sync_source() with irqs off | Ingo Molnar
check_tsc_sync_source() depends on being called with irqs disabled (it checks whether the TSC is coherent across two specific CPUs). This is incidentally true during bootup, but not during cpu hotplug __cpu_up(). This got found via smp_processor_id() debugging. disable irqs explicitly and remove the unconditional enabling of interrupts. Add touch_nmi_watchdog() to the cpu_online_map busy loop. this bug is present both on i386 and on x86_64. Reported-by: Michal Piotrowski <michal.k.k.piotrowski@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-06 | [PATCH] remove arch/i386/kernel/tsc.c:custom_sched_clock | Adrian Bunk
Remove the no longer used custom_sched_clock. Signed-off-by: Adrian Bunk <bunk@stusta.de> Acked-by: Zachary Amsden <zach@vmware.com> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-06 | [PATCH] Scheduled removal of SA_xxx interrupt flags fixups 3 | Thomas Gleixner
The obsolete SA_xxx interrupt flags have been used despite the scheduled removal. Fixup the remaining users. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: Zachary Amsden <zach@vmware.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-05 | [PATCH] paravirt: re-enable COMPAT_VDSO | Ingo Molnar
CONFIG_PARAVIRT broke old glibc bootup: it silently turned off the selectability of CONFIG_COMPAT_VDSO and thus rendered distro kernels unbootable on old-style VDSO glibc setups. the proper solution is to keep COMPAT_VDSO available - if a hypervisor needs any modification of that concept then we'll judge those changes in full context, once those changes are submitted. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-05 | [PATCH] paravirt: let users decide whether they want VMI | Ingo Molnar
do not use default=y for CONFIG_VMI (we do not do that for any driver or special-hardware feature): the overwhelming majority of Linux users does not need it, and interested users and distributions can enable it as-needed. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-05 | [PATCH] paravirt: clarify VMI description | Ingo Molnar
Clarify the description of the CONFIG_VMI option: describe the reality that VMI is a VMWare-only interface for now. Once that changes and another hypervisor adopts the VMI ABI we can change the text. As can be seen from the Xen paravirtualization patches submitted to lkml the Xen project has chosen its own, non-VMI interface between Xen and the para-Linux - so remove Xen from the description. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-05 | [PATCH] paravirt: remove NO_IDLE_HZ on x86 | Ingo Molnar
Remove the mistaken turning on of NO_IDLE_HZ on x86+PARAVIRT kernels. It's an obsolete, limited form of dynticks. Signed-off-by: Ingo Molnar <mingo@elte.hu> Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-05 | [PATCH] arch/i386/kernel/vmi.c must #include <asm/kmap_types.h> | Adrian Bunk
  CC arch/i386/kernel/vmi.o
  /home/bunk/linux/kernel-2.6/linux-2.6.21-rc2-mm1/arch/i386/kernel/vmi.c: In function 'vmi_map_pt_hook':
  /home/bunk/linux/kernel-2.6/linux-2.6.21-rc2-mm1/arch/i386/kernel/vmi.c:387: error: 'KM_PTE0' undeclared (first use in this function)
  /home/bunk/linux/kernel-2.6/linux-2.6.21-rc2-mm1/arch/i386/kernel/vmi.c:387: error: (Each undeclared identifier is reported only once
  /home/bunk/linux/kernel-2.6/linux-2.6.21-rc2-mm1/arch/i386/kernel/vmi.c:387: error: for each function it appears in.)
  /home/bunk/linux/kernel-2.6/linux-2.6.21-rc2-mm1/arch/i386/kernel/vmi.c:387: error: 'KM_PTE1' undeclared (first use in this function)
  make[2]: *** [arch/i386/kernel/vmi.o] Error 1
Signed-off-by: Adrian Bunk <bunk@stusta.de> Acked-by: Zachary Amsden <zach@vmware.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-05 | [PATCH] clocksource init adjustments (fix bug #7426) | john stultz
This patch resolves the issue found here: http://bugme.osdl.org/show_bug.cgi?id=7426 The basic summary is: Currently we register most of i386/x86_64 clocksources at module_init time. Then we enable clocksource selection at late_initcall time. This causes some problems for drivers that use gettimeofday for init calibration routines (specifically the es1968 driver in this case), where during module_init, the only clocksource available is the low-res jiffies clocksource. This may cause slight calibration errors, due to the small sampling time used. It should be noted that drivers that require fine grained time may not function on architectures that do not have better than jiffies resolution timekeeping (there are a few). However, this does not discount the reasonable need for such fine-grained timekeeping at init time. Thus the solution here is to register clocksources earlier (ideally when the hardware is being initialized), and then we enable clocksource selection at fs_initcall (before device_initcall). This patch should probably get some testing time in -mm, since clocksource selection is one of the most important issues for correct timekeeping, and I've only been able to test this on a few of my own boxes. Signed-off-by: John Stultz <johnstul@us.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
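Why the ordering matters can be shown with a small C sketch of "register, then select the best-rated source". The structure, the names, and the rating numbers are all illustrative assumptions, not the kernel's clocksource API: if selection happens before the better sources are registered, only the jiffies entry is available.

```c
#include <stddef.h>

/* Hypothetical, simplified clocksource record: higher rating is better. */
struct sketch_clocksource {
    const char *name;
    int rating;
};

#define SKETCH_MAX_SOURCES 8

/* The low-resolution jiffies source is always present from the start. */
static struct sketch_clocksource sketch_sources[SKETCH_MAX_SOURCES] = {
    { "jiffies", 1 },
};
static size_t sketch_source_count = 1;

/* Register a source; returns 0 on success, -1 if the table is full. */
static int sketch_register(const char *name, int rating)
{
    if (sketch_source_count >= SKETCH_MAX_SOURCES)
        return -1;
    sketch_sources[sketch_source_count].name = name;
    sketch_sources[sketch_source_count].rating = rating;
    sketch_source_count++;
    return 0;
}

/* Pick the best-rated source among those registered so far. */
static const struct sketch_clocksource *sketch_select(void)
{
    const struct sketch_clocksource *best = &sketch_sources[0];
    size_t i;

    for (i = 1; i < sketch_source_count; i++)
        if (sketch_sources[i].rating > best->rating)
            best = &sketch_sources[i];
    return best;
}
```

Calling sketch_select() before any sketch_register() call yields the jiffies entry, which mirrors the calibration problem the es1968 driver ran into at module_init time.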
2007-03-05 | [PATCH] fix "NMI appears to be stuck" | Thomas Gleixner
Testing NMI watchdog ... CPU#0: NMI appears to be stuck (54->54)! CPU#1: NMI appears to be stuck (0->0)! Keep the PIT/HPET alive when nmi_watchdog = 1 is given on the command line. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-05 | [PATCH] vmi: smp fixes | Zachary Amsden
Critical fixes for SMP. Fix a couple functions which needed to be __devinit and fix a bogus parameter to AP startup that just so happened to work because the low virtual mapping of memory was still established. Signed-off-by: Zachary Amsden <zach@vmware.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-05 | [PATCH] vmi: apic ops | Zachary Amsden
Use para_fill instead of directly setting the APIC ops to the result of the vmi_get_function call - this allows one to implement a VMI ROM without implementing APIC functions, just using the native APIC functions. While doing this, I realized that there is a lot more cleanup that should have been done. Basically, we should never assume that the ROM implements a specific set of functions, and always allow fallback to the native implementation. This is critical for future compatibility. Signed-off-by: Anthony Liguori <anthony@codemonkey.ws> Signed-off-by: Zachary Amsden <zach@vmware.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
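The fallback idea can be sketched in plain C with function pointers. This is not the para_fill macro or vmi_get_function; the lookup helper, the operation type, and the return values are hypothetical, chosen only to show that a slot keeps its native implementation when the ROM offers nothing.

```c
#include <stddef.h>

typedef int (*sketch_apic_read_fn)(int reg);

/* Illustrative native and ROM-provided implementations. */
static int sketch_native_apic_read(int reg) { return reg + 1000; }
static int sketch_rom_apic_read(int reg)    { return reg + 2000; }

/* Hypothetical ROM lookup: returns NULL when the ROM lacks the function. */
static sketch_apic_read_fn sketch_rom_lookup(int rom_implements_it)
{
    return rom_implements_it ? sketch_rom_apic_read : NULL;
}

/* Override the slot only if the ROM actually provides an implementation. */
static void sketch_fill(sketch_apic_read_fn *slot, sketch_apic_read_fn rom_fn)
{
    if (rom_fn)
        *slot = rom_fn;
}
```

Assigning the lookup result directly would store NULL in the ops table for a ROM without APIC support; the conditional fill is what keeps older or partial ROMs usable.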
2007-03-05 | [PATCH] vmi: fix nohz compile | Zachary Amsden
More goo from hrtimers integration. We do compile and run properly with NO_HZ enabled. There was a period when we didn't because of a missing export, but that was since fixed. And with the clocksource code now firmly in place, we can get rid of code that fixes up the wallclock, since this is done in the common infrastructure. This actually fixes a timer bug as well, that was caused by do_settimeofday no longer being callable with interrupts disabled due to the use of on_each_cpu(). Signed-off-by: Zachary Amsden <zach@vmware.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-05 | [PATCH] vmi: pit override | Zachary Amsden
The time_init_hook in paravirt-ops no longer functions in the correct manner after the integration of the hrtimers code. The problem is that now the call path for time initialization is:
  time_init:
      late_time_init = hpet_time_init;
  late_time_init -> hpet_time_init:
      setup_pit_timer (BAD)
      do_time_init --> (via paravirt.h)
          time_init_hook --> (via arch_hooks.h)
              time_init_hook (in SUBARCH/setup.c)
If this isn't confusing enough, the paravirt case goes through an indirect function pointer in the paravirt-ops table. The problem is, by the time the paravirt hook is called, the pit timer is already enabled. But paravirt guests have their own timer, and don't want to use the PIT. Rather than intensify the struggle for power going on here, just make it all nice and simple and just unconditionally do all timer setup in the late_time_init hook. This also has the advantage of enabling timers in the same place in all code paths, so everyone has the same bugs and we don't have outliers who break other code because they turn on timer too early or too late. So the paravirt-ops time init function is now by default hpet_time_init, which is the time init function used for native hardware. Paravirt guests have the chance to override this when they setup the paravirt-ops table, and should need no change. Signed-off-by: Zachary Amsden <zach@vmware.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-05 | [PATCH] vmi: paravirt drop udelay op | Zachary Amsden
Not respecting udelay causes problems with any virtual hardware that is passed through to real hardware. This can be noticed by any device that interacts with the real world in real time - like AP startup, which takes real time. Or keyboard LEDs, which should blink in real-time. Or floppy drives, but only when passed through to a real floppy controller on OSes which can't sufficiently buffer the floppy commands to emulate a zero latency floppy. Or IDE drives, when connecting to a physical CDROM. This was mostly a hack to get the kernel to boot faster, but it introduced a number of misvirtualization bugs, and Alan and Pavel argued pretty strongly against it. We were the only client, and now want to clean up this cruft. Signed-off-by: Zachary Amsden <zach@vmware.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-05 | [PATCH] vmi: fix highpte | Zachary Amsden
Provide a PT map hook for HIGHPTE kernels to designate where they are mapping page tables. This information is required so the physical address of PTE updates can be determined; otherwise, the mm layer would have to carry the physical address all the way to each PTE modification callsite, which is even more hideous than the macros required to provide the proper hooks. So let's not mess up arch neutral code to achieve this, but keep the horror in an #ifdef HIGHPTE in include/asm-i386/pgtable.h. I had to use macros here because some types are not yet defined in all the include paths for this header. This patch is absolutely required for HIGHPTE kernels to operate properly with VMI. Signed-off-by: Zachary Amsden <zach@vmware.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-05 | [PATCH] vmi: cpu cycles fix | Zachary Amsden
In order to share the common code in tsc.c which does CPU Khz calibration, we need to make an accurate value of CPU speed available to the tsc.c code. This value loses a lot of precision in a VM because of the timing differences with real hardware, but we need it to be as precise as possible so the guest can make accurate time calculations with the cycle counters. Signed-off-by: Zachary Amsden <zach@vmware.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-05 | [PATCH] vmi: sched clock paravirt op fix | Zachary Amsden
The custom_sched_clock hook is broken. The result from sched_clock needs to be in nanoseconds, not in CPU cycles. The TSC is insufficient for this purpose, because TSC is poorly defined in a virtual environment, and mostly represents real world time instead of scheduled process time (which can be interrupted without notice when a virtual machine is descheduled). To make the scheduler consistent, we must expose a different nature of time, that is scheduled time. So deprecate this custom_sched_clock hack and turn it into a paravirt-op, as it should have been all along. This allows the tsc.c code which converts cycles to nanoseconds to be shared by all paravirt-ops backends. It is unfortunate to add a new paravirt-op, but this is a very distinct abstraction which is clearly different for all virtual machine implementations, and it gets rid of an ugly indirect function which I ashamedly admit I hacked in to try to get this to work earlier, and then even got in the wrong units. Signed-off-by: Zachary Amsden <zach@vmware.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
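The units problem (nanoseconds versus raw cycles) comes down to a conversion like the following C sketch. It uses a fixed-point scale factor so no division is needed per call; the shift of 10 and the helper names are assumptions for illustration rather than a copy of the tsc.c code.

```c
#include <stdint.h>

#define SKETCH_SCALE_SHIFT 10  /* fixed-point precision of the scale factor */

/* Precompute (nanoseconds per cycle) << shift from the CPU speed in kHz. */
static uint64_t sketch_cyc2ns_scale(uint32_t cpu_khz)
{
    return (1000000ULL << SKETCH_SCALE_SHIFT) / cpu_khz;
}

/* Convert a cycle count to nanoseconds using the precomputed scale. */
static uint64_t sketch_cycles_to_ns(uint64_t cycles, uint32_t cpu_khz)
{
    return (cycles * sketch_cyc2ns_scale(cpu_khz)) >> SKETCH_SCALE_SHIFT;
}
```

For a 2 GHz CPU (cpu_khz = 2000000) the scale is 512, so 2000 cycles convert to 1000 ns. Returning the cycle count itself would overstate elapsed time by the clock frequency in GHz, which is the kind of error the message describes.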
2007-03-05 | [PATCH] vmi: timer fixes round two | Zachary Amsden
Critical bugfixes for the VMI-Timer code.
1) Do not setup a one shot alarm if we are keeping the periodic alarm armed. Additionally, since the periodic alarm can be run at a lower rate than HZ, let's fixup the guard to the no-idle-hz mode appropriately. This fixes the bug where the no-idle-hz mode might have a higher interrupt rate than the non-idle case.
2) The interrupt handler can no longer adjust xtime due to nested lock acquisition. Drop this. We don't need to check for wallclock time at every tick, it can be done in userspace instead.
3) Add a bypass to disable noidle operation. This is useful as a last minute workaround, or testing measure.
4) The code to skip the IO_APIC timer testing (no_timer_check) should be conditional on IO_APIC, not SMP, since UP kernels can have this configured in as well.
Signed-off-by: Dan Hecht <dhecht@vmware.com> Signed-off-by: Zachary Amsden <zach@vmware.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-01 | [PATCH] fix memory leak in dma_declare_coherent_memory() | Yoichi Yuasa
When it goes to free1_out, dev->dma_mem has not been freed. Signed-off-by: Yoichi Yuasa <yoichi_yuasa@tripeaks.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
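The class of bug, an error label that skips freeing an earlier allocation, can be shown with a userspace C sketch. The structure, the label names, and the failure-injection helper are hypothetical; only the unwind pattern is the point.

```c
#include <stdlib.h>

struct sketch_mem {
    unsigned long *bitmap;
};

/* Counts outstanding allocations so a leak is observable in a test. */
static int sketch_live_allocs;

/* Allocation wrapper with optional injected failure. */
static void *sketch_alloc(size_t n, int fail)
{
    void *p;

    if (fail)
        return NULL;
    p = malloc(n);
    if (p)
        sketch_live_allocs++;
    return p;
}

static void sketch_free(void *p)
{
    if (p) {
        free(p);
        sketch_live_allocs--;
    }
}

/* Allocate the descriptor, then its bitmap; unwind fully on failure. */
static struct sketch_mem *sketch_declare(int fail_bitmap)
{
    struct sketch_mem *mem = sketch_alloc(sizeof(*mem), 0);

    if (!mem)
        goto out;
    mem->bitmap = sketch_alloc(64, fail_bitmap);
    if (!mem->bitmap)
        goto free_mem_out;
    return mem;

free_mem_out:
    sketch_free(mem);  /* without this, the descriptor would leak */
out:
    return NULL;
}
```

Each label releases exactly what was acquired before the jump, so adding a new allocation step only requires adding one label above the existing ones.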
2007-02-28 | [PATCH] x86_64/i386 irq: Fix !CONFIG_SMP compilation | Eric W. Biederman
When removing set_native_irq I missed the fact that it was called in a couple of places that were compiled even when SMP support is disabled. And since the irq_desc[].affinity field only exists in SMP things broke. Thanks to Simon Arlott <simon@arlott.org> for spotting this. There are a couple of ways to fix this but the simplest one is to just remove the assignments. The affinity field is only used to display a value to the user, and nothing on either i386 or x86_64 reads it or depends on it being any particular value, so skipping the assignment is safe. The assignment that is being removed is just for the initial affinity value before the user explicitly sets it. The irq_desc array initializes this field to CPU_MASK_ALL so the field is initialized to a reasonable value in the SMP case without being set. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>