aboutsummaryrefslogtreecommitdiff
path: root/arch/s390/kvm
AgeCommit message (Collapse)Author
2013-09-10KVM: s390: move kvm_guest_enter,exit closer to sieDominik Dingel
commit 2b29a9fdcb92bfc6b6f4c412d71505869de61a56 upstream. Any uaccess between guest_enter and guest_exit could trigger a page fault, the page fault handler would handle it as a guest fault and translate a user address as guest address. Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> [bwh: Backported to 3.2: adjust context and add the rc variable] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-03-06s390/kvm: Fix store status for ACRS/FPRSChristian Borntraeger
commit 15bc8d8457875f495c59d933b05770ba88d1eacb upstream. On store status we need to copy the current state of registers into a save area. Currently we might save stale versions: The sie state descriptor doesnt have fields for guest ACRS,FPRS, those registers are simply stored in the host registers. The host program must copy these away if needed. We do that in vcpu_put/load. If we now do a store status in KVM code between vcpu_put/load, the saved values are not up-to-date. Lets collect the ACRS/FPRS before saving them. This also fixes some strange problems with hotplug and virtio-ccw, since the low level machine check handler (on hotplug a machine check will happen) will revalidate all registers with the content of the save area. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Gleb Natapov <gleb@redhat.com> [bwh: Backported to 3.2 as done in 3.0 by Jiri Slaby] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Cc: Jiri Slaby <jslaby@suse.cz>
2013-02-06s390/time: fix sched_clock() overflowHeiko Carstens
commit ed4f20943cd4c7b55105c04daedf8d63ab6d499c upstream. Converting a 64 Bit TOD format value to nanoseconds means that the value must be divided by 4.096. In order to achieve that we multiply with 125 and divide by 512. When used within sched_clock() this triggers an overflow after appr. 417 days. Resulting in a sched_clock() return value that is much smaller than previously and therefore may cause all sort of weird things in subsystems that rely on a monotonic sched_clock() behaviour. To fix this implement a tod_to_ns() helper function which converts TOD values without overflow and call this function from both places that open coded the conversion: sched_clock() and kvm_s390_handle_wait(). Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-01-03s390/kvm: dont announce RRBM supportChristian Borntraeger
commit 87cac8f879a5ecd7109dbe688087e8810b3364eb upstream. Newer kernels (linux-next with the transparent huge page patches) use rrbm if the feature is announced via feature bit 66. RRBM will cause intercepts, so KVM does not handle it right now, causing an illegal instruction in the guest. The easy solution is to disable the feature bit for the guest. This fixes bugs like: Kernel BUG at 0000000000124c2a [verbose debug info unavailable] illegal operation: 0001 [#1] SMP Modules linked in: virtio_balloon virtio_net ipv6 autofs4 CPU: 0 Not tainted 3.5.4 #1 Process fmempig (pid: 659, task: 000000007b712fd0, ksp: 000000007bed3670) Krnl PSW : 0704d00180000000 0000000000124c2a (pmdp_clear_flush_young+0x5e/0x80) R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:1 PM:0 EA:3 00000000003cc000 0000000000000004 0000000000000000 0000000079800000 0000000000040000 0000000000000000 000000007bed3918 000000007cf40000 0000000000000001 000003fff7f00000 000003d281a94000 000000007bed383c 000000007bed3918 00000000005ecbf8 00000000002314a6 000000007bed36e0 Krnl Code:>0000000000124c2a: b9810025 ogr %r2,%r5 0000000000124c2e: 41343000 la %r3,0(%r4,%r3) 0000000000124c32: a716fffa brct %r1,124c26 0000000000124c36: b9010022 lngr %r2,%r2 0000000000124c3a: e3d0f0800004 lg %r13,128(%r15) 0000000000124c40: eb22003f000c srlg %r2,%r2,63 [ 2150.713198] Call Trace: [ 2150.713223] ([<00000000002312c4>] page_referenced_one+0x6c/0x27c) [ 2150.713749] [<0000000000233812>] page_referenced+0x32a/0x410 [...] CC: Alex Graf <agraf@suse.de> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2012-05-31KVM: s390: Sanitize fpc registers for KVM_SET_FPUChristian Borntraeger
(cherry picked from commit 851755871c1f3184f4124c466e85881f17fa3226) commit 7eef87dc99e419b1cc051e4417c37e4744d7b661 (KVM: s390: fix register setting) added a load of the floating point control register to the KVM_SET_FPU path. Lets make sure that the fpc is valid. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2012-05-31KVM: s390: do store status after handling STOP_ON_STOP bitJens Freimann
(cherry picked from commit 9e0d5473e2f0ba2d2fe9dab9408edef3060b710e) In handle_stop() handle the stop bit before doing the store status as described for "Stop and Store Status" in the Principles of Operation. We have to give up the local_int.lock before calling kvm store status since it calls gmap_fault() which might sleep. Since local_int.lock only protects local_int.* and not guest memory we can give up the lock. Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2011-11-17KVM: s390: announce SYNC_MMUChristian Borntraeger
KVM on s390 always had a sync mmu. Any mapping change in userspace mapping was always reflected immediately in the guest mapping. - In older code the guest mapping was just an offset - In newer code the last level page table is shared Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Carsten Otte <cotte@de.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2011-11-17KVM: s390: Fix tprot lockingChristian Borntraeger
There is a potential host deadlock in the tprot intercept handling. We must not hold the mmap semaphore while resolving the guest address. If userspace is remapping, then the memory detection in the guest is broken anyway so we can safely separate the address translation from walking the vmas. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Carsten Otte <cotte@de.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2011-11-17KVM: s390: handle SIGP sense running interceptsCornelia Huck
SIGP sense running may cause an intercept on higher level virtualization, so handle it by checking the CPUSTAT_RUNNING flag. Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Carsten Otte <cotte@de.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2011-11-17KVM: s390: Fix RUNNING flag misinterpretationCornelia Huck
CPUSTAT_RUNNING was implemented signifying that a vcpu is not stopped. This is not, however, what the architecture says: RUNNING should be set when the host is acting on the behalf of the guest operating system. CPUSTAT_RUNNING has been changed to be set in kvm_arch_vcpu_load() and to be unset in kvm_arch_vcpu_put(). For signifying stopped state of a vcpu, a host-controlled bit has been used and is set/unset basically on the reverse as the old CPUSTAT_RUNNING bit (including pushing it down into stop handling proper in handle_stop()). Cc: stable@kernel.org Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Carsten Otte <cotte@de.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2011-10-31Merge branch 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6Linus Torvalds
* 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6: (54 commits) [S390] Remove error checking from copy_oldmem_page() [S390] qdio: prevent dsci access without adapter interrupts [S390] irqstats: split IPI interrupt accounting [S390] add missing __tlb_flush_global() for !CONFIG_SMP [S390] sparse: fix sparse symbol shadow warning [S390] sparse: fix sparse NULL pointer warnings [S390] sparse: fix sparse warnings with __user pointers [S390] sparse: fix sparse warnings in math-emu [S390] sparse: fix sparse warnings about missing prototypes [S390] sparse: fix sparse ANSI-C warnings [S390] sparse: fix sparse static warnings [S390] sparse: fix access past end of array warnings [S390] dasd: prevent path verification before resume [S390] qdio: remove multicast polling [S390] qdio: reset outbound SBAL error states [S390] qdio: EQBS retry after CCQ 96 [S390] qdio: add timestamp for last queue scan time [S390] Introduce get_clock_fast() [S390] kvm: Handle diagnose 0x10 (release pages) [S390] take mmap_sem when walking guest page table ...
2011-10-30[S390] kvm: Handle diagnose 0x10 (release pages)Christian Borntraeger
Linux on System z uses a ballooner based on diagnose 0x10. (aka as collaborative memory management). This patch implements diagnose 0x10 on the guest address space. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-10-30KVM: s390: implement sigp external callChristian Ehrhardt
Implement sigp external call, which might be required for guests that issue an external call instead of an emergency signal for IPI. This fixes an issue with "KVM: unknown SIGP: 0x02" when booting such an SMP guest. Signed-off-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-10-30KVM: s390: fix register settingCarsten Otte
KVM common code does vcpu_load prior to calling our arch ioctls and vcpu_put after we're done here. Via the kvm_arch_vcpu_load/put callbacks we do load the fpu and access register state into the processor, which saves us moving the state on every SIE exit the kernel handles. However this breaks register setting from userspace, because of the following sequence: 1a. vcpu load stores userspace register content 1b. vcpu load loads guest register content 2. kvm_arch_vcpu_ioctl_set_fpu/sregs updates saved guest register content 3a. vcpu put stores the guest registers and overwrites the new content 3b. vcpu put loads the userspace register set again This patch loads the new guest register state into the cpu, so that the correct (new) set of guest registers will be stored in step 3a. Signed-off-by: Carsten Otte <cotte@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-10-30KVM: s390: fix return value of kvm_arch_init_vmCarsten Otte
This patch fixes the return value of kvm_arch_init_vm in case a memory allocation goes wrong. Signed-off-by: Carsten Otte <cotte@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-10-30KVM: s390: check cpu_id prior to using itCarsten Otte
We use the cpu id provided by userspace as array index here. Thus we clearly need to check it first. Ooops. CC: <stable@vger.kernel.org> Signed-off-by: Carsten Otte <cotte@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-09-20[S390] kvm: extension capability for new address space layoutChristian Borntraeger
598841ca9919d008b520114d8a4378c4ce4e40a1 ([S390] use gmap address spaces for kvm guest images) changed kvm on s390 to use a separate address space for kvm guests. We can now put KVM guests anywhere in the user address mode with a size up to 8PB - as long as the memory is 1MB-aligned. This change was done without KVM extension capability bit. The change was added after 3.0, but we still have a chance to add a feature bit before 3.1 (keeping the releases in a sane state). We use number 71 to avoid collisions with other pending kvm patches as requested by Alexander Graf. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Avi Kivity <avi@redhat.com> Cc: Alexander Graf <agraf@suse.de> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2011-09-20[S390] kvm: fix address mode switchingChristian Borntraeger
598841ca9919d008b520114d8a4378c4ce4e40a1 ([S390] use gmap address spaces for kvm guest images) changed kvm to use a separate address space for kvm guests. This address space was switched in __vcpu_run In some cases (preemption, page fault) there is the possibility that this address space switch is lost. The typical symptom was a huge amount of validity intercepts or random guest addressing exceptions. Fix this by doing the switch in sie_loop and sie_exit and saving the address space in the gmap structure itself. Also use the preempt notifier. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Avi Kivity <avi@redhat.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2011-07-24[S390] kvm: make sigp emerg smp capableChristian Ehrhardt
SIGP emerg needs to pass the source vpu adress into __LC_CPU_ADDRESS of the target guest. Signed-off-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-07-24[S390] remove kvm mmu reload on s390Carsten Otte
This patch removes the mmu reload logic for kvm on s390. Via Martin's new gmap interface, we can safely add or remove memory slots while guest CPUs are in-flight. Thus, the mmu reload logic is not needed anymore. Signed-off-by: Carsten Otte <cotte@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-07-24[S390] Use gmap translation for accessing guest memoryCarsten Otte
This patch removes kvm-s390 internal assumption of a linear mapping of guest address space to user space. Previously, guest memory was translated to user addresses using a fixed offset (gmsor). The new code uses gmap_fault to resolve guest addresses. Signed-off-by: Carsten Otte <cotte@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-07-24[S390] use gmap address spaces for kvm guest imagesCarsten Otte
This patch switches kvm from using (Qemu's) user address space to Martin's gmap address space. This way QEMU does not have to use a linker script in order to fit large guests at low addresses in its address space. Signed-off-by: Carsten Otte <cotte@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-07-24[S390] move sie code to entry.SMartin Schwidefsky
The entry to / exit from sie has subtle dependencies to the first level interrupt handler. Move the sie assembler code to entry64.S and replace the SIE_HOOK callback with a test and the new _TIF_SIE bit. In addition this patch fixes several problems in regard to the check for the_TIF_EXIT_SIE bits. The old code checked the TIF bits before executing the interrupt handler and it only modified the instruction address if it pointed directly to the sie instruction. In both cases it could miss a TIF bit that normally would cause an exit from the guest and would reenter the guest context. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-07-24[S390] kvm: handle tprot interceptsChristian Borntraeger
When running a kvm guest we can get intercepts for tprot, if the host page table is read-only or not populated. This patch implements the most common case (linux memory detection). This also allows host copy on write for guest memory on newer systems. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-07-23virtio: expose for non-virtualization users tooOhad Ben-Cohen
virtio has been so far used only in the context of virtualization, and the virtio Kconfig was sourced directly by the relevant arch Kconfigs when VIRTUALIZATION was selected. Now that we start using virtio for inter-processor communications, we need to source the virtio Kconfig outside of the virtualization scope too. Moreover, some architectures might use virtio for both virtualization and inter-processor communications, so directly sourcing virtio might yield unexpected results due to conflicting selections. The simple solution offered by this patch is to always source virtio's Kconfig in drivers/Kconfig, and remove it from the appropriate arch Kconfigs. Additionally, a virtio menu entry has been added so virtio drivers don't show up in the general drivers menu. This way anyone can use virtio, though it's arguably less accessible (and neat!) for virtualization users now. Note: some architectures (mips and sh) seem to have a VIRTUALIZATION menu merely for sourcing virtio's Kconfig, so that menu is removed too. Signed-off-by: Ohad Ben-Cohen <ohad@wizery.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2011-06-06[S390] kvm-s390: fix stfle facilities numbers >=64Christian Borntraeger
Currently KVM masks out the known good facilities only for the first double word, but passed the 2nd double word without filtering. This breaks some code on newer systems: [ 0.593966] ------------[ cut here ]------------ [ 0.594086] WARNING: at arch/s390/oprofile/hwsampler.c:696 [ 0.594213] Modules linked in: [ 0.594321] Modules linked in: [ 0.594439] CPU: 0 Not tainted 3.0.0-rc1 #46 [ 0.594564] Process swapper (pid: 1, task: 00000001effa8038, ksp: 00000001effafab8) [ 0.594735] Krnl PSW : 0704100180000000 00000000004ab89a (hwsampler_setup+0x75a/0x7b8) [ 0.594910] R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:0 CC:1 PM:0 EA:3 [ 0.595120] Krnl GPRS: ffffffff00000000 00000000ffffffea ffffffffffffffea 00000000004a98f8 [ 0.595351] 00000000004aa002 0000000000000001 000000000080e720 000000000088b9f8 [ 0.595522] 000000000080d3e8 0000000000000000 0000000000000000 000000000080e464 [ 0.595725] 0000000000000000 00000000005db198 00000000004ab3a2 00000001effafd98 [ 0.595901] Krnl Code: 00000000004ab88c: c0e5000673ca brasl %r14,57a020 [ 0.596071] 00000000004ab892: a7f4fc77 brc 15,4ab180 [ 0.596276] 00000000004ab896: a7f40001 brc 15,4ab898 [ 0.596454] >00000000004ab89a: a7c8ffa1 lhi %r12,-95 [ 0.596657] 00000000004ab89e: a7f4fc71 brc 15,4ab180 [ 0.596854] 00000000004ab8a2: a7f40001 brc 15,4ab8a4 [ 0.597029] 00000000004ab8a6: a7f4ff22 brc 15,4ab6ea [ 0.597230] 00000000004ab8aa: c0200011009a larl %r2,6cb9de [ 0.597441] Call Trace: [ 0.597511] ([<00000000004ab3a2>] hwsampler_setup+0x262/0x7b8) [ 0.597676] [<0000000000875812>] oprofile_arch_init+0x32/0xd0 [ 0.597834] [<0000000000875788>] oprofile_init+0x28/0x74 [ 0.597991] [<00000000001001be>] do_one_initcall+0x3a/0x170 [ 0.598151] [<000000000084fa22>] kernel_init+0x142/0x1ec [ 0.598314] [<000000000057db16>] kernel_thread_starter+0x6/0xc [ 0.598468] [<000000000057db10>] kernel_thread_starter+0x0/0xc [ 0.598606] Last Breaking-Event-Address: [ 0.598707] [<00000000004ab896>] hwsampler_setup+0x756/0x7b8 [ 0.598863] ---[ end trace ce3179037f4e3e5b ]--- So lets also mask the 2nd double word. Facilites 66,76,76,77 should be fine. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-06-06[S390] kvm-s390: Fix host crash on misbehaving guestsChristian Borntraeger
commit 9ff4cfb3fcfd48b49fdd9be7381b3be340853aa4 ([S390] kvm-390: Let kernel exit SIE instruction on work) fixed a problem of commit commit cd3b70f5d4d82f85d1e1d6e822f38ae098cf7c72 ([S390] virtualization aware cpu measurement) but uncovered another one. If a kvm guest accesses guest real memory that doesnt exist, the page fault handler calls the sie hook, which then rewrites the return psw from sie_inst to either sie_exit or sie_reenter. On return, the page fault handler will then detect the wrong access as a kernel fault causing a kernel oops in sie_reenter or sie_exit. We have to add these two addresses to the exception table to allow graceful exits. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-04-20[S390] kvm-390: Let kernel exit SIE instruction on workCarsten Otte
From: Christian Borntraeger <borntraeger@de.ibm.com> This patch fixes the sie exit on interrupts. The low level interrupt handler returns to the PSW address in pt_regs and not to the PSW address in the lowcore. Without this fix a cpu bound guest might never leave guest state since the host interrupt handler would blindly return to the SIE instruction, even on need_resched and friends. Cc: stable@kernel.org Signed-off-by: Carsten Otte <cotte@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-03-31Fix common misspellingsLucas De Marchi
Fixes generated by 'codespell' and manually reviewed. Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
2011-03-17s390: change to new flag variablematt mooney
Replace EXTRA_CFLAGS with ccflags-y. Signed-off-by: matt mooney <mfm@muteddisk.com> Acked-by: WANG Cong <xiyou.wangcong@gmail.com> Signed-off-by: Michal Marek <mmarek@suse.cz>
2011-01-13Merge branch 'kvm-updates/2.6.38' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
* 'kvm-updates/2.6.38' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (142 commits) KVM: Initialize fpu state in preemptible context KVM: VMX: when entering real mode align segment base to 16 bytes KVM: MMU: handle 'map_writable' in set_spte() function KVM: MMU: audit: allow audit more guests at the same time KVM: Fetch guest cr3 from hardware on demand KVM: Replace reads of vcpu->arch.cr3 by an accessor KVM: MMU: only write protect mappings at pagetable level KVM: VMX: Correct asm constraint in vmcs_load()/vmcs_clear() KVM: MMU: Initialize base_role for tdp mmus KVM: VMX: Optimize atomic EFER load KVM: VMX: Add definitions for more vm entry/exit control bits KVM: SVM: copy instruction bytes from VMCB KVM: SVM: implement enhanced INVLPG intercept KVM: SVM: enhance mov DR intercept handler KVM: SVM: enhance MOV CR intercept handler KVM: SVM: add new SVM feature bit names KVM: cleanup emulate_instruction KVM: move complete_insn_gp() into x86.c KVM: x86: fix CR8 handling KVM guest: Fix kvm clock initialization when it's configured out ...
2011-01-12KVM: Clean up vm creation and releaseJan Kiszka
IA64 support forces us to abstract the allocation of the kvm structure. But instead of mixing this up with arch-specific initialization and doing the same on destruction, split both steps. This allows to move generic destruction calls into generic code. It also fixes error clean-up on failures of kvm_create_vm for IA64. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2011-01-05[S390] cleanup s390 KconfigMartin Schwidefsky
Make use of def_bool and def_tristate where possible and add sensible defaults to the config symbols where applicable. This shortens the defconfig file by another ~40 lines. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2010-10-25[S390] cleanup facility list handlingMartin Schwidefsky
Store the facility list once at system startup with stfl/stfle and reuse the result for all facility tests. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2010-10-25[S390] kvm: Enable z196 instruction facilitiesChristian Borntraeger
Enable PFPO, floating point extension, distinct-operands, fast-BCR-serialization, high-word, interlocked-access, load/store- on-condition, and population-count facilities for guests. (bits 37, 44 and 45). Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2010-08-01KVM: Remove memory alias supportAvi Kivity
As advertised in feature-removal-schedule.txt. Equivalent support is provided by overlapping memory regions. Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01KVM: s390: Don't exit SIE on SIGP sense runningChristian Borntraeger
Newer (guest) kernels use sigp sense running in their spinlock implementation to check if the other cpu is running before yielding the processor. This revealed some wrong guest settings, causing unnecessary exits for every sigp sense running. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01KVM: s390: Fix build failure due to centralized vcpu locking patchesChristian Borntraeger
This patch fixes ERROR: "__kvm_s390_vcpu_store_status" [arch/s390/kvm/kvm.ko] undefined! triggered by commit 3268c56840dcee78c3e928336550f4e1861504c4 (kvm.git) Author: Avi Kivity <avi@redhat.com> Date: Thu May 13 12:21:46 2010 +0300 KVM: s390: Centrally lock arch specific vcpu ioctls Reported-by: Sachin Sant <sachinp@in.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01KVM: Consolidate arch specific vcpu ioctl lockingAvi Kivity
Now that all arch specific ioctls have centralized locking, it is easy to move it to the central dispatcher. Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01KVM: s390: Centrally lock arch specific vcpu ioctlsAvi Kivity
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01KVM: move vcpu locking to dispatcher for generic vcpu ioctlsAvi Kivity
All vcpu ioctls need to be locked, so instead of locking each one specifically we lock at the generic dispatcher. This patch only updates generic ioctls and leaves arch specific ioctls alone. Signed-off-by: Avi Kivity <avi@redhat.com>
2010-06-08[S390] arch/s390/kvm: Use GFP_ATOMIC when a lock is heldJulia Lawall
The containing function is called from several places. At one of them, in the function __sigp_stop, the spin lock &fi->lock is held. The semantic patch that makes this change is as follows: (http://coccinelle.lip6.fr/) // <smpl> @gfp exists@ identifier fn; position p; @@ fn(...) { ... when != spin_unlock when any GFP_KERNEL@p ... when any } @locked@ identifier gfp.fn; @@ spin_lock(...) ... when != spin_unlock fn(...) @depends on locked@ position gfp.p; @@ - GFP_KERNEL@p + GFP_ATOMIC // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2010-06-08[S390] appldata/extmem/kvm: add missing GFP_KERNEL flagHeiko Carstens
Add missing GFP flag to memory allocations. The part in cio only changes a comment. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2010-05-26[S390] spp: remove KVM_AWARE_CMF config optionHeiko Carstens
This config option enables or disables three single instructions which aren't expensive. This is too fine grained. Besided that everybody who uses kvm would enable it anyway in order to debug performance problems. Just remove it. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2010-05-21Merge branch 'kvm-updates/2.6.35' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
* 'kvm-updates/2.6.35' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (269 commits) KVM: x86: Add missing locking to arch specific vcpu ioctls KVM: PPC: Add missing vcpu_load()/vcpu_put() in vcpu ioctls KVM: MMU: Segregate shadow pages with different cr0.wp KVM: x86: Check LMA bit before set_efer KVM: Don't allow lmsw to clear cr0.pe KVM: Add cpuid.txt file KVM: x86: Tell the guest we'll warn it about tsc stability x86, paravirt: don't compute pvclock adjustments if we trust the tsc x86: KVM guest: Try using new kvm clock msrs KVM: x86: export paravirtual cpuid flags in KVM_GET_SUPPORTED_CPUID KVM: x86: add new KVMCLOCK cpuid feature KVM: x86: change msr numbers for kvmclock x86, paravirt: Add a global synchronization point for pvclock x86, paravirt: Enable pvclock flags in vcpu_time_info structure KVM: x86: Inject #GP with the right rip on efer writes KVM: SVM: Don't allow nested guest to VMMCALL into host KVM: x86: Fix exception reinjection forced to true KVM: Fix wallclock version writing race KVM: MMU: Don't read pdptrs with mmu spinlock held in mmu_alloc_roots KVM: VMX: enable VMXON check with SMX enabled (Intel TXT) ...
2010-05-19KVM: Let vcpu structure alignment be determined at runtimeAvi Kivity
vmx and svm vcpus have different contents and therefore may have different alignmment requirements. Let each specify its required alignment. Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17KVM: use the correct RCU API for PROVE_RCU=yLai Jiangshan
The RCU/SRCU API have already changed for proving RCU usage. I got the following dmesg when PROVE_RCU=y because we used incorrect API. This patch coverts rcu_deference() to srcu_dereference() or family API. =================================================== [ INFO: suspicious rcu_dereference_check() usage. ] --------------------------------------------------- arch/x86/kvm/mmu.c:3020 invoked rcu_dereference_check() without protection! other info that might help us debug this: rcu_scheduler_active = 1, debug_locks = 0 2 locks held by qemu-system-x86/8550: #0: (&kvm->slots_lock){+.+.+.}, at: [<ffffffffa011a6ac>] kvm_set_memory_region+0x29/0x50 [kvm] #1: (&(&kvm->mmu_lock)->rlock){+.+...}, at: [<ffffffffa012262d>] kvm_arch_commit_memory_region+0xa6/0xe2 [kvm] stack backtrace: Pid: 8550, comm: qemu-system-x86 Not tainted 2.6.34-rc4-tip-01028-g939eab1 #27 Call Trace: [<ffffffff8106c59e>] lockdep_rcu_dereference+0xaa/0xb3 [<ffffffffa012f6c1>] kvm_mmu_calculate_mmu_pages+0x44/0x7d [kvm] [<ffffffffa012263e>] kvm_arch_commit_memory_region+0xb7/0xe2 [kvm] [<ffffffffa011a5d7>] __kvm_set_memory_region+0x636/0x6e2 [kvm] [<ffffffffa011a6ba>] kvm_set_memory_region+0x37/0x50 [kvm] [<ffffffffa015e956>] vmx_set_tss_addr+0x46/0x5a [kvm_intel] [<ffffffffa0126592>] kvm_arch_vm_ioctl+0x17a/0xcf8 [kvm] [<ffffffff810a8692>] ? unlock_page+0x27/0x2c [<ffffffff810bf879>] ? __do_fault+0x3a9/0x3e1 [<ffffffffa011b12f>] kvm_vm_ioctl+0x364/0x38d [kvm] [<ffffffff81060cfa>] ? up_read+0x23/0x3d [<ffffffff810f3587>] vfs_ioctl+0x32/0xa6 [<ffffffff810f3b19>] do_vfs_ioctl+0x495/0x4db [<ffffffff810e6b2f>] ? fget_light+0xc2/0x241 [<ffffffff810e416c>] ? do_sys_open+0x104/0x116 [<ffffffff81382d6d>] ? retint_swapgs+0xe/0x13 [<ffffffff810f3ba6>] sys_ioctl+0x47/0x6a [<ffffffff810021db>] system_call_fastpath+0x16/0x1b Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17KVM: s390: Fix possible memory leak of in kvm_arch_vcpu_create()Wei Yongjun
This patch fixed possible memory leak in kvm_arch_vcpu_create() under s390, which would happen when kvm_arch_vcpu_create() fails. Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Acked-by: Carsten Otte <cotte@de.ibm.com> Cc: stable@kernel.org Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17[S390] virtualization aware cpu measurementCarsten Otte
Use the SPP instruction to set a tag on entry to / exit of the virtual machine context. This allows the cpu measurement facility to distinguish the samples from the host and the different guests. Signed-off-by: Carsten Otte <cotte@de.ibm.com>
2010-03-30include cleanup: Update gfp.h and slab.h includes to prepare for breaking ↵Tejun Heo
implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). * x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>