
Commit bbf45ba5 authored by Hollis Blanchard, committed by Avi Kivity

KVM: ppc: PowerPC 440 KVM implementation



This functionality is definitely experimental, but is capable of running
unmodified PowerPC 440 Linux kernels as guests on a PowerPC 440 host. (Only
tested with 440EP "Bamboo" guests so far, but with appropriate userspace
support other SoC/board combinations should work.)

See Documentation/powerpc/kvm_440.txt for technical details.

[stephen: build fix]

Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Avi Kivity <avi@qumranet.com>
parent 513014b7
+41 −0
Hollis Blanchard <hollisb@us.ibm.com>
15 Apr 2008

Various notes on the implementation of KVM for PowerPC 440:

To enforce isolation, host userspace, guest kernel, and guest userspace all
run at user privilege level. Only the host kernel runs in supervisor mode.
Executing privileged instructions in the guest traps into KVM (in the host
kernel), where we decode and emulate them. Through this technique, unmodified
440 Linux kernels can be run (slowly) as guests. Future performance work will
focus on reducing the overhead and frequency of these traps.
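
As a rough illustration of the decode-and-emulate step, the sketch below extracts the fields of a trapped instruction word and emulates one case, "mtspr SPRG0". The field layout, the mtspr opcodes (31/467), and the SPRG0 number (272) come from the PowerPC architecture; struct toy_vcpu and toy_emulate() are hypothetical names for this example and are not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Field extraction for a 32-bit PowerPC instruction word. */
static inline uint32_t get_op(uint32_t inst)  { return inst >> 26; }          /* primary opcode */
static inline uint32_t get_xop(uint32_t inst) { return (inst >> 1) & 0x3ff; } /* extended opcode */
static inline uint32_t get_rs(uint32_t inst)  { return (inst >> 21) & 0x1f; } /* source GPR */

/* The 10-bit SPR number is encoded with its two 5-bit halves swapped. */
static inline uint32_t get_sprn(uint32_t inst)
{
	return ((inst >> 16) & 0x1f) | ((inst >> 6) & 0x3e0);
}

#define OP_31      31   /* primary opcode used by mtspr */
#define XOP_MTSPR  467
#define SPRN_SPRG0 272

/* Hypothetical guest register state, for illustration only. */
struct toy_vcpu {
	uint32_t gpr[32];
	uint32_t sprg0;
	uint32_t pc;
};

/* Returns 1 if the trapped instruction was emulated, 0 if not recognized. */
static int toy_emulate(struct toy_vcpu *vcpu, uint32_t inst)
{
	assert(vcpu != NULL);

	if (get_op(inst) == OP_31 && get_xop(inst) == XOP_MTSPR &&
	    get_sprn(inst) == SPRN_SPRG0) {
		vcpu->sprg0 = vcpu->gpr[get_rs(inst)]; /* update the guest's view of SPRG0 */
		vcpu->pc += 4;                         /* step past the emulated instruction */
		return 1;
	}
	return 0;
}
```

Passing the word 0x7c7043a6 ("mtspr SPRG0, r3") copies gpr[3] into sprg0 and advances pc by 4; an unprivileged word such as 0x60000000 is left alone.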

The usual code flow starts with userspace invoking a "run" ioctl, which
causes KVM to switch into guest context. We use IVPR to hijack the host
interrupt vectors while running the guest, which allows us to direct all
interrupts to kvmppc_handle_interrupt(). At this point, we could either
- handle the interrupt completely (e.g. emulate "mtspr SPRG0"), or
- let the host interrupt handler run (e.g. when the decrementer fires), or
- return to host userspace (e.g. when the guest performs device MMIO)
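
The three outcomes above can be modelled as a small dispatch function. This is only a sketch: the enum values and toy_handle_interrupt() are invented for the example, and the classification of a data storage interrupt as "MMIO or not" is a simplifying assumption, not how kvmppc_handle_interrupt() is necessarily written.

```c
#include <assert.h>

/* Hypothetical interrupt classes for this example. */
enum toy_irq {
	TOY_IRQ_PROGRAM,      /* guest executed a privileged instruction */
	TOY_IRQ_DECREMENTER,  /* host timer fired while the guest was running */
	TOY_IRQ_DATA_STORAGE, /* guest load/store faulted, possibly on device MMIO */
};

/* What happens after the interrupt has been examined. */
enum toy_resume {
	TOY_RESUME_GUEST,     /* handled completely, re-enter the guest */
	TOY_RESUME_HOST_IRQ,  /* let the host interrupt handler run */
	TOY_RESUME_USERSPACE, /* return from the "run" ioctl */
};

static enum toy_resume toy_handle_interrupt(enum toy_irq irq, int is_mmio)
{
	assert(is_mmio == 0 || is_mmio == 1);

	switch (irq) {
	case TOY_IRQ_PROGRAM:
		return TOY_RESUME_GUEST;     /* e.g. emulate "mtspr SPRG0" */
	case TOY_IRQ_DECREMENTER:
		return TOY_RESUME_HOST_IRQ;  /* the host must service its own timer */
	case TOY_IRQ_DATA_STORAGE:
		/* Device accesses are emulated by userspace; other faults stay in KVM. */
		return is_mmio ? TOY_RESUME_USERSPACE : TOY_RESUME_GUEST;
	}
	return TOY_RESUME_USERSPACE;         /* unknown case: hand it to userspace */
}
```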

Address spaces: We take advantage of the fact that Linux doesn't use the AS=1
address space (in host or guest), which gives us virtual address space to use
for guest mappings. While the guest is running, the host kernel remains mapped
in AS=0, but the guest can only use AS=1 mappings.
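
A simplified model of why this separation works: on Book E cores each TLB entry carries a translation space (TS) bit, and an entry only matches an access made in the same address space. The sketch below ignores PID matching and permissions, and struct toy_as_tlbe and toy_tlb_match() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified TLB entry: only the fields needed to show the AS check. */
struct toy_as_tlbe {
	uint32_t eaddr; /* effective base address, aligned to size */
	uint32_t size;  /* bytes covered by the entry */
	unsigned ts;    /* translation space of the entry: 0 or 1 */
	int valid;
};

/* An access in address space 'as' hits only entries whose TS equals 'as'. */
static int toy_tlb_match(const struct toy_as_tlbe *e, unsigned as, uint32_t eaddr)
{
	assert(e != NULL);
	return e->valid && e->ts == as && (eaddr - e->eaddr) < e->size;
}
```

With a host linear-mapping entry at 0xC0000000 in TS=0, an access to that address in AS=0 matches while the same address accessed in AS=1 does not, so the host mapping can stay resident without being visible to the guest.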

TLB entries: The TLB entries covering the host linear mapping remain
present while running the guest. This reduces the overhead of lightweight
exits, which are handled by KVM running in the host kernel. We keep three
copies of the TLB:
 - guest TLB: contents of the TLB as the guest sees it
 - shadow TLB: the TLB that is actually in hardware while guest is running
 - host TLB: to restore TLB state when context switching guest -> host
When a TLB miss occurs because a mapping was not present in the shadow TLB,
but was present in the guest TLB, KVM handles the fault without invoking the
guest. Large guest pages are backed by multiple 4KB shadow pages through this
mechanism.
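
The shadow-miss path can be sketched as follows: look the faulting address up in the guest TLB and, if the guest does have a mapping, build a 4KB shadow entry for just the faulting page. All names here are hypothetical, and the sketch leaves out the translation from guest-physical to host-physical addresses that a real implementation needs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_SHADOW_PAGE_SIZE 4096u

struct toy_tlbe {
	uint32_t eaddr; /* effective base address, aligned to size */
	uint32_t raddr; /* real base address as the guest sees it */
	uint32_t size;  /* bytes, power of two */
	int valid;
};

/* Find the guest TLB entry covering eaddr, or NULL if the guest has none. */
static const struct toy_tlbe *toy_guest_lookup(const struct toy_tlbe *gtlb,
                                               size_t n, uint32_t eaddr)
{
	for (size_t i = 0; i < n; i++) {
		/* Unsigned subtraction wraps, so addresses below the base fail too. */
		if (gtlb[i].valid && (eaddr - gtlb[i].eaddr) < gtlb[i].size)
			return &gtlb[i];
	}
	return NULL;
}

/*
 * Returns 1 and fills *shadow if the miss can be resolved from the guest TLB
 * without involving the guest, 0 if the miss must be delivered to the guest.
 */
static int toy_shadow_fill(const struct toy_tlbe *gtlb, size_t n,
                           uint32_t fault_eaddr, struct toy_tlbe *shadow)
{
	assert(shadow != NULL);

	const struct toy_tlbe *g = toy_guest_lookup(gtlb, n, fault_eaddr);
	if (g == NULL)
		return 0;

	/* Offset of the faulting 4KB page inside the (possibly large) guest page. */
	uint32_t offset = (fault_eaddr - g->eaddr) & ~(TOY_SHADOW_PAGE_SIZE - 1);

	shadow->eaddr = g->eaddr + offset;
	shadow->raddr = g->raddr + offset;
	shadow->size = TOY_SHADOW_PAGE_SIZE;
	shadow->valid = 1;
	return 1;
}
```

For a 16MB guest page mapping 0xC0000000 to 0x0, a fault at 0xC0123456 produces a single 4KB shadow entry mapping 0xC0123000 to 0x00123000; other parts of the large page are filled in by later misses.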

IO: MMIO and DCR accesses are emulated by userspace. We use virtio for network
and block IO, so those drivers must be enabled in the guest. It's possible
that some qemu device emulation (e.g. e1000 or rtl8139) may also work with
little effort.
+1 −0
@@ -803,3 +803,4 @@ config PPC_CLOCK
 config PPC_LIB_RHEAP
 	bool

+source "arch/powerpc/kvm/Kconfig"
+3 −0
@@ -151,6 +151,9 @@ config BOOTX_TEXT

 config PPC_EARLY_DEBUG
 	bool "Early debugging (dangerous)"
+	# PPC_EARLY_DEBUG on 440 leaves AS=1 mappings above the TLB high water
+	# mark, which doesn't work with current 440 KVM.
+	depends on !KVM
 	help
 	  Say Y to enable some early debugging facilities that may be available
 	  for your processor/board combination. Those facilities are hacks
+1 −0
@@ -145,6 +145,7 @@ core-y += arch/powerpc/kernel/ \
 				   arch/powerpc/platforms/
 core-$(CONFIG_MATH_EMULATION)	+= arch/powerpc/math-emu/
 core-$(CONFIG_XMON)		+= arch/powerpc/xmon/
+core-$(CONFIG_KVM) 		+= arch/powerpc/kvm/

 drivers-$(CONFIG_OPROFILE)	+= arch/powerpc/oprofile/

+28 −0
@@ -23,6 +23,9 @@
 #include <linux/mm.h>
 #include <linux/suspend.h>
 #include <linux/hrtimer.h>
+#ifdef CONFIG_KVM
+#include <linux/kvm_host.h>
+#endif
 #ifdef CONFIG_PPC64
 #include <linux/time.h>
 #include <linux/hardirq.h>
@@ -324,5 +327,30 @@ int main(void)

 	DEFINE(PGD_TABLE_SIZE, PGD_TABLE_SIZE);

+#ifdef CONFIG_KVM
+	DEFINE(TLBE_BYTES, sizeof(struct tlbe));
+
+	DEFINE(VCPU_HOST_STACK, offsetof(struct kvm_vcpu, arch.host_stack));
+	DEFINE(VCPU_HOST_PID, offsetof(struct kvm_vcpu, arch.host_pid));
+	DEFINE(VCPU_HOST_TLB, offsetof(struct kvm_vcpu, arch.host_tlb));
+	DEFINE(VCPU_SHADOW_TLB, offsetof(struct kvm_vcpu, arch.shadow_tlb));
+	DEFINE(VCPU_GPRS, offsetof(struct kvm_vcpu, arch.gpr));
+	DEFINE(VCPU_LR, offsetof(struct kvm_vcpu, arch.lr));
+	DEFINE(VCPU_CR, offsetof(struct kvm_vcpu, arch.cr));
+	DEFINE(VCPU_XER, offsetof(struct kvm_vcpu, arch.xer));
+	DEFINE(VCPU_CTR, offsetof(struct kvm_vcpu, arch.ctr));
+	DEFINE(VCPU_PC, offsetof(struct kvm_vcpu, arch.pc));
+	DEFINE(VCPU_MSR, offsetof(struct kvm_vcpu, arch.msr));
+	DEFINE(VCPU_SPRG4, offsetof(struct kvm_vcpu, arch.sprg4));
+	DEFINE(VCPU_SPRG5, offsetof(struct kvm_vcpu, arch.sprg5));
+	DEFINE(VCPU_SPRG6, offsetof(struct kvm_vcpu, arch.sprg6));
+	DEFINE(VCPU_SPRG7, offsetof(struct kvm_vcpu, arch.sprg7));
+	DEFINE(VCPU_PID, offsetof(struct kvm_vcpu, arch.pid));
+
+	DEFINE(VCPU_LAST_INST, offsetof(struct kvm_vcpu, arch.last_inst));
+	DEFINE(VCPU_FAULT_DEAR, offsetof(struct kvm_vcpu, arch.fault_dear));
+	DEFINE(VCPU_FAULT_ESR, offsetof(struct kvm_vcpu, arch.fault_esr));
+#endif
+
 	return 0;
 }
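
The DEFINE() lines in the hunk above export structure offsets as constants so that assembly code can address vcpu fields by name. The standalone sketch below shows the same idea with offsetof(). It is an assumption-laden stand-in: struct toy_vcpu is a made-up layout, and TOY_DEFINE prints "#define" lines directly, whereas the kernel's DEFINE emits markers into the compiler's assembly output that the build then post-processes into a header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical, much-reduced stand-ins for the real vcpu structures. */
struct toy_vcpu_arch {
	uint32_t gpr[32];
	uint32_t lr;
	uint32_t cr;
	uint32_t pc;
};

struct toy_vcpu {
	uint32_t id;
	struct toy_vcpu_arch arch;
};

/* Print one constant in header form; this is not the kernel's DEFINE macro. */
#define TOY_DEFINE(out, sym, val) \
	fprintf((out), "#define %s %lu\n", #sym, (unsigned long)(val))

/* Emit the offsets that hand-written assembly would need. */
static void toy_emit_offsets(FILE *out)
{
	assert(out != NULL);

	TOY_DEFINE(out, VCPU_GPRS, offsetof(struct toy_vcpu, arch.gpr));
	TOY_DEFINE(out, VCPU_LR, offsetof(struct toy_vcpu, arch.lr));
	TOY_DEFINE(out, VCPU_CR, offsetof(struct toy_vcpu, arch.cr));
	TOY_DEFINE(out, VCPU_PC, offsetof(struct toy_vcpu, arch.pc));
}
```

Calling toy_emit_offsets(stdout) prints one "#define" per field. Generating the constants from the C definition keeps assembly code in step with the structure layout when fields are added or reordered.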