Merge tag 'kvm-3.7-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm (ecefbd94)

Documentation/virtual/kvm/api.txt

+25 −8

Original line number	Diff line number	Diff line
		@@ -857,7 +857,8 @@ struct kvm_userspace_memory_region {
		};

		/* for kvm_memory_region::flags */
		#define KVM_MEM_LOG_DIRTY_PAGES 1UL
		#define KVM_MEM_LOG_DIRTY_PAGES (1UL << 0)
		#define KVM_MEM_READONLY (1UL << 1)

		This ioctl allows the user to create or modify a guest physical memory
		slot. When changing an existing slot, it may be moved in the guest
		@@ -873,14 +874,17 @@ It is recommended that the lower 21 bits of guest_phys_addr and userspace_addr
		be identical. This allows large pages in the guest to be backed by large
		pages in the host.

		The flags field supports just one flag, KVM_MEM_LOG_DIRTY_PAGES, which
		instructs kvm to keep track of writes to memory within the slot. See
		the KVM_GET_DIRTY_LOG ioctl.
		The flags field supports two flags. KVM_MEM_LOG_DIRTY_PAGES instructs
		kvm to keep track of writes to memory within the slot; see the
		KVM_GET_DIRTY_LOG ioctl. KVM_MEM_READONLY, whose availability is indicated
		by the KVM_CAP_READONLY_MEM capability, makes KVM allow only read accesses
		to the memory region. Writes will be posted to userspace as KVM_EXIT_MMIO
		exits.

		When the KVM_CAP_SYNC_MMU capability, changes in the backing of the memory
		region are automatically reflected into the guest. For example, an mmap()
		that affects the region will be made visible immediately. Another example
		is madvise(MADV_DROP).
		When the KVM_CAP_SYNC_MMU capability is available, changes in the backing of
		the memory region are automatically reflected into the guest. For example, an
		mmap() that affects the region will be made visible immediately. Another
		example is madvise(MADV_DROP).
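Reviewer note: the flag and alignment rules in this hunk can be sketched in user space roughly as follows. This is only an illustration under stated assumptions: the EX_ constants mirror the #defines shown in the diff, the helper names are hypothetical, and no ioctl is issued.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors of the flag values shown in the diff (assumed, for illustration). */
#define EX_KVM_MEM_LOG_DIRTY_PAGES (1UL << 0)
#define EX_KVM_MEM_READONLY        (1UL << 1)

/* Returns 1 when the lower 21 bits of guest_phys_addr and userspace_addr
 * are identical, which is the recommendation for large-page backing. */
static int ex_large_page_friendly(uint64_t guest_phys_addr, uint64_t userspace_addr)
{
	const uint64_t mask = (1ULL << 21) - 1;

	return ((guest_phys_addr ^ userspace_addr) & mask) == 0;
}

/* Composes the flags word for a memory slot. has_readonly_cap stands for the
 * result of checking KVM_CAP_READONLY_MEM; returns -1 if a read-only slot is
 * requested without that capability. */
static long ex_region_flags(int log_dirty, int readonly, int has_readonly_cap)
{
	long flags = 0;

	if (readonly && !has_readonly_cap)
		return -1;
	if (log_dirty)
		flags |= EX_KVM_MEM_LOG_DIRTY_PAGES;
	if (readonly)
		flags |= EX_KVM_MEM_READONLY;
	return flags;
}

/* Small usage example. */
static void ex_region_selftest(void)
{
	assert(ex_region_flags(1, 1, 1) == 3);   /* both flags set */
	assert(ex_region_flags(0, 1, 0) == -1);  /* capability missing */
	assert(ex_large_page_friendly(0x40200000ULL, 0x7f0000200000ULL));
}
```

A real caller would place the resulting flags in struct kvm_userspace_memory_region and be prepared to handle KVM_EXIT_MMIO exits for guest writes to a read-only slot.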

		It is recommended to use this API instead of the KVM_SET_MEMORY_REGION ioctl.
		The KVM_SET_MEMORY_REGION does not allow fine grained control over memory
		@@ -1946,6 +1950,19 @@ the guest using the specified gsi pin. The irqfd is removed using
		the KVM_IRQFD_FLAG_DEASSIGN flag, specifying both kvm_irqfd.fd
		and kvm_irqfd.gsi.

		With KVM_CAP_IRQFD_RESAMPLE, KVM_IRQFD supports a de-assert and notify
		mechanism allowing emulation of level-triggered, irqfd-based
		interrupts. When KVM_IRQFD_FLAG_RESAMPLE is set the user must pass an
		additional eventfd in the kvm_irqfd.resamplefd field. When operating
		in resample mode, posting of an interrupt through kvm_irqfd.fd asserts
		the specified gsi in the irqchip. When the irqchip is resampled, such
		as from an EOI, the gsi is de-asserted and the user is notified via
		kvm_irqfd.resamplefd. It is the user's responsibility to re-queue
		the interrupt if the device making use of it still requires service.
		Note that closing the resamplefd is not sufficient to disable the
		irqfd. The KVM_IRQFD_FLAG_RESAMPLE is only necessary on assignment
		and need not be specified with KVM_IRQFD_FLAG_DEASSIGN.
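Reviewer note: the user-space side of the resample protocol described above might look roughly like the sketch below. It uses two plain eventfds and does not touch KVM at all; the write to resamplefd in the demo stands in for the notification KVM would send after an EOI. All function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <unistd.h>
#include <sys/eventfd.h>

/* Called when resamplefd becomes readable. Drains the notification and
 * re-queues the interrupt through irqfd if the emulated device still needs
 * service. Returns 1 if re-queued, 0 if not, -1 on I/O error. */
static int ex_handle_resample(int irqfd, int resamplefd, int device_needs_service)
{
	uint64_t count;

	if (read(resamplefd, &count, sizeof(count)) != sizeof(count))
		return -1;
	if (!device_needs_service)
		return 0;
	count = 1;
	if (write(irqfd, &count, sizeof(count)) != sizeof(count))
		return -1;
	return 1;
}

/* Stand-alone demo: simulates one resample notification. */
static int ex_resample_demo(int device_needs_service)
{
	int irqfd = eventfd(0, EFD_NONBLOCK);
	int resamplefd = eventfd(0, EFD_NONBLOCK);
	uint64_t one = 1;
	int ret = -1;

	if (irqfd >= 0 && resamplefd >= 0 &&
	    write(resamplefd, &one, sizeof(one)) == sizeof(one))
		ret = ex_handle_resample(irqfd, resamplefd, device_needs_service);

	if (irqfd >= 0)
		close(irqfd);
	if (resamplefd >= 0)
		close(resamplefd);
	return ret;
}

/* Small usage example. */
static void ex_resample_selftest(void)
{
	assert(ex_resample_demo(1) == 1);  /* still pending: re-queued */
	assert(ex_resample_demo(0) == 0);  /* serviced: nothing to do */
}
```

In a real VMM both descriptors would first be registered with the KVM_IRQFD ioctl using KVM_IRQFD_FLAG_RESAMPLE, and the handler would run from the event loop that polls resamplefd.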

		4.76 KVM_PPC_ALLOCATE_HTAB

		Capability: KVM_CAP_PPC_ALLOC_HTAB

Documentation/virtual/kvm/hypercalls.txt

0 → 100644

+66 −0

		Linux KVM Hypercall:
		===================
		X86:
		KVM Hypercalls have a three-byte sequence of either the vmcall or the vmmcall
		instruction. The hypervisor can replace it with instructions that are
		guaranteed to be supported.

		Up to four arguments may be passed in rbx, rcx, rdx, and rsi respectively.
		The hypercall number should be placed in rax and the return value will be
		placed in rax. No other registers will be clobbered unless explicitly stated
		by the particular hypercall.

		S390:
		R2-R7 are used for parameters 1-6. In addition, R1 is used for hypercall
		number. The return value is written to R2.

		S390 uses the diagnose instruction (0x500) as the hypercall, along with the
		hypercall number in R1.

		PowerPC:
		It uses R3-R10 for arguments and the hypercall number in R11. R4-R11 are used as
		output registers. The return value is placed in R3.

		KVM hypercalls use a 4-byte opcode that is patched with the 'hypercall-instructions'
		property inside the device tree's /hypervisor node.
		For more information, refer to Documentation/virtual/kvm/ppc-pv.txt

		KVM Hypercalls Documentation
		===========================
		The template for each hypercall is:
		1. Hypercall name.
		2. Architecture(s)
		3. Status (deprecated, obsolete, active)
		4. Purpose

		1. KVM_HC_VAPIC_POLL_IRQ
		------------------------
		Architecture: x86
		Status: active
		Purpose: Trigger guest exit so that the host can check for pending
		interrupts on reentry.

		2. KVM_HC_MMU_OP
		------------------------
		Architecture: x86
		Status: deprecated.
		Purpose: Support MMU operations such as writing to PTE,
		flushing TLB, and releasing PT.

		3. KVM_HC_FEATURES
		------------------------
		Architecture: PPC
		Status: active
		Purpose: Expose hypercall availability to the guest. On x86 platforms, cpuid is
		used to enumerate which hypercalls are available. On PPC, either a device tree
		based lookup (which is also what ePAPR dictates) or the KVM-specific enumeration
		mechanism (which is this hypercall) can be used.

		4. KVM_HC_PPC_MAP_MAGIC_PAGE
		------------------------
		Architecture: PPC
		Status: active
		Purpose: To enable communication between the hypervisor and guest there is a
		shared page that contains parts of supervisor visible register state.
		The guest can map this shared page to access its supervisor register through
		memory using this hypercall.

Documentation/virtual/kvm/msr.txt

+20 −12

		@@ -34,9 +34,12 @@ MSR_KVM_WALL_CLOCK_NEW: 0x4b564d00
		time information and check that they are both equal and even.
		An odd version indicates an in-progress update.

		sec: number of seconds for wallclock.
		sec: number of seconds for wallclock at time of boot.

		nsec: number of nanoseconds for wallclock.
		nsec: number of nanoseconds for wallclock at time of boot.

		In order to get the current wallclock time, the system_time from
		MSR_KVM_SYSTEM_TIME_NEW needs to be added.
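Reviewer note: since sec and nsec now describe the wall clock at boot, obtaining the current time means adding system_time and normalizing the nanosecond field. A minimal sketch, with hypothetical struct and function names and system_time assumed to be in nanoseconds as stated in the document:

```c
#include <assert.h>
#include <stdint.h>

/* Boot-time wall clock as published through MSR_KVM_WALL_CLOCK_NEW
 * (only the two fields discussed here). */
struct ex_wallclock {
	uint32_t sec;
	uint32_t nsec;
};

struct ex_timespec {
	uint64_t sec;
	uint32_t nsec;
};

/* Adds system_time (ns since boot) to the boot-time wall clock. */
static struct ex_timespec ex_current_wallclock(struct ex_wallclock boot,
					       uint64_t system_time_ns)
{
	const uint64_t ns_per_sec = 1000000000ULL;
	uint64_t total_nsec = (uint64_t)boot.nsec + system_time_ns % ns_per_sec;
	struct ex_timespec now;

	now.sec = (uint64_t)boot.sec + system_time_ns / ns_per_sec +
		  total_nsec / ns_per_sec;
	now.nsec = (uint32_t)(total_nsec % ns_per_sec);
	return now;
}

/* Small usage example: 100.9 s at boot plus 1.2 s of uptime is 102.1 s. */
static void ex_wallclock_selftest(void)
{
	struct ex_wallclock boot = { 100, 900000000u };
	struct ex_timespec now = ex_current_wallclock(boot, 1200000000ULL);

	assert(now.sec == 102);
	assert(now.nsec == 100000000u);
}
```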

		Note that although MSRs are per-CPU entities, the effect of this
		particular MSR is global.
		@@ -82,20 +85,25 @@ MSR_KVM_SYSTEM_TIME_NEW: 0x4b564d01
		time at the time this structure was last updated. Unit is
		nanoseconds.

		tsc_to_system_mul: a function of the tsc frequency. One has
		to multiply any tsc-related quantity by this value to get
		a value in nanoseconds, besides dividing by 2^tsc_shift
		tsc_to_system_mul: multiplier to be used when converting
		tsc-related quantity to nanoseconds

		tsc_shift: cycle to nanosecond divider, as a power of two, to
		allow for shift rights. One has to shift right any tsc-related
		quantity by this value to get a value in nanoseconds, besides
		multiplying by tsc_to_system_mul.
		tsc_shift: shift to be used when converting tsc-related
		quantity to nanoseconds. This shift will ensure that
		multiplication with tsc_to_system_mul does not overflow.
		A positive value denotes a left shift, a negative value
		a right shift.

		With this information, guests can derive per-CPU time by
		doing:
		The conversion from tsc to nanoseconds involves an additional
		right shift by 32 bits. With this information, guests can
		derive per-CPU time by doing:

		time = (current_tsc - tsc_timestamp)
		time = (time * tsc_to_system_mul) >> tsc_shift
		if (tsc_shift >= 0)
			time <<= tsc_shift;
		else
			time >>= -tsc_shift;
		time = (time * tsc_to_system_mul) >> 32
		time = time + system_time

		flags: bits in this field indicate extended capabilities
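Reviewer note: the revised conversion in this hunk (signed shift, multiply, then a right shift by 32, then add system_time) can be written in C roughly as below. The struct is a hypothetical subset holding only the fields named in the text. The multiplication is split into two halves, which is one way (an assumption of this sketch, not something the document prescribes) to keep the intermediate product within 64 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical subset of the per-CPU time structure described in msr.txt. */
struct ex_pvclock_time {
	uint64_t tsc_timestamp;
	uint64_t system_time;        /* nanoseconds */
	uint32_t tsc_to_system_mul;
	int8_t   tsc_shift;          /* positive: left shift, negative: right */
};

/* Derives per-CPU time in nanoseconds from a TSC reading. */
static uint64_t ex_pvclock_ns(const struct ex_pvclock_time *t, uint64_t current_tsc)
{
	uint64_t delta = current_tsc - t->tsc_timestamp;
	uint64_t hi, lo;

	if (t->tsc_shift >= 0)
		delta <<= t->tsc_shift;
	else
		delta >>= -t->tsc_shift;

	/* (delta * mul) >> 32, computed as hi * mul + ((lo * mul) >> 32). */
	hi = (delta >> 32) * t->tsc_to_system_mul;
	lo = ((delta & 0xffffffffULL) * t->tsc_to_system_mul) >> 32;

	return hi + lo + t->system_time;
}

/* Small usage example: a multiplier of 2^31 means 0.5 ns per shifted cycle. */
static void ex_pvclock_selftest(void)
{
	struct ex_pvclock_time t = { 1000, 500, 0x80000000u, 0 };

	assert(ex_pvclock_ns(&t, 3000) == 1500);
	t.tsc_shift = -1;
	assert(ex_pvclock_ns(&t, 3000) == 1000);
	t.tsc_shift = 1;
	assert(ex_pvclock_ns(&t, 3000) == 2500);
}
```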

Documentation/virtual/kvm/ppc-pv.txt

+22 −0

		@@ -174,3 +174,25 @@ following:
		That way we can inject an arbitrary amount of code as replacement for a single
		instruction. This allows us to check for pending interrupts when setting EE=1
		for example.

		Hypercall ABIs in KVM on PowerPC
		=================================
		1) KVM hypercalls (ePAPR)

		These are the ePAPR-compliant hypercall implementations (mentioned above). Even
		generic hypercalls are implemented here, like the ePAPR idle hcall. These are
		available on all targets.

		2) PAPR hypercalls

		PAPR hypercalls are needed to run server PowerPC PAPR guests (-M pseries in QEMU).
		These are the same hypercalls that pHyp, the POWER hypervisor, implements. Some of
		them are handled in the kernel, some are handled in user space. This is only
		available on book3s_64.

		3) OSI hypercalls

		Mac-on-Linux is another user of KVM on PowerPC, which has its own hypercalls (from
		long before KVM). This is supported to maintain compatibility. All these hypercalls get
		forwarded to user space. This is only useful on book3s_32, but can be used with
		book3s_64 as well.

arch/ia64/kvm/kvm-ia64.c

+17 −24

		@@ -924,6 +924,16 @@ int kvm_arch_vcpu_ioctl_set_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
			return 0;
		}

		int kvm_vm_ioctl_irq_line(struct kvm *kvm, struct kvm_irq_level *irq_event)
		{
			if (!irqchip_in_kernel(kvm))
				return -ENXIO;

			irq_event->status = kvm_set_irq(kvm, KVM_USERSPACE_IRQ_SOURCE_ID,
						irq_event->irq, irq_event->level);
			return 0;
		}

		long kvm_arch_vm_ioctl(struct file *filp,
		unsigned int ioctl, unsigned long arg)
		{
		@@ -963,29 +973,6 @@ long kvm_arch_vm_ioctl(struct file *filp,
		goto out;
		}
		break;
			case KVM_IRQ_LINE_STATUS:
			case KVM_IRQ_LINE: {
				struct kvm_irq_level irq_event;

				r = -EFAULT;
				if (copy_from_user(&irq_event, argp, sizeof irq_event))
					goto out;
				r = -ENXIO;
				if (irqchip_in_kernel(kvm)) {
					__s32 status;
					status = kvm_set_irq(kvm, KVM_USERSPACE_IRQ_SOURCE_ID,
							irq_event.irq, irq_event.level);
					if (ioctl == KVM_IRQ_LINE_STATUS) {
						r = -EFAULT;
						irq_event.status = status;
						if (copy_to_user(argp, &irq_event,
								sizeof irq_event))
							goto out;
					}
					r = 0;
				}
				break;
			}
		case KVM_GET_IRQCHIP: {
		/* 0: PIC master, 1: PIC slave, 2: IOAPIC */
		struct kvm_irqchip chip;
		@@ -1626,11 +1613,17 @@ void kvm_arch_commit_memory_region(struct kvm *kvm,
		return;
		}

		void kvm_arch_flush_shadow(struct kvm *kvm)
		void kvm_arch_flush_shadow_all(struct kvm *kvm)
		{
			kvm_flush_remote_tlbs(kvm);
		}

		void kvm_arch_flush_shadow_memslot(struct kvm *kvm,
						   struct kvm_memory_slot *slot)
		{
			kvm_arch_flush_shadow_all(kvm);
		}

		long kvm_arch_dev_ioctl(struct file *filp,
		unsigned int ioctl, unsigned long arg)
		{