Donate to e Foundation | Murena handsets with /e/OS | Own a part of Murena! Learn more

Commit 4e241557 authored by Linus Torvalds's avatar Linus Torvalds
Browse files
Pull first batch of KVM updates from Paolo Bonzini:
 "The bulk of the changes here is for x86.  And for once it's not for
  silicon that no one owns: these are really new features for everyone.

  Details:

   - ARM:
        several features are in progress but missed the 4.2 deadline.
        So here is just a smattering of bug fixes, plus enabling the
        VFIO integration.

   - s390:
        Some fixes/refactorings/optimizations, plus support for 2GB
        pages.

   - x86:
        * host and guest support for marking kvmclock as a stable
          scheduler clock.
        * support for write combining.
        * support for system management mode, needed for secure boot in
          guests.
        * a bunch of cleanups required for the above
        * support for virtualized performance counters on AMD
        * legacy PCI device assignment is deprecated and defaults to "n"
          in Kconfig; VFIO replaces it

        On top of this there are also bug fixes and eager FPU context
        loading for FPU-heavy guests.

   - Common code:
        Support for multiple address spaces; for now it is used only for
        x86 SMM but the s390 folks also have plans"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (124 commits)
  KVM: s390: clear floating interrupt bitmap and parameters
  KVM: x86/vPMU: Enable PMU handling for AMD PERFCTRn and EVNTSELn MSRs
  KVM: x86/vPMU: Implement AMD vPMU code for KVM
  KVM: x86/vPMU: Define kvm_pmu_ops to support vPMU function dispatch
  KVM: x86/vPMU: introduce kvm_pmu_msr_idx_to_pmc
  KVM: x86/vPMU: reorder PMU functions
  KVM: x86/vPMU: whitespace and stylistic adjustments in PMU code
  KVM: x86/vPMU: use the new macros to go between PMC, PMU and VCPU
  KVM: x86/vPMU: introduce pmu.h header
  KVM: x86/vPMU: rename a few PMU functions
  KVM: MTRR: do not map huge page for non-consistent range
  KVM: MTRR: simplify kvm_mtrr_get_guest_memory_type
  KVM: MTRR: introduce mtrr_for_each_mem_type
  KVM: MTRR: introduce fixed_mtrr_addr_* functions
  KVM: MTRR: sort variable MTRRs
  KVM: MTRR: introduce var_mtrr_range
  KVM: MTRR: introduce fixed_mtrr_segment table
  KVM: MTRR: improve kvm_mtrr_get_guest_memory_type
  KVM: MTRR: do not split 64 bits MSR content
  KVM: MTRR: clean up mtrr default type
  ...
parents 08d183e3 f2ae45ed
Loading
Loading
Loading
Loading
+55 −14
Original line number Diff line number Diff line
@@ -254,6 +254,11 @@ since the last call to this ioctl. Bit 0 is the first page in the
memory slot.  Ensure the entire structure is cleared to avoid padding
issues.

If KVM_CAP_MULTI_ADDRESS_SPACE is available, bits 16-31 specifies
the address space for which you want to return the dirty bitmap.
They must be less than the value that KVM_CHECK_EXTENSION returns for
the KVM_CAP_MULTI_ADDRESS_SPACE capability.


4.9 KVM_SET_MEMORY_ALIAS

@@ -820,11 +825,21 @@ struct kvm_vcpu_events {
	} nmi;
	__u32 sipi_vector;
	__u32 flags;
	struct {
		__u8 smm;
		__u8 pending;
		__u8 smm_inside_nmi;
		__u8 latched_init;
	} smi;
};

KVM_VCPUEVENT_VALID_SHADOW may be set in the flags field to signal that
interrupt.shadow contains a valid state. Otherwise, this field is undefined.
Only two fields are defined in the flags field:

- KVM_VCPUEVENT_VALID_SHADOW may be set in the flags field to signal that
  interrupt.shadow contains a valid state.

- KVM_VCPUEVENT_VALID_SMM may be set in the flags field to signal that
  smi contains a valid state.

4.32 KVM_SET_VCPU_EVENTS

@@ -841,17 +856,20 @@ vcpu.
See KVM_GET_VCPU_EVENTS for the data structure.

Fields that may be modified asynchronously by running VCPUs can be excluded
from the update. These fields are nmi.pending and sipi_vector. Keep the
corresponding bits in the flags field cleared to suppress overwriting the
current in-kernel state. The bits are:
from the update. These fields are nmi.pending, sipi_vector, smi.smm,
smi.pending. Keep the corresponding bits in the flags field cleared to
suppress overwriting the current in-kernel state. The bits are:

KVM_VCPUEVENT_VALID_NMI_PENDING - transfer nmi.pending to the kernel
KVM_VCPUEVENT_VALID_SIPI_VECTOR - transfer sipi_vector
KVM_VCPUEVENT_VALID_SMM         - transfer the smi sub-struct.

If KVM_CAP_INTR_SHADOW is available, KVM_VCPUEVENT_VALID_SHADOW can be set in
the flags field to signal that interrupt.shadow contains a valid state and
shall be written into the VCPU.

KVM_VCPUEVENT_VALID_SMM can only be set if KVM_CAP_X86_SMM is available.


4.33 KVM_GET_DEBUGREGS

@@ -911,6 +929,13 @@ slot. When changing an existing slot, it may be moved in the guest
physical memory space, or its flags may be modified.  It may not be
resized.  Slots may not overlap in guest physical address space.

If KVM_CAP_MULTI_ADDRESS_SPACE is available, bits 16-31 of "slot"
specifies the address space which is being modified.  They must be
less than the value that KVM_CHECK_EXTENSION returns for the
KVM_CAP_MULTI_ADDRESS_SPACE capability.  Slots in separate address spaces
are unrelated; the restriction on overlapping slots only applies within
each address space.

Memory for the region is taken starting at the address denoted by the
field userspace_addr, which must point at user addressable memory for
the entire memory slot size.  Any object may back this memory, including
@@ -959,7 +984,8 @@ documentation when it pops into existence).
4.37 KVM_ENABLE_CAP

Capability: KVM_CAP_ENABLE_CAP, KVM_CAP_ENABLE_CAP_VM
Architectures: ppc, s390
Architectures: x86 (only KVM_CAP_ENABLE_CAP_VM),
	       mips (only KVM_CAP_ENABLE_CAP), ppc, s390
Type: vcpu ioctl, vm ioctl (with KVM_CAP_ENABLE_CAP_VM)
Parameters: struct kvm_enable_cap (in)
Returns: 0 on success; -1 on error
@@ -1268,7 +1294,7 @@ The flags bitmap is defined as:
   /* the host supports the ePAPR idle hcall
   #define KVM_PPC_PVINFO_FLAGS_EV_IDLE   (1<<0)

4.48 KVM_ASSIGN_PCI_DEVICE
4.48 KVM_ASSIGN_PCI_DEVICE (deprecated)

Capability: none
Architectures: x86
@@ -1318,7 +1344,7 @@ Errors:
  have their standard meanings.


4.49 KVM_DEASSIGN_PCI_DEVICE
4.49 KVM_DEASSIGN_PCI_DEVICE (deprecated)

Capability: none
Architectures: x86
@@ -1337,7 +1363,7 @@ Errors:
  Other error conditions may be defined by individual device types or
  have their standard meanings.

4.50 KVM_ASSIGN_DEV_IRQ
4.50 KVM_ASSIGN_DEV_IRQ (deprecated)

Capability: KVM_CAP_ASSIGN_DEV_IRQ
Architectures: x86
@@ -1377,7 +1403,7 @@ Errors:
  have their standard meanings.


4.51 KVM_DEASSIGN_DEV_IRQ
4.51 KVM_DEASSIGN_DEV_IRQ (deprecated)

Capability: KVM_CAP_ASSIGN_DEV_IRQ
Architectures: x86
@@ -1451,7 +1477,7 @@ struct kvm_irq_routing_s390_adapter {
};


4.53 KVM_ASSIGN_SET_MSIX_NR
4.53 KVM_ASSIGN_SET_MSIX_NR (deprecated)

Capability: none
Architectures: x86
@@ -1473,7 +1499,7 @@ struct kvm_assigned_msix_nr {
#define KVM_MAX_MSIX_PER_DEV		256


4.54 KVM_ASSIGN_SET_MSIX_ENTRY
4.54 KVM_ASSIGN_SET_MSIX_ENTRY (deprecated)

Capability: none
Architectures: x86
@@ -1629,7 +1655,7 @@ should skip processing the bitmap and just invalidate everything. It must
be set to the number of set bits in the bitmap.


4.61 KVM_ASSIGN_SET_INTX_MASK
4.61 KVM_ASSIGN_SET_INTX_MASK (deprecated)

Capability: KVM_CAP_PCI_2_3
Architectures: x86
@@ -2978,6 +3004,16 @@ len must be a multiple of sizeof(struct kvm_s390_irq). It must be > 0
and it must not exceed (max_vcpus + 32) * sizeof(struct kvm_s390_irq),
which is the maximum number of possibly pending cpu-local interrupts.

4.90 KVM_SMI

Capability: KVM_CAP_X86_SMM
Architectures: x86
Type: vcpu ioctl
Parameters: none
Returns: 0 on success, -1 on error

Queues an SMI on the thread's vcpu.

5. The kvm_run structure
------------------------

@@ -3013,7 +3049,12 @@ an interrupt can be injected now with KVM_INTERRUPT.
The value of the current interrupt flag.  Only valid if in-kernel
local APIC is not used.

	__u8 padding2[2];
	__u16 flags;

More architecture-specific flags detailing state of the VCPU that may
affect the device's behavior.  The only currently defined flag is
KVM_RUN_X86_SMM, which is valid on x86 machines and is set if the
VCPU is in system management mode.

	/* in (pre_kvm_run), out (post_kvm_run) */
	__u64 cr8;
+6 −0
Original line number Diff line number Diff line
@@ -173,6 +173,12 @@ Shadow pages contain the following information:
    Contains the value of cr4.smap && !cr0.wp for which the page is valid
    (pages for which this is true are different from other pages; see the
    treatment of cr0.wp=0 below).
  role.smm:
    Is 1 if the page is valid in system management mode.  This field
    determines which of the kvm_memslots array was used to build this
    shadow page; it is also used to go back from a struct kvm_mmu_page
    to a memslot, through the kvm_memslots_for_spte_role macro and
    __gfn_to_memslot.
  gfn:
    Either the guest page table containing the translations shadowed by this
    page, or the base page frame for linear translations.  See role.direct.
+1 −0
Original line number Diff line number Diff line
@@ -28,6 +28,7 @@ config KVM
	select KVM_GENERIC_DIRTYLOG_READ_PROTECT
	select SRCU
	select MMU_NOTIFIER
	select KVM_VFIO
	select HAVE_KVM_EVENTFD
	select HAVE_KVM_IRQFD
	depends on ARM_VIRT_EXT && ARM_LPAE && ARM_ARCH_TIMER
+1 −1
Original line number Diff line number Diff line
@@ -15,7 +15,7 @@ AFLAGS_init.o := -Wa,-march=armv7-a$(plus_virt)
AFLAGS_interrupts.o := -Wa,-march=armv7-a$(plus_virt)

KVM := ../../../virt/kvm
kvm-arm-y = $(KVM)/kvm_main.o $(KVM)/coalesced_mmio.o $(KVM)/eventfd.o
kvm-arm-y = $(KVM)/kvm_main.o $(KVM)/coalesced_mmio.o $(KVM)/eventfd.o $(KVM)/vfio.o

obj-y += kvm-arm.o init.o interrupts.o
obj-y += arm.o handle_exit.o guest.o mmu.o emulate.o reset.o
+18 −6
Original line number Diff line number Diff line
@@ -171,7 +171,6 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
	int r;
	switch (ext) {
	case KVM_CAP_IRQCHIP:
	case KVM_CAP_IRQFD:
	case KVM_CAP_IOEVENTFD:
	case KVM_CAP_DEVICE_CTRL:
	case KVM_CAP_USER_MEMORY:
@@ -532,6 +531,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
		kvm_vgic_flush_hwstate(vcpu);
		kvm_timer_flush_hwstate(vcpu);

		preempt_disable();
		local_irq_disable();

		/*
@@ -544,6 +544,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)

		if (ret <= 0 || need_new_vmid_gen(vcpu->kvm)) {
			local_irq_enable();
			preempt_enable();
			kvm_timer_sync_hwstate(vcpu);
			kvm_vgic_sync_hwstate(vcpu);
			continue;
@@ -553,14 +554,16 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
		 * Enter the guest
		 */
		trace_kvm_entry(*vcpu_pc(vcpu));
		kvm_guest_enter();
		__kvm_guest_enter();
		vcpu->mode = IN_GUEST_MODE;

		ret = kvm_call_hyp(__kvm_vcpu_run, vcpu);

		vcpu->mode = OUTSIDE_GUEST_MODE;
		kvm_guest_exit();
		trace_kvm_exit(kvm_vcpu_trap_get_class(vcpu), *vcpu_pc(vcpu));
		/*
		 * Back from guest
		 *************************************************************/

		/*
		 * We may have taken a host interrupt in HYP mode (ie
		 * while executing the guest). This interrupt is still
@@ -574,8 +577,17 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
		local_irq_enable();

		/*
		 * Back from guest
		 *************************************************************/
		 * We do local_irq_enable() before calling kvm_guest_exit() so
		 * that if a timer interrupt hits while running the guest we
		 * account that tick as being spent in the guest.  We enable
		 * preemption after calling kvm_guest_exit() so that if we get
		 * preempted we make sure ticks after that is not counted as
		 * guest time.
		 */
		kvm_guest_exit();
		trace_kvm_exit(kvm_vcpu_trap_get_class(vcpu), *vcpu_pc(vcpu));
		preempt_enable();


		kvm_timer_sync_hwstate(vcpu);
		kvm_vgic_sync_hwstate(vcpu);
Loading