Commit 3195a35b authored by Paolo Bonzini

Merge tag 'kvm-s390-next-4.13-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD

KVM: s390: fixes and features for 4.13

- initial machine check forwarding
- migration support for the CMMA page hinting information
- cleanups
- fixes
parents 40352605 d52cd207
+135 −0
@@ -3255,6 +3255,141 @@ Otherwise, if the MCE is a corrected error, KVM will just
store it in the corresponding bank (provided this bank is
not holding a previously reported uncorrected error).

4.107 KVM_S390_GET_CMMA_BITS

Capability: KVM_CAP_S390_CMMA_MIGRATION
Architectures: s390
Type: vm ioctl
Parameters: struct kvm_s390_cmma_log (in, out)
Returns: 0 on success, a negative value on error

This ioctl is used to get the values of the CMMA bits on the s390
architecture. It is meant to be used in two scenarios:
- During live migration to save the CMMA values. Live migration needs
  to be enabled via the KVM_S390_VM_MIGRATION_START attribute of the
  KVM_S390_VM_MIGRATION group.
- To non-destructively peek at the CMMA values, with the flag
  KVM_S390_CMMA_PEEK set.

The ioctl takes parameters via the kvm_s390_cmma_log struct. The desired
values are written to a buffer whose location is indicated via the "values"
member in the kvm_s390_cmma_log struct.  The values in the input struct are
also updated as needed.
Each CMMA value takes up one byte.

struct kvm_s390_cmma_log {
	__u64 start_gfn;
	__u32 count;
	__u32 flags;
	union {
		__u64 remaining;
		__u64 mask;
	};
	__u64 values;
};

start_gfn is the number of the first guest frame whose CMMA values are
to be retrieved,

count is the length of the buffer in bytes,

values points to the buffer where the result will be written to.

If count is greater than KVM_S390_SKEYS_MAX, then it is considered to be
KVM_S390_SKEYS_MAX. KVM_S390_SKEYS_MAX is re-used for consistency with
other ioctls.

The result is written in the buffer pointed to by the field values, and
the values of the input parameter are updated as follows.

Depending on the flags, different actions are performed. The only
supported flag so far is KVM_S390_CMMA_PEEK.

The default behaviour if KVM_S390_CMMA_PEEK is not set is:
start_gfn will indicate the first page frame whose CMMA bits were dirty.
It is not necessarily the same as the one passed as input, as clean pages
are skipped.

count will indicate the number of bytes actually written in the buffer.
It can (and very often will) be smaller than the input value, since the
buffer is only filled until 16 bytes of clean values are found (which
are then not copied in the buffer). Since a CMMA migration block needs
the base address and the length, for a total of 16 bytes, we will send
back some clean data if there is some dirty data afterwards, as long as
the size of the clean data does not exceed the size of the header. This
minimizes the amount of data to be saved or transferred over the network,
at the expense of more roundtrips to userspace. The next
invocation of the ioctl will skip over all the clean values, saving
potentially more than just the 16 bytes we found.
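
The stop rule described above can be illustrated with a small userspace
model. This is only a sketch under stated assumptions, not the kernel code:
the dirty[] array, struct scan_result and model_cmma_scan() are hypothetical
names, a run of 16 clean values ends the block, and trailing clean values
are assumed not to be reported.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CLEAN_GAP_MAX 16 /* size of a migration block header, per the text */

struct scan_result {
	uint64_t start_gfn; /* first dirty frame found */
	uint32_t count;     /* number of values that would be written */
};

/*
 * Simplified model (assumption): dirty[i] != 0 means that the CMMA value
 * of guest frame i is dirty. nframes is the number of guest frames.
 */
static struct scan_result model_cmma_scan(const uint8_t *dirty, uint64_t nframes,
					  uint64_t start_gfn, uint32_t bufsize)
{
	struct scan_result res = { start_gfn, 0 };
	uint64_t first, cur, last_dirty;

	assert(dirty != NULL);

	/* Skip the leading clean frames. */
	for (first = start_gfn; first < nframes && !dirty[first]; first++)
		;
	if (first >= nframes || bufsize == 0)
		return res; /* nothing dirty found: count stays 0 */

	last_dirty = first;
	for (cur = first + 1; cur < nframes && cur - first < bufsize; cur++) {
		if (dirty[cur])
			last_dirty = cur;
		else if (cur - last_dirty >= CLEAN_GAP_MAX)
			break; /* 16 clean values in a row: end the block */
	}
	res.start_gfn = first;
	res.count = (uint32_t)(last_dirty - first + 1);
	return res;
}
```

In this model, dirty frames at 3 and 5 followed by a long clean run yield
start_gfn 3 and count 3; the clean value at frame 4 is included because
dirty data follows it within the 16-byte window.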

If KVM_S390_CMMA_PEEK is set:
the existing storage attributes are read even when not in migration
mode, and no other action is performed;

the output start_gfn will be equal to the input start_gfn,

the output count will be equal to the input count, except if the end of
memory has been reached.

In both cases:
the field "remaining" will indicate the total number of dirty CMMA values
still remaining, or 0 if KVM_S390_CMMA_PEEK is set and migration mode is
not enabled.

mask is unused.

values points to the userspace buffer where the result will be stored.

This ioctl can fail with -ENOMEM if not enough memory can be allocated to
complete the task, with -ENXIO if CMMA is not enabled, with -EINVAL if
KVM_S390_CMMA_PEEK is not set but migration mode was not enabled, with
-EFAULT if the userspace address is invalid or if no page table is
present for the addresses (e.g. when using hugepages).
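
A possible userspace loop for the migration case is sketched below. It is
an illustration only: struct cmma_log is a local mirror of the struct shown
above, and cmma_get_fn, cmma_save_fn, cmma_drain() and the demo_* helpers
are hypothetical. The get callback stands in for the actual
KVM_S390_GET_CMMA_BITS ioctl call on the VM file descriptor, so the sketch
can be built and run without KVM.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of struct kvm_s390_cmma_log, to keep the sketch standalone. */
struct cmma_log {
	uint64_t start_gfn;
	uint32_t count;
	uint32_t flags;
	union {
		uint64_t remaining;
		uint64_t mask;
	};
	uint64_t values;
};

/* Hypothetical callbacks: get replaces the ioctl, save consumes one block. */
typedef int (*cmma_get_fn)(void *ctx, struct cmma_log *log);
typedef void (*cmma_save_fn)(void *ctx, uint64_t start_gfn,
			     const uint8_t *values, uint32_t count);

/* Returns 0 once no dirty values remain, or the negative error from get. */
static int cmma_drain(cmma_get_fn get, cmma_save_fn save, void *ctx,
		      uint8_t *buf, uint32_t bufsize)
{
	struct cmma_log log;
	uint64_t next_gfn = 0;
	int rc;

	assert(get && save && buf && bufsize > 0);
	do {
		memset(&log, 0, sizeof(log));
		log.start_gfn = next_gfn;
		log.count = bufsize;  /* length of the buffer in bytes */
		log.flags = 0;        /* no PEEK: requires migration mode */
		log.values = (uint64_t)(uintptr_t)buf;

		rc = get(ctx, &log);
		if (rc < 0)
			return rc;
		if (log.count == 0)
			break;        /* nothing was written */
		/* start_gfn and count now describe the block in buf. */
		save(ctx, log.start_gfn, buf, log.count);
		next_gfn = log.start_gfn + log.count;
	} while (log.remaining > 0);
	return 0;
}

/* Demo stand-ins: two dirty blocks, 3 values at gfn 8 and 2 at gfn 100. */
struct demo_vm { unsigned int calls; uint64_t last_gfn; uint64_t saved; };

static int demo_get(void *ctx, struct cmma_log *log)
{
	static const uint64_t gfn[] = { 8, 100 };
	static const uint32_t len[] = { 3, 2 };
	struct demo_vm *vm = ctx;
	unsigned int i = vm->calls++;

	if (i >= 2 || log->count < len[i])
		return -1;
	memset((void *)(uintptr_t)log->values, 0, len[i]);
	log->start_gfn = gfn[i];
	log->count = len[i];
	log->remaining = (i == 0) ? len[1] : 0;
	return 0;
}

static void demo_save(void *ctx, uint64_t start_gfn,
		      const uint8_t *values, uint32_t count)
{
	struct demo_vm *vm = ctx;

	(void)values;
	vm->last_gfn = start_gfn;
	vm->saved += count;
}
```

Restarting each request at start_gfn + count relies on the ioctl skipping
clean values by itself; a real caller would also stop when the end of guest
memory is reached.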

4.108 KVM_S390_SET_CMMA_BITS

Capability: KVM_CAP_S390_CMMA_MIGRATION
Architectures: s390
Type: vm ioctl
Parameters: struct kvm_s390_cmma_log (in)
Returns: 0 on success, a negative value on error

This ioctl is used to set the values of the CMMA bits on the s390
architecture. It is meant to be used during live migration to restore
the CMMA values, but there are no restrictions on its use.
The ioctl takes parameters via the kvm_s390_cmma_log struct.
Each CMMA value takes up one byte.

struct kvm_s390_cmma_log {
	__u64 start_gfn;
	__u32 count;
	__u32 flags;
	union {
		__u64 remaining;
		__u64 mask;
	};
	__u64 values;
};

start_gfn indicates the starting guest frame number,

count indicates how many values are to be considered in the buffer,

flags is not used and must be 0.

mask indicates which PGSTE bits are to be considered.

remaining is not used.

values points to the buffer in userspace that holds the values to be set.

This ioctl can fail with -ENOMEM if not enough memory can be allocated to
complete the task, with -ENXIO if CMMA is not enabled, with -EINVAL if
the count field is too large (e.g. more than KVM_S390_CMMA_SIZE_MAX) or
if the flags field was not 0, with -EFAULT if the userspace address is
invalid, if invalid pages are written to (e.g. after the end of memory)
or if no page table is present for the addresses (e.g. when using
hugepages).
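
As an illustration of the field rules above, a request could be prepared
as follows. This is a sketch, not kernel or QEMU code: struct cmma_log is
a local mirror of the struct shown above, cmma_set_prepare() is a
hypothetical helper, and the size limit is passed in as max_count because
the value of KVM_S390_CMMA_SIZE_MAX is not given here.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of struct kvm_s390_cmma_log, to keep the sketch standalone. */
struct cmma_log {
	uint64_t start_gfn;
	uint32_t count;
	uint32_t flags;
	union {
		uint64_t remaining;
		uint64_t mask;
	};
	uint64_t values;
};

/*
 * Fill in a SET request. The checks are done early in userspace and only
 * mirror the documented failures; the kernel still validates everything.
 */
static int cmma_set_prepare(struct cmma_log *log, uint64_t start_gfn,
			    const uint8_t *values, uint32_t count,
			    uint64_t mask, uint32_t max_count)
{
	assert(log != NULL);
	if (count > max_count)
		return -EINVAL;  /* count too large */
	if (count > 0 && values == NULL)
		return -EFAULT;  /* no buffer to read the values from */

	memset(log, 0, sizeof(*log));
	log->start_gfn = start_gfn;  /* first guest frame to update */
	log->count = count;          /* one byte per CMMA value */
	log->flags = 0;              /* unused, must be 0 */
	log->mask = mask;            /* PGSTE bits to be considered */
	log->values = (uint64_t)(uintptr_t)values;
	return 0;
}
```

The prepared struct would then be passed to the KVM_S390_SET_CMMA_BITS
ioctl on the VM file descriptor.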

5. The kvm_run structure
------------------------

+15 −0
@@ -16,6 +16,7 @@ FLIC provides support to
- register and modify adapter interrupt sources (KVM_DEV_FLIC_ADAPTER_*)
- modify AIS (adapter-interruption-suppression) mode state (KVM_DEV_FLIC_AISM)
- inject adapter interrupts on a specified adapter (KVM_DEV_FLIC_AIRQ_INJECT)
- get/set all AIS mode states (KVM_DEV_FLIC_AISM_ALL)

Groups:
  KVM_DEV_FLIC_ENQUEUE
@@ -136,6 +137,20 @@ struct kvm_s390_ais_req {
    an isc according to the adapter-interruption-suppression mode on condition
    that the AIS capability is enabled.

  KVM_DEV_FLIC_AISM_ALL
    Gets or sets the adapter-interruption-suppression mode for all ISCs. Takes
    a kvm_s390_ais_all describing:

struct kvm_s390_ais_all {
       __u8 simm; /* Single-Interruption-Mode mask */
       __u8 nimm; /* No-Interruption-Mode mask */
};

    simm contains the Single-Interruption-Mode mask for all ISCs, and nimm
    contains the No-Interruption-Mode mask for all ISCs. Each bit in simm and
    nimm corresponds to an ISC (MSB0 bit 0 to ISC 0 and so on). The combination
    of the simm bit and the nimm bit represents the AIS mode for an ISC.
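
The MSB0 bit numbering can be sketched as follows. This is an illustration
only: struct ais_all is a local mirror of kvm_s390_ais_all, and the helper
names are hypothetical. How a simm/nimm combination maps to a named AIS
mode is not covered here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local mirror of struct kvm_s390_ais_all, to keep the sketch standalone. */
struct ais_all {
	uint8_t simm; /* Single-Interruption-Mode mask */
	uint8_t nimm; /* No-Interruption-Mode mask */
};

#define AIS_MAX_ISC 7 /* 8 ISCs fit into one 8-bit mask */

/* MSB0 numbering: bit 0 is the most significant bit of the byte. */
static uint8_t ais_isc_bit(unsigned int isc)
{
	assert(isc <= AIS_MAX_ISC);
	return (uint8_t)(0x80u >> isc);
}

/* True if the Single-Interruption-Mode bit of this ISC is set. */
static bool ais_simm_set(const struct ais_all *all, unsigned int isc)
{
	return (all->simm & ais_isc_bit(isc)) != 0;
}

/* True if the No-Interruption-Mode bit of this ISC is set. */
static bool ais_nimm_set(const struct ais_all *all, unsigned int isc)
{
	return (all->nimm & ais_isc_bit(isc)) != 0;
}
```

With simm = 0x80 and nimm = 0x01, the simm bit is set for ISC 0 and the
nimm bit is set for ISC 7.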

Note: The KVM_SET_DEVICE_ATTR/KVM_GET_DEVICE_ATTR device ioctls executed on
FLIC with an unknown group or attribute give the error code EINVAL (instead of
ENXIO, as specified in the API documentation). It is not possible to conclude
+33 −0
@@ -222,3 +222,36 @@ Allows user space to disable dea key wrapping, clearing the wrapping key.

Parameters: none
Returns:    0

5. GROUP: KVM_S390_VM_MIGRATION
Architectures: s390

5.1. ATTRIBUTE: KVM_S390_VM_MIGRATION_STOP (w/o)

Allows userspace to stop migration mode, needed for PGSTE migration.
Setting this attribute when migration mode is not active has no effect.

Parameters: none
Returns:    0

5.2. ATTRIBUTE: KVM_S390_VM_MIGRATION_START (w/o)

Allows userspace to start migration mode, needed for PGSTE migration.
Setting this attribute when migration mode is already active has no
effect.

Parameters: none
Returns:    -ENOMEM if there is not enough free memory to start migration mode
	    -EINVAL if the state of the VM is invalid (e.g. no memory defined)
	    0 in case of success.

5.3. ATTRIBUTE: KVM_S390_VM_MIGRATION_STATUS (r/o)

Allows userspace to query the status of migration mode.

Parameters: address of a buffer in user space to store the data (u64) to;
	    the data itself is either 0 if migration mode is disabled or 1
	    if it is enabled
Returns:    -EFAULT if the given address is not accessible from kernel space
	    0 in case of success.
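
The semantics of the three attributes can be summarized with a toy
userspace model. It is not kernel code: struct model_vm and the model_*
functions are hypothetical and only mirror the return values documented
above (the -ENOMEM case of START is not modelled).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model of a VM, reduced to what the attributes above depend on. */
struct model_vm {
	bool has_memory;     /* whether any guest memory is defined */
	bool migration_mode; /* whether migration mode is active */
};

/* Models KVM_S390_VM_MIGRATION_START. */
static int model_migration_start(struct model_vm *vm)
{
	assert(vm != NULL);
	if (vm->migration_mode)
		return 0;        /* already active: no effect */
	if (!vm->has_memory)
		return -EINVAL;  /* invalid VM state, e.g. no memory defined */
	vm->migration_mode = true;
	return 0;
}

/* Models KVM_S390_VM_MIGRATION_STOP. */
static int model_migration_stop(struct model_vm *vm)
{
	assert(vm != NULL);
	vm->migration_mode = false; /* no effect if it was already off */
	return 0;
}

/* Models KVM_S390_VM_MIGRATION_STATUS: stores 0 or 1 in *out. */
static int model_migration_status(const struct model_vm *vm, uint64_t *out)
{
	assert(vm != NULL);
	if (out == NULL)
		return -EFAULT;  /* buffer address not accessible */
	*out = vm->migration_mode ? 1 : 0;
	return 0;
}
```

In the real interface these attributes are set or read through the
KVM_SET_DEVICE_ATTR and KVM_GET_DEVICE_ATTR ioctls on the VM file
descriptor, using the KVM_S390_VM_MIGRATION group.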
+3 −1
@@ -59,7 +59,9 @@ union ctlreg0 {
		unsigned long lap  : 1; /* Low-address-protection control */
		unsigned long	   : 4;
		unsigned long edat : 1; /* Enhanced-DAT-enablement control */
		unsigned long	   : 4;
		unsigned long	   : 2;
		unsigned long iep  : 1; /* Instruction-Execution-Protection */
		unsigned long	   : 1;
		unsigned long afp  : 1; /* AFP-register control */
		unsigned long vx   : 1; /* Vector enablement control */
		unsigned long	   : 7;
+34 −10
@@ -45,6 +45,8 @@
#define KVM_REQ_ENABLE_IBS         8
#define KVM_REQ_DISABLE_IBS        9
#define KVM_REQ_ICPT_OPEREXC       10
#define KVM_REQ_START_MIGRATION   11
#define KVM_REQ_STOP_MIGRATION    12

#define SIGP_CTRL_C		0x80
#define SIGP_CTRL_SCN_MASK	0x3f
@@ -56,7 +58,7 @@ union bsca_sigp_ctrl {
		__u8 r : 1;
		__u8 scn : 6;
	};
} __packed;
};

union esca_sigp_ctrl {
	__u16 value;
@@ -65,14 +67,14 @@ union esca_sigp_ctrl {
		__u8 reserved: 7;
		__u8 scn;
	};
} __packed;
};

struct esca_entry {
	union esca_sigp_ctrl sigp_ctrl;
	__u16   reserved1[3];
	__u64   sda;
	__u64   reserved2[6];
} __packed;
};

struct bsca_entry {
	__u8	reserved0;
@@ -80,7 +82,7 @@ struct bsca_entry {
	__u16	reserved[3];
	__u64	sda;
	__u64	reserved2[2];
} __attribute__((packed));
};

union ipte_control {
	unsigned long val;
@@ -97,7 +99,7 @@ struct bsca_block {
	__u64	mcn;
	__u64	reserved2;
	struct bsca_entry cpu[KVM_S390_BSCA_CPU_SLOTS];
} __attribute__((packed));
};

struct esca_block {
	union ipte_control ipte_control;
@@ -105,7 +107,21 @@ struct esca_block {
	__u64   mcn[4];
	__u64   reserved2[20];
	struct esca_entry cpu[KVM_S390_ESCA_CPU_SLOTS];
} __packed;
};

/*
 * This struct is used to store some machine check info from lowcore
 * for machine checks that happen while the guest is running.
 * This info in host's lowcore might be overwritten by a second machine
 * check from host when host is in the machine check's high-level handling.
 * The size is 24 bytes.
 */
struct mcck_volatile_info {
	__u64 mcic;
	__u64 failing_storage_address;
	__u32 ext_damage_code;
	__u32 reserved;
};

#define CPUSTAT_STOPPED    0x80000000
#define CPUSTAT_WAIT       0x10000000
@@ -260,14 +276,15 @@ struct kvm_s390_sie_block {

struct kvm_s390_itdb {
	__u8	data[256];
} __packed;
};

struct sie_page {
	struct kvm_s390_sie_block sie_block;
	__u8 reserved200[1024];		/* 0x0200 */
	struct mcck_volatile_info mcck_info;	/* 0x0200 */
	__u8 reserved218[1000];		/* 0x0218 */
	struct kvm_s390_itdb itdb;	/* 0x0600 */
	__u8 reserved700[2304];		/* 0x0700 */
} __packed;
};

struct kvm_vcpu_stat {
	u64 exit_userspace;
@@ -681,7 +698,7 @@ struct sie_page2 {
	__u64 fac_list[S390_ARCH_FAC_LIST_SIZE_U64];	/* 0x0000 */
	struct kvm_s390_crypto_cb crycb;		/* 0x0800 */
	u8 reserved900[0x1000 - 0x900];			/* 0x0900 */
} __packed;
};

struct kvm_s390_vsie {
	struct mutex mutex;
@@ -691,6 +708,12 @@ struct kvm_s390_vsie {
	struct page *pages[KVM_MAX_VCPUS];
};

struct kvm_s390_migration_state {
	unsigned long bitmap_size;	/* in bits (number of guest pages) */
	atomic64_t dirty_pages;		/* number of dirty pages */
	unsigned long *pgste_bitmap;
};

struct kvm_arch{
	void *sca;
	int use_esca;
@@ -718,6 +741,7 @@ struct kvm_arch{
	struct kvm_s390_crypto crypto;
	struct kvm_s390_vsie vsie;
	u64 epoch;
	struct kvm_s390_migration_state *migration_state;
	/* subset of available cpu features enabled by user space */
	DECLARE_BITMAP(cpu_feat, KVM_S390_VM_CPU_FEAT_NR_BITS);
};