Merge "Merge android-4.9.204(72e8598) into msm-4.9" (4da5efed) · Commits · e / devices / android_kernel_fairphone_FP3

Documentation/ABI/testing/sysfs-devices-system-cpu

+2 −0

Original line number	Diff line number	Diff line
		@@ -358,6 +358,8 @@ What: /sys/devices/system/cpu/vulnerabilities
		/sys/devices/system/cpu/vulnerabilities/spec_store_bypass
		/sys/devices/system/cpu/vulnerabilities/l1tf
		/sys/devices/system/cpu/vulnerabilities/mds
		/sys/devices/system/cpu/vulnerabilities/tsx_async_abort
		/sys/devices/system/cpu/vulnerabilities/itlb_multihit
		Date: January 2018
		Contact: Linux kernel mailing list <linux-kernel@vger.kernel.org>
		Description: Information about CPU vulnerabilities

Documentation/hw-vuln/index.rst

+2 −0

		@@ -11,3 +11,5 @@ are configurable at compile, boot or run time.

		l1tf
		mds
		tsx_async_abort
		multihit.rst

Documentation/hw-vuln/mds.rst

+5 −2

		@@ -265,8 +265,11 @@ time with the option "mds=". The valid arguments for this option are:

		============ =============================================================

		Not specifying this option is equivalent to "mds=full".

		Not specifying this option is equivalent to "mds=full". For processors
		that are affected by both TAA (TSX Asynchronous Abort) and MDS,
		specifying just "mds=off" without an accompanying "tsx_async_abort=off"
		will have no effect as the same mitigation is used for both
		vulnerabilities.

		Mitigation selection guide
		--------------------------

Documentation/hw-vuln/multihit.rst

0 → 100644

+163 −0

		iTLB multihit
		=============

		iTLB multihit is an erratum where some processors may incur a machine check
		error, possibly resulting in an unrecoverable CPU lockup, when an
		instruction fetch hits multiple entries in the instruction TLB. This can
		occur when the page size is changed along with either the physical address
		or cache type. A malicious guest running on a virtualized system can
		exploit this erratum to perform a denial of service attack.


		Affected processors
		-------------------

		Variations of this erratum are present on most Intel Core and Xeon processor
		models. The erratum is not present on:

		- non-Intel processors

		- Some Atoms (Airmont, Bonnell, Goldmont, GoldmontPlus, Saltwell, Silvermont)

		- Intel processors that have the PSCHANGE_MC_NO bit set in the
		  IA32_ARCH_CAPABILITIES MSR.


		Related CVEs
		------------

		The following CVE entry is related to this issue:

		============== =================================================
		CVE-2018-12207 Machine Check Error Avoidance on Page Size Change
		============== =================================================


		Problem
		-------

		Privileged software, including the OS and virtual machine managers (VMM), is in
		charge of memory management. A key component in memory management is the control
		of the page tables. Modern processors use virtual memory, a technique that creates
		the illusion of a very large memory for processors. This virtual space is split
		into pages of a given size. Page tables translate virtual addresses to physical
		addresses.

		To reduce latency when performing a virtual to physical address translation,
		processors include a structure, called TLB, that caches recent translations.
		There are separate TLBs for instruction (iTLB) and data (dTLB).

		Under this erratum, instructions are fetched from a linear address translated
		using a 4 KB translation cached in the iTLB. Privileged software then modifies
		the paging structure so that the same linear address uses a large page size
		(2 MB, 4 MB, 1 GB) with a different physical address or memory type. After the
		page structure modification but before the software invalidates any iTLB entries
		for the linear address, a code fetch that happens on the same linear address may
		cause a machine-check error which can result in a system hang or shutdown.


		Attack scenarios
		----------------

		Attacks against the iTLB multihit erratum can be mounted from malicious
		guests in a virtualized system.


		iTLB multihit system information
		--------------------------------

		The Linux kernel provides a sysfs interface to enumerate the current iTLB
		multihit status of the system: whether the system is vulnerable and which
		mitigations are active. The relevant sysfs file is:

		/sys/devices/system/cpu/vulnerabilities/itlb_multihit

		The possible values in this file are:

		.. list-table::

		   * - Not affected
		     - The processor is not vulnerable.
		   * - KVM: Mitigation: Split huge pages
		     - Software changes mitigate this issue.
		   * - KVM: Vulnerable
		     - The processor is vulnerable, but no mitigation enabled
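
As a rough illustration, the status strings listed above could be read and classified as follows. This is a minimal sketch: the sysfs path comes from the text above, while the function names and the returned labels are hypothetical and chosen only for this example.

```python
from pathlib import Path

# Path taken from the documentation text above.
ITLB_MULTIHIT = Path("/sys/devices/system/cpu/vulnerabilities/itlb_multihit")


def classify_itlb_multihit(status: str) -> str:
    """Map a status string to a coarse label (labels are illustrative)."""
    text = status.strip()
    if text == "Not affected":
        return "not_affected"
    if "Mitigation" in text:
        return "mitigated"
    if "Vulnerable" in text:
        return "vulnerable"
    return "unknown"


def read_itlb_multihit(path: Path = ITLB_MULTIHIT) -> str:
    """Classify the sysfs file; report 'unknown' if it cannot be read."""
    try:
        return classify_itlb_multihit(path.read_text())
    except OSError:
        # Older kernels or non-Linux systems do not expose the file.
        return "unknown"


if __name__ == "__main__":
    print(read_itlb_multihit())
```

Substring matching is used so that the "KVM:" prefix shown in the list does not have to be parsed separately.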


		Enumeration of the erratum
		--------------------------------

		A new bit (PSCHANGE_MC_NO) has been allocated in the IA32_ARCH_CAPABILITIES MSR
		and will be set on CPUs which are mitigated against this issue.

		=======================================  ===========  ================================
		IA32_ARCH_CAPABILITIES MSR               Not present  Possibly vulnerable, check model
		IA32_ARCH_CAPABILITIES[PSCHANGE_MC_NO]   '0'          Likely vulnerable, check model
		IA32_ARCH_CAPABILITIES[PSCHANGE_MC_NO]   '1'          Not vulnerable
		=======================================  ===========  ================================
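
The enumeration table above can be expressed as a small decision function. This is only a sketch: the bit position (bit 6) is not stated in this document and is an assumption taken from the usual Linux definition of PSCHANGE_MC_NO, and the function name is hypothetical. The caller is expected to pass ``None`` when the MSR is not present.

```python
from typing import Optional

# Assumption: bit position is not given in the text above.
PSCHANGE_MC_NO_BIT = 6


def itlb_multihit_enumeration(arch_caps: Optional[int]) -> str:
    """Return the table verdict for a raw IA32_ARCH_CAPABILITIES value."""
    if arch_caps is None:
        # MSR not present on this CPU.
        return "Possibly vulnerable, check model"
    if (arch_caps >> PSCHANGE_MC_NO_BIT) & 1:
        return "Not vulnerable"
    return "Likely vulnerable, check model"
```

Reading the MSR itself requires privileged access and is deliberately left out of the sketch.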


		Mitigation mechanism
		-------------------------

		This erratum can be mitigated by restricting the use of large page sizes to
		non-executable pages. This forces all iTLB entries to be 4K, and removes
		the possibility of multiple hits.

		In order to mitigate the vulnerability, KVM initially marks all huge pages
		as non-executable. If the guest attempts to execute in one of those pages,
		the page is broken down into 4K pages, which are then marked executable.

		If EPT is disabled or not available on the host, KVM is in control of TLB
		flushes and the problematic situation cannot happen. However, the shadow
		EPT paging mechanism used by nested virtualization is vulnerable, because
		the nested guest can trigger multiple iTLB hits by modifying its own
		(non-nested) page tables. For simplicity, KVM will make large pages
		non-executable in all shadow paging modes.
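
The split-on-execute behaviour described above can be pictured with a toy model. This is not KVM code: the class, its dictionary layout and the 2 MB huge page size are assumptions made purely for illustration.

```python
PAGE_4K = 4 * 1024
PAGE_2M = 2 * 1024 * 1024  # assumed huge page size for this toy model


class ToyMapping:
    """Toy page map: base address -> (page size, executable flag)."""

    def __init__(self):
        self.pages = {}

    def map_huge(self, base):
        # Under the mitigation, huge pages start out non-executable.
        self.pages[base] = (PAGE_2M, False)

    def exec_fault(self, addr):
        """Model an instruction fetch at addr hitting a huge page."""
        base = addr - (addr % PAGE_2M)
        size, _ = self.pages.get(base, (None, None))
        if size != PAGE_2M:
            return  # already split, or nothing mapped here
        # Break the huge page into 4K pages and mark them executable.
        del self.pages[base]
        for offset in range(0, PAGE_2M, PAGE_4K):
            self.pages[base + offset] = (PAGE_4K, True)
```

After the fault, every executable translation in the range is 4K, which is what removes the possibility of multiple iTLB hits.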

		Mitigation control on the kernel command line and KVM - module parameter
		------------------------------------------------------------------------

		The KVM hypervisor mitigation mechanism for marking huge pages as
		non-executable can be controlled with a module parameter "nx_huge_pages=".
		The kernel command line allows control of the iTLB multihit mitigations at
		boot time with the option "kvm.nx_huge_pages=".

		The valid arguments for these options are:

		==========  ================================================================
		force       Mitigation is enabled. In this case, the mitigation implements
		            non-executable huge pages in Linux kernel KVM module. All huge
		            pages in the EPT are marked as non-executable.
		            If a guest attempts to execute in one of those pages, the page is
		            broken down into 4K pages, which are then marked executable.

		off         Mitigation is disabled.

		auto        Enable mitigation only if the platform is affected and the kernel
		            was not booted with the "mitigations=off" command line parameter.
		            This is the default option.
		==========  ================================================================
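
How the three values resolve can be sketched as a small function. The function and parameter names are hypothetical; the logic only restates the table above and is not taken from the KVM source.

```python
def nx_huge_pages_enabled(value: str, cpu_affected: bool,
                          mitigations_off: bool) -> bool:
    """Resolve an nx_huge_pages setting to 'mitigation on/off'."""
    if value == "force":
        return True
    if value == "off":
        return False
    if value == "auto":
        # Default: on only for affected CPUs booted without mitigations=off.
        return cpu_affected and not mitigations_off
    raise ValueError(f"unknown nx_huge_pages value: {value!r}")
```

For example, ``nx_huge_pages_enabled("auto", cpu_affected=True, mitigations_off=True)`` evaluates to ``False``, matching the description of "auto".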


		Mitigation selection guide
		--------------------------

		1. No virtualization in use
		^^^^^^^^^^^^^^^^^^^^^^^^^^^

		The system is protected by the kernel unconditionally and no further
		action is required.

		2. Virtualization with trusted guests
		^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

		If the guest comes from a trusted source, you may assume that the guest will
		not attempt to maliciously exploit this erratum and no further action is
		required.

		3. Virtualization with untrusted guests
		^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

		If the guest comes from an untrusted source, the host kernel will need
		to apply the iTLB multihit mitigation via the kernel command line or the kvm
		module parameter.

Documentation/hw-vuln/tsx_async_abort.rst

0 → 100644

+279 −0

		.. SPDX-License-Identifier: GPL-2.0

		TAA - TSX Asynchronous Abort
		======================================

		TAA is a hardware vulnerability that allows unprivileged speculative access to
		data which is available in various CPU internal buffers by using asynchronous
		aborts within an Intel TSX transactional region.

		Affected processors
		-------------------

		This vulnerability only affects Intel processors that support Intel
		Transactional Synchronization Extensions (TSX) when the TAA_NO bit (bit 8)
		is 0 in the IA32_ARCH_CAPABILITIES MSR. On processors where the MDS_NO bit
		(bit 5) is 0 in the IA32_ARCH_CAPABILITIES MSR, the existing MDS mitigations
		also mitigate against TAA.

		Whether a processor is affected or not can be read out from the TAA
		vulnerability file in sysfs. See :ref:`tsx_async_abort_sys_info`.
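
The bit tests described in this section can be sketched as follows. The bit positions (TAA_NO bit 8, MDS_NO bit 5) come from the text above; the function names are hypothetical, and detecting TSX support itself is left to the caller.

```python
MDS_NO_BIT = 5  # from the text above
TAA_NO_BIT = 8  # from the text above


def taa_affected(has_tsx: bool, arch_caps: int) -> bool:
    """A CPU is affected if it supports TSX and TAA_NO is 0."""
    taa_no = (arch_caps >> TAA_NO_BIT) & 1
    return has_tsx and not taa_no


def mds_mitigation_covers_taa(arch_caps: int) -> bool:
    """With MDS_NO == 0, the existing MDS mitigations also cover TAA."""
    return not ((arch_caps >> MDS_NO_BIT) & 1)
```

Here ``arch_caps`` stands for the raw IA32_ARCH_CAPABILITIES value.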

		Related CVEs
		------------

		The following CVE entry is related to this TAA issue:

		==============  =====  ===================================================
		CVE-2019-11135  TAA    TSX Asynchronous Abort (TAA) condition on some
		                       microprocessors utilizing speculative execution may
		                       allow an authenticated user to potentially enable
		                       information disclosure via a side channel with
		                       local access.
		==============  =====  ===================================================

		Problem
		-------

		When performing store, load or L1 refill operations, processors write
		data into temporary microarchitectural structures (buffers). The data in
		those buffers can be forwarded to load operations as an optimization.

		Intel TSX is an extension to the x86 instruction set architecture that adds
		hardware transactional memory support to improve performance of multi-threaded
		software. TSX lets the processor expose and exploit concurrency hidden in an
		application due to dynamically avoiding unnecessary synchronization.

		TSX supports atomic memory transactions that are either committed (success) or
		aborted. During an abort, operations that happened within the transactional region
		are rolled back. An asynchronous abort takes place, among other options, when a
		different thread accesses a cache line that is also used within the transactional
		region when that access might lead to a data race.

		Immediately after an uncompleted asynchronous abort, certain speculatively
		executed loads may read data from those internal buffers and pass it to dependent
		operations. This can then be used to infer the value via a cache side channel
		attack.

		Because the buffers are potentially shared between Hyper-Threads, cross
		Hyper-Thread attacks are possible.

		The victim of a malicious actor does not need to make use of TSX. Only the
		attacker needs to begin a TSX transaction and raise an asynchronous abort
		which in turn potentially leaks data stored in the buffers.

		More detailed technical information is available in the TAA specific x86
		architecture section: :ref:`Documentation/x86/tsx_async_abort.rst <tsx_async_abort>`.


		Attack scenarios
		----------------

		Attacks against the TAA vulnerability can be implemented from unprivileged
		applications running on hosts or guests.

		As for MDS, the attacker has no control over the memory addresses that can
		be leaked. Only the victim is responsible for bringing data to the CPU. As
		a result, the malicious actor has to sample as much data as possible and
		then postprocess it to try to infer any useful information from it.

		A potential attacker only has read access to the data. Also, there is no direct
		privilege escalation by using this technique.


		.. _tsx_async_abort_sys_info:

		TAA system information
		-----------------------

		The Linux kernel provides a sysfs interface to enumerate the current TAA status
		of mitigated systems. The relevant sysfs file is:

		/sys/devices/system/cpu/vulnerabilities/tsx_async_abort

		The possible values in this file are:

		.. list-table::

		   * - 'Vulnerable'
		     - The CPU is affected by this vulnerability and the microcode and kernel mitigation are not applied.
		   * - 'Vulnerable: Clear CPU buffers attempted, no microcode'
		     - The system tries to clear the buffers but the microcode might not support the operation.
		   * - 'Mitigation: Clear CPU buffers'
		     - The microcode has been updated to clear the buffers. TSX is still enabled.
		   * - 'Mitigation: TSX disabled'
		     - TSX is disabled.
		   * - 'Not affected'
		     - The CPU is not affected by this issue.
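
For scripting purposes, the values above could be kept in a lookup table. The sketch below is illustrative only: the dictionary and function names are hypothetical, and any extra text a kernel might append to the status line is not handled.

```python
# Status strings and meanings as listed above.
TAA_STATUS = {
    "Vulnerable": "affected; microcode and kernel mitigation not applied",
    "Vulnerable: Clear CPU buffers attempted, no microcode":
        "buffer clearing attempted; microcode might not support it",
    "Mitigation: Clear CPU buffers": "buffers cleared; TSX still enabled",
    "Mitigation: TSX disabled": "TSX is disabled",
    "Not affected": "CPU is not affected",
}


def describe_taa_status(line: str) -> str:
    """Look up an exact status line; unknown text is reported as such."""
    return TAA_STATUS.get(line.strip(), "unrecognized status")
```

An exact-match lookup is used here so that the two "Vulnerable" variants are never confused.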

		.. _ucode_needed:

		Best effort mitigation mode
		^^^^^^^^^^^^^^^^^^^^^^^^^^^

		If the processor is vulnerable, but the availability of the microcode-based
		mitigation mechanism is not advertised via CPUID, the kernel selects a best
		effort mitigation mode. This mode invokes the mitigation instructions
		without a guarantee that they clear the CPU buffers.

		This is done to address virtualization scenarios where the host has the
		microcode update applied, but the hypervisor is not yet updated to expose the
		CPUID to the guest. If the host has updated microcode the protection takes
		effect; otherwise a few CPU cycles are wasted pointlessly.

		The state in the tsx_async_abort sysfs file reflects this situation
		accordingly.


		Mitigation mechanism
		--------------------

		The kernel detects the affected CPUs and the presence of the microcode which is
		required. If a CPU is affected and the microcode is available, then the kernel
		enables the mitigation by default.


		The mitigation can be controlled at boot time via a kernel command line option.
		See :ref:`taa_mitigation_control_command_line`.

		.. _virt_mechanism:

		Virtualization mitigation
		^^^^^^^^^^^^^^^^^^^^^^^^^

		Affected systems where the host has TAA microcode and TAA is mitigated by
		having disabled TSX previously are not vulnerable, regardless of the status
		of the VMs.

		In all other cases, if the host either does not have the TAA microcode or
		the kernel is not mitigated, the system might be vulnerable.


		.. _taa_mitigation_control_command_line:

		Mitigation control on the kernel command line
		---------------------------------------------

		The kernel command line allows control of the TAA mitigations at boot time with
		the option "tsx_async_abort=". The valid arguments for this option are:

		============  =============================================================
		off           This option disables the TAA mitigation on affected platforms.
		              If the system has TSX enabled (see next parameter) and the CPU
		              is affected, the system is vulnerable.

		full          TAA mitigation is enabled. If TSX is enabled, on an affected
		              system it will clear CPU buffers on ring transitions. On
		              systems which are MDS-affected and deploy MDS mitigation,
		              TAA is also mitigated. Specifying this option on those
		              systems will have no effect.

		full,nosmt    The same as tsx_async_abort=full, with SMT disabled on
		              vulnerable CPUs that have TSX enabled. This is the complete
		              mitigation. When TSX is disabled, SMT is not disabled because
		              CPU is not vulnerable to cross-thread TAA attacks.
		============  =============================================================

		Not specifying this option is equivalent to "tsx_async_abort=full". For
		processors that are affected by both TAA and MDS, specifying just
		"tsx_async_abort=off" without an accompanying "mds=off" will have no
		effect as the same mitigation is used for both vulnerabilities.
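
The interaction between the two options on a processor affected by both TAA and MDS can be summarized in one line of logic. This is a simplified sketch with a hypothetical function name, not the kernel's actual selection code.

```python
def cpu_buffer_clearing_active(mds: str, tsx_async_abort: str) -> bool:
    """For a CPU affected by both MDS and TAA.

    The same buffer-clearing mitigation serves both vulnerabilities, so it
    stays active unless both options are set to "off".
    """
    return not (mds == "off" and tsx_async_abort == "off")
```

In this model, ``cpu_buffer_clearing_active("full", "off")`` is still ``True``, which is why "tsx_async_abort=off" alone has no effect on such processors.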

		The kernel command line also allows control of the TSX feature using the
		parameter "tsx=" on CPUs which support TSX control. MSR_IA32_TSX_CTRL is used
		to control the TSX feature and the enumeration of the TSX feature bits (RTM
		and HLE) in CPUID.

		The valid options are:

		============  =============================================================
		off           Disables TSX on the system.

		              Note that this option takes effect only on newer CPUs which are
		              not vulnerable to MDS, i.e., have MSR_IA32_ARCH_CAPABILITIES.MDS_NO=1
		              and which get the new IA32_TSX_CTRL MSR through a microcode
		              update. This new MSR allows for the reliable deactivation of
		              the TSX functionality.

		on            Enables TSX.

		              Although there are mitigations for all known security
		              vulnerabilities, TSX has been known to be an accelerator for
		              several previous speculation-related CVEs, and so there may be
		              unknown security risks associated with leaving it enabled.

		auto          Disables TSX if X86_BUG_TAA is present, otherwise enables TSX
		              on the system.
		============  =============================================================

		Not specifying this option is equivalent to "tsx=off".

		The following combinations of the "tsx_async_abort" and "tsx" options are
		possible. For affected platforms tsx=auto is equivalent to tsx=off and the
		result will be:

		=========  ==========================  =========================================
		tsx=on     tsx_async_abort=full        The system will use VERW to clear CPU
		                                       buffers. Cross-thread attacks are still
		                                       possible on SMT machines.
		tsx=on     tsx_async_abort=full,nosmt  As above, cross-thread attacks on SMT
		                                       mitigated.
		tsx=on     tsx_async_abort=off         The system is vulnerable.
		tsx=off    tsx_async_abort=full        TSX might be disabled if microcode
		                                       provides a TSX control MSR. If so,
		                                       system is not vulnerable.
		tsx=off    tsx_async_abort=full,nosmt  Ditto
		tsx=off    tsx_async_abort=off         Ditto
		=========  ==========================  =========================================
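
The combinations in the table above can be written as a lookup for an affected platform. The function name and result strings are hypothetical. One assumption is made explicit in the code: when no TSX control MSR is available, the "tsx" argument is treated as having no effect, so the outcome follows the "tsx=on" rows.

```python
def taa_outcome(tsx: str, tsx_async_abort: str, has_tsx_ctrl: bool) -> str:
    """Outcome on a TAA-affected CPU for one combination of boot options."""
    if tsx not in ("on", "off", "auto"):
        raise ValueError(f"unknown tsx value: {tsx!r}")
    if tsx_async_abort not in ("off", "full", "full,nosmt"):
        raise ValueError(f"unknown tsx_async_abort value: {tsx_async_abort!r}")
    # On affected platforms tsx=auto is equivalent to tsx=off.
    if tsx in ("off", "auto") and has_tsx_ctrl:
        return "not vulnerable: TSX disabled"
    # Assumption: without TSX control, TSX stays enabled whatever "tsx" says.
    if tsx_async_abort == "full":
        return "VERW clears CPU buffers; cross-thread attacks possible on SMT"
    if tsx_async_abort == "full,nosmt":
        return "VERW clears CPU buffers; cross-thread attacks mitigated"
    return "vulnerable"
```

Keeping the TSX control check first mirrors the "If so, system is not vulnerable" wording of the table.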


		For unaffected platforms, "tsx=on" and "tsx_async_abort=full" do not clear CPU
		buffers. For platforms without TSX control (MSR_IA32_ARCH_CAPABILITIES.MDS_NO=0)
		the "tsx" command line argument has no effect.

		For the affected platforms, the table below indicates the mitigation status for
		the combinations of CPUID bit MD_CLEAR and IA32_ARCH_CAPABILITIES MSR bits MDS_NO
		and TSX_CTRL_MSR.

		=======  =========  =============  ========================================
		MDS_NO   MD_CLEAR   TSX_CTRL_MSR   Status
		=======  =========  =============  ========================================
		0        0          0              Vulnerable (needs microcode)
		0        1          0              MDS and TAA mitigated via VERW
		1        1          0              MDS fixed, TAA vulnerable if TSX enabled
		                                   because MD_CLEAR has no meaning and
		                                   VERW is not guaranteed to clear buffers
		1        X          1              MDS fixed, TAA can be mitigated by
		                                   VERW or TSX_CTRL_MSR
		=======  =========  =============  ========================================
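
The status table above translates directly into a lookup. In this sketch the function name is hypothetical, and combinations that the table does not list return ``None`` instead of a guessed status.

```python
from typing import Optional


def taa_mitigation_status(mds_no: bool, md_clear: bool,
                          tsx_ctrl_msr: bool) -> Optional[str]:
    """Return the table's status for an affected platform, or None."""
    if mds_no and tsx_ctrl_msr:
        # MD_CLEAR is a don't-care ("X") in this row.
        return "MDS fixed, TAA can be mitigated by VERW or TSX_CTRL_MSR"
    rows = {
        (False, False, False): "Vulnerable (needs microcode)",
        (False, True, False): "MDS and TAA mitigated via VERW",
        (True, True, False): "MDS fixed, TAA vulnerable if TSX enabled",
    }
    return rows.get((mds_no, md_clear, tsx_ctrl_msr))
```

Returning ``None`` for unlisted rows keeps the sketch from asserting anything the table does not say.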

		Mitigation selection guide
		--------------------------

		1. Trusted userspace and guests
		^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

		If all user space applications are from a trusted source and do not execute
		untrusted code which is supplied externally, then the mitigation can be
		disabled. The same applies to virtualized environments with trusted guests.


		2. Untrusted userspace and guests
		^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

		If there are untrusted applications or guests on the system, enabling TSX
		might allow a malicious actor to leak data from the host or from other
		processes running on the same physical core.

		If the microcode is available and TSX is disabled on the host, attacks
		are prevented in a virtualized environment as well, even if the VMs do not
		explicitly enable the mitigation.


		.. _taa_default_mitigations:

		Default mitigations
		-------------------

		The kernel's default action for vulnerable processors is:

		- Deploy TSX disable mitigation (tsx_async_abort=full tsx=off).