
Commit 780ec0d5 authored by Srinivasarao P

Merge android-4.4.180 (71cb827c) into msm-4.4



* refs/heads/tmp-71cb827c
  Linux 4.4.180
  powerpc/lib: fix book3s/32 boot failure due to code patching
  powerpc/booke64: set RI in default MSR
  drivers/virt/fsl_hypervisor.c: prevent integer overflow in ioctl
  drivers/virt/fsl_hypervisor.c: dereferencing error pointers in ioctl
  bonding: fix arp_validate toggling in active-backup mode
  ipv4: Fix raw socket lookup for local traffic
  vrf: sit mtu should not be updated when vrf netdev is the link
  vlan: disable SIOCSHWTSTAMP in container
  packet: Fix error path in packet_init
  net: ucc_geth - fix Oops when changing number of buffers in the ring
  bridge: Fix error path for kobject_init_and_add()
  powerpc/64s: Include cpu header
  USB: serial: fix unthrottle races
  USB: serial: use variable for status
  x86/bugs: Change L1TF mitigation string to match upstream
  x86/speculation/mds: Fix documentation typo
  Documentation: Correct the possible MDS sysfs values
  x86/mds: Add MDSUM variant to the MDS documentation
  x86/speculation/mds: Add 'mitigations=' support for MDS
  x86/speculation: Support 'mitigations=' cmdline option
  cpu/speculation: Add 'mitigations=' cmdline option
  x86/speculation/mds: Print SMT vulnerable on MSBDS with mitigations off
  x86/speculation/mds: Fix comment
  x86/speculation/mds: Add SMT warning message
  x86/speculation: Move arch_smt_update() call to after mitigation decisions
  x86/cpu/bugs: Use __initconst for 'const' init data
  Documentation: Add MDS vulnerability documentation
  Documentation: Move L1TF to separate directory
  x86/speculation/mds: Add mitigation mode VMWERV
  x86/speculation/mds: Add sysfs reporting for MDS
  x86/speculation/l1tf: Document l1tf in sysfs
  x86/speculation/mds: Add mitigation control for MDS
  x86/speculation/mds: Conditionally clear CPU buffers on idle entry
  x86/speculation/mds: Clear CPU buffers on exit to user
  x86/speculation/mds: Add mds_clear_cpu_buffers()
  x86/kvm: Expose X86_FEATURE_MD_CLEAR to guests
  x86/speculation/mds: Add BUG_MSBDS_ONLY
  x86/speculation/mds: Add basic bug infrastructure for MDS
  x86/speculation: Consolidate CPU whitelists
  x86/msr-index: Cleanup bit defines
  kvm: x86: Report STIBP on GET_SUPPORTED_CPUID
  x86/speculation: Provide IBPB always command line options
  x86/speculation: Add seccomp Spectre v2 user space protection mode
  x86/speculation: Enable prctl mode for spectre_v2_user
  x86/speculation: Add prctl() control for indirect branch speculation
  x86/speculation: Prevent stale SPEC_CTRL msr content
  x86/speculation: Prepare arch_smt_update() for PRCTL mode
  x86/speculation: Split out TIF update
  x86/speculation: Prepare for conditional IBPB in switch_mm()
  x86/speculation: Avoid __switch_to_xtra() calls
  x86/process: Consolidate and simplify switch_to_xtra() code
  x86/speculation: Prepare for per task indirect branch speculation control
  x86/speculation: Add command line control for indirect branch speculation
  x86/speculation: Unify conditional spectre v2 print functions
  x86/speculataion: Mark command line parser data __initdata
  x86/speculation: Mark string arrays const correctly
  x86/speculation: Reorder the spec_v2 code
  x86/speculation: Rework SMT state change
  sched: Add sched_smt_active()
  x86/Kconfig: Select SCHED_SMT if SMP enabled
  x86/speculation: Reorganize speculation control MSRs update
  x86/speculation: Rename SSBD update functions
  x86/speculation: Disable STIBP when enhanced IBRS is in use
  x86/speculation: Move STIPB/IBPB string conditionals out of cpu_show_common()
  x86/speculation: Remove unnecessary ret variable in cpu_show_common()
  x86/speculation: Clean up spectre_v2_parse_cmdline()
  x86/speculation: Update the TIF_SSBD comment
  x86/speculation: Propagate information about RSB filling mitigation to sysfs
  x86/speculation: Enable cross-hyperthread spectre v2 STIBP mitigation
  x86/speculation: Apply IBPB more strictly to avoid cross-process data leak
  x86/mm: Use WRITE_ONCE() when setting PTEs
  KVM: x86: SVM: Call x86_spec_ctrl_set_guest/host() with interrupts disabled
  x86/cpu: Sanitize FAM6_ATOM naming
  x86/microcode: Update the new microcode revision unconditionally
  x86/microcode: Make sure boot_cpu_data.microcode is up-to-date
  x86/speculation: Remove SPECTRE_V2_IBRS in enum spectre_v2_mitigation
  x86/bugs: Fix the AMD SSBD usage of the SPEC_CTRL MSR
  locking/atomics, asm-generic: Move some macros from <linux/bitops.h> to a new <linux/bits.h> file
  x86/bugs: Switch the selection of mitigation from CPU vendor to CPU features
  x86/bugs: Add AMD's SPEC_CTRL MSR usage
  x86/bugs: Add AMD's variant of SSB_NO
  x86/speculation: Simplify the CPU bug detection logic
  x86/speculation: Support Enhanced IBRS on future CPUs
  x86/cpufeatures: Hide AMD-specific speculation flags
  x86/MCE: Save microcode revision in machine check records
  x86/microcode/intel: Check microcode revision before updating sibling threads
  bitops: avoid integer overflow in GENMASK(_ULL)
  x86: stop exporting msr-index.h to userland
  x86/microcode/intel: Add a helper which gives the microcode revision
  locking/static_keys: Provide DECLARE and well as DEFINE macros
  Don't jump to compute_result state from check_result state
  x86/vdso: Pass --eh-frame-hdr to the linker
  cw1200: fix missing unlock on error in cw1200_hw_scan()
  gpu: ipu-v3: dp: fix CSC handling
  selftests/net: correct the return value for run_netsocktests
  s390: ctcm: fix ctcm_new_device error return code
  ipvs: do not schedule icmp errors from tunnels
  init: initialize jump labels before command line option parsing
  tools lib traceevent: Fix missing equality check for strcmp
  KVM: x86: avoid misreporting level-triggered irqs as edge-triggered in tracing
  s390/3270: fix lockdep false positive on view->lock
  s390/dasd: Fix capacity calculation for large volumes
  libnvdimm/btt: Fix a kmemdup failure check
  HID: input: add mapping for keyboard Brightness Up/Down/Toggle keys
  HID: input: add mapping for Expose/Overview key
  iio: adc: xilinx: fix potential use-after-free on remove
  platform/x86: sony-laptop: Fix unintentional fall-through
  netfilter: compat: initialize all fields in xt_init
  timer/debug: Change /proc/timer_stats from 0644 to 0600
  ASoC: Intel: avoid Oops if DMA setup fails
  ipv6: fix a potential deadlock in do_ipv6_setsockopt()
  UAS: fix alignment of scatter/gather segments
  Bluetooth: Align minimum encryption key size for LE and BR/EDR connections
  Bluetooth: hidp: fix buffer overflow
  scsi: qla2xxx: Fix incorrect region-size setting in optrom SYSFS routines
  usb: dwc3: Fix default lpm_nyet_threshold value
  genirq: Prevent use-after-free and work list corruption
  iommu/amd: Set exclusion range correctly
  scsi: csiostor: fix missing data copy in csio_scsi_err_handler()
  perf/x86/intel: Fix handling of wakeup_events for multi-entry PEBS
  ASoC: tlv320aic32x4: Fix Common Pins
  ASoC: cs4270: Set auto-increment bit for register writes
  ASoC:soc-pcm:fix a codec fixup issue in TDM case
  scsi: libsas: fix a race condition when smp task timeout
  media: v4l2: i2c: ov7670: Fix PLL bypass register values
  x86/mce: Improve error message when kernel cannot recover, p2
  selinux: never allow relabeling on context mounts
  Input: snvs_pwrkey - initialize necessary driver data before enabling IRQ
  staging: iio: adt7316: fix the dac write calculation
  staging: iio: adt7316: fix the dac read calculation
  staging: iio: adt7316: allow adt751x to use internal vref for all dacs
  usb: usbip: fix isoc packet num validation in get_pipe
  ARM: iop: don't use using 64-bit DMA masks
  ARM: orion: don't use using 64-bit DMA masks
  xsysace: Fix error handling in ace_setup
  hugetlbfs: fix memory leak for resv_map
  net: hns: Fix WARNING when remove HNS driver with SMMU enabled
  net: hns: Use NAPI_POLL_WEIGHT for hns driver
  scsi: storvsc: Fix calculation of sub-channel count
  vfio/pci: use correct format characters
  rtc: da9063: set uie_unsupported when relevant
  debugfs: fix use-after-free on symlink traversal
  jffs2: fix use-after-free on symlink traversal
  bonding: show full hw address in sysfs for slave entries
  igb: Fix WARN_ONCE on runtime suspend
  rtc: sh: Fix invalid alarm warning for non-enabled alarm
  HID: debug: fix race condition with between rdesc_show() and device removal
  USB: core: Fix bug caused by duplicate interface PM usage counter
  USB: core: Fix unterminated string returned by usb_string()
  USB: w1 ds2490: Fix bug caused by improper use of altsetting array
  USB: yurex: Fix protection fault after device removal
  packet: validate msg_namelen in send directly
  bnxt_en: Improve multicast address setup logic.
  ipv6: invert flowlabel sharing check in process and user mode
  ipv6/flowlabel: wait rcu grace period before put_pid()
  ipv4: ip_do_fragment: Preserve skb_iif during fragmentation
  ALSA: line6: use dynamic buffers
  vfio/type1: Limit DMA mappings per container
  kconfig/[mn]conf: handle backspace (^H) key
  libata: fix using DMA buffers on stack
  scsi: zfcp: reduce flood of fcrscn1 trace records on multi-element RSCN
  ceph: fix use-after-free on symlink traversal
  usb: u132-hcd: fix resource leak
  scsi: qla4xxx: fix a potential NULL pointer dereference
  net: ethernet: ti: fix possible object reference leak
  net: ibm: fix possible object reference leak
  net: xilinx: fix possible object reference leak
  net: ks8851: Set initial carrier state to down
  net: ks8851: Delay requesting IRQ until opened
  net: ks8851: Reassert reset pin if chip ID check fails
  net: ks8851: Dequeue RX packets explicitly
  ARM: dts: pfla02: increase phy reset duration
  usb: gadget: net2272: Fix net2272_dequeue()
  usb: gadget: net2280: Fix net2280_dequeue()
  usb: gadget: net2280: Fix overrun of OUT messages
  sc16is7xx: missing unregister/delete driver on error in sc16is7xx_init()
  netfilter: bridge: set skb transport_header before entering NF_INET_PRE_ROUTING
  qlcnic: Avoid potential NULL pointer dereference
  usbnet: ipheth: fix potential null pointer dereference in ipheth_carrier_set
  usbnet: ipheth: prevent TX queue timeouts when device not ready
  Documentation: Add nospectre_v1 parameter
  powerpc/fsl: Add FSL_PPC_BOOK3E as supported arch for nospectre_v2 boot arg
  powerpc/fsl: Fixed warning: orphan section `__btb_flush_fixup'
  powerpc/fsl: Sanitize the syscall table for NXP PowerPC 32 bit platforms
  powerpc/fsl: Flush the branch predictor at each kernel entry (32 bit)
  powerpc/fsl: Emulate SPRN_BUCSR register
  powerpc/fsl: Flush branch predictor when entering KVM
  powerpc/fsl: Enable runtime patching if nospectre_v2 boot arg is used
  ipv4: set the tcp_min_rtt_wlen range from 0 to one day
  net: stmmac: move stmmac_check_ether_addr() to driver probe
  team: fix possible recursive locking when add slaves
  ipv4: add sanity checks in ipv4_link_failure()
  Revert "block/loop: Use global lock for ioctl() operation."
  bpf: reject wrong sized filters earlier
  tipc: check link name with right length in tipc_nl_compat_link_set
  tipc: check bearer name with right length in tipc_nl_compat_bearer_enable
  netfilter: ebtables: CONFIG_COMPAT: drop a bogus WARN_ON
  NFS: Forbid setting AF_INET6 to "struct sockaddr_in"->sin_family.
  fs/proc/proc_sysctl.c: Fix a NULL pointer dereference
  intel_th: gth: Fix an off-by-one in output unassigning
  slip: make slhc_free() silently accept an error pointer
  tipc: handle the err returned from cmd header function
  powerpc/fsl: Fix the flush of branch predictor.
  powerpc/security: Fix spectre_v2 reporting
  powerpc/fsl: Update Spectre v2 reporting
  powerpc/fsl: Flush the branch predictor at each kernel entry (64bit)
  powerpc/fsl: Add nospectre_v2 command line argument
  powerpc/fsl: Fix spectre_v2 mitigations reporting
  powerpc/fsl: Add macro to flush the branch predictor
  powerpc/fsl: Add infrastructure to fixup branch predictor flush
  powerpc: Avoid code patching freed init sections
  powerpc/powernv: Query firmware for count cache flush settings
  powerpc/pseries: Query hypervisor for count cache flush settings
  powerpc/64s: Add support for software count cache flush
  powerpc/64s: Add new security feature flags for count cache flush
  powerpc/asm: Add a patch_site macro & helpers for patching instructions
  powerpc/fsl: Add barrier_nospec implementation for NXP PowerPC Book3E
  powerpc/64: Make meltdown reporting Book3S 64 specific
  powerpc/64: Call setup_barrier_nospec() from setup_arch()
  powerpc/64: Add CONFIG_PPC_BARRIER_NOSPEC
  powerpc/64: Make stf barrier PPC_BOOK3S_64 specific.
  powerpc/64: Disable the speculation barrier from the command line
  powerpc64s: Show ori31 availability in spectre_v1 sysfs file not v2
  powerpc/64s: Enhance the information in cpu_show_spectre_v1()
  powerpc: Use barrier_nospec in copy_from_user()
  powerpc/64: Use barrier_nospec in syscall entry
  powerpc/64s: Enable barrier_nospec based on firmware settings
  powerpc/64s: Patch barrier_nospec in modules
  powerpc/64s: Add support for ori barrier_nospec patching
  powerpc/64s: Add barrier_nospec
  powerpc/64s: Add support for a store forwarding barrier at kernel entry/exit
  powerpc/64s: Fix section mismatch warnings from setup_rfi_flush()
  powerpc/pseries: Restore default security feature flags on setup
  powerpc: Move default security feature flags
  powerpc/pseries: Fix clearing of security feature flags
  powerpc/64s: Wire up cpu_show_spectre_v2()
  powerpc/64s: Wire up cpu_show_spectre_v1()
  powerpc/pseries: Use the security flags in pseries_setup_rfi_flush()
  powerpc/powernv: Use the security flags in pnv_setup_rfi_flush()
  powerpc/64s: Enhance the information in cpu_show_meltdown()
  powerpc/64s: Move cpu_show_meltdown()
  powerpc/powernv: Set or clear security feature flags
  powerpc/pseries: Set or clear security feature flags
  powerpc: Add security feature flags for Spectre/Meltdown
  powerpc/rfi-flush: Call setup_rfi_flush() after LPM migration
  powerpc/pseries: Add new H_GET_CPU_CHARACTERISTICS flags
  powerpc/rfi-flush: Differentiate enabled and patched flush types
  powerpc/rfi-flush: Always enable fallback flush on pseries
  powerpc/rfi-flush: Make it possible to call setup_rfi_flush() again
  powerpc/rfi-flush: Move the logic to avoid a redo into the debugfs code
  powerpc/powernv: Support firmware disable of RFI flush
  powerpc/pseries: Support firmware disable of RFI flush
  powerpc/64s: Improve RFI L1-D cache flush fallback
  powerpc/xmon: Add RFI flush related fields to paca dump
  USB: Consolidate LPM checks to avoid enabling LPM twice
  USB: Add new USB LPM helpers
  sunrpc: don't mark uninitialised items as VALID.
  nfsd: Don't release the callback slot unless it was actually held
  ceph: fix ci->i_head_snapc leak
  ceph: ensure d_name stability in ceph_dentry_hash()
  sched/numa: Fix a possible divide-by-zero
  trace: Fix preempt_enable_no_resched() abuse
  MIPS: scall64-o32: Fix indirect syscall number load
  cifs: do not attempt cifs operation on smb2+ rename error
  KVM: fail KVM_SET_VCPU_EVENTS with invalid exception number
  kbuild: simplify ld-option implementation
  ANDROID: cuttlefish_defconfig: Disable DEVTMPFS
  ANDROID: Move from clang r349610 to r353983c.
  f2fs: fix to avoid accessing xattr across the boundary
  f2fs: fix to avoid potential race on sbi->unusable_block_count access/update
  f2fs: add tracepoint for f2fs_filemap_fault()
  f2fs: introduce DATA_GENERIC_ENHANCE
  f2fs: fix to handle error in f2fs_disable_checkpoint()
  f2fs: remove redundant check in f2fs_file_write_iter()
  f2fs: fix to be aware of readonly device in write_checkpoint()
  f2fs: fix to skip recovery on readonly device
  f2fs: fix to consider multiple device for readonly check
  f2fs: relocate chksum_offset for large_nat_bitmap feature
  f2fs: allow unfixed f2fs_checkpoint.checksum_offset
  f2fs: Replace spaces with tab
  f2fs: insert space before the open parenthesis '('
  f2fs: allow address pointer number of dnode aligning to specified size
  f2fs: introduce f2fs_read_single_page() for cleanup
  f2fs: mark is_extension_exist() inline
  f2fs: fix to set FI_UPDATE_WRITE correctly
  f2fs: fix to avoid panic in f2fs_inplace_write_data()
  f2fs: fix to do sanity check on valid block count of segment
  f2fs: fix to do sanity check on valid node/block count
  f2fs: fix to avoid panic in do_recover_data()
  f2fs: fix to do sanity check on free nid
  f2fs: fix to do checksum even if inode page is uptodate
  f2fs: fix to avoid panic in f2fs_remove_inode_page()
  f2fs: fix to clear dirty inode in error path of f2fs_iget()
  f2fs: remove new blank line of f2fs kernel message
  f2fs: fix wrong __is_meta_io() macro
  f2fs: fix to avoid panic in dec_valid_node_count()
  f2fs: fix to avoid panic in dec_valid_block_count()
  f2fs: fix to use inline space only if inline_xattr is enable
  f2fs: fix to retrieve inline xattr space
  f2fs: fix error path of recovery
  f2fs: fix to avoid deadloop in foreground GC
  f2fs: data: fix warning Using plain integer as NULL pointer
  f2fs: add tracepoint for f2fs_file_write_iter()
  f2fs: add comment for conditional compilation statement
  f2fs: fix potential recursive call when enabling data_flush
  f2fs: improve discard handling with multi-device volumes
  f2fs: Reduce zoned block device memory usage
  f2fs: Fix use of number of devices

The sleepable function handle_lmk_event() is called in atomic context,
so the commit "ANDROID: Communicates LMK events to userland where they
can be logged" was left out of this merge.

Conflicts:
	arch/powerpc/include/asm/uaccess.h
	kernel/cpu.c
	kernel/irq/manage.c
	kernel/time/timer_stats.c
	net/ipv4/sysctl_net_ipv4.c

Change-Id: I3e5bd447057b44a28fc5000403198ae0fd644480
Signed-off-by: Srinivasarao P <spathi@codeaurora.org>
parents d1a5c038 71cb827c
+2 −0
@@ -277,6 +277,8 @@ What: /sys/devices/system/cpu/vulnerabilities
		/sys/devices/system/cpu/vulnerabilities/spectre_v1
		/sys/devices/system/cpu/vulnerabilities/spectre_v2
		/sys/devices/system/cpu/vulnerabilities/spec_store_bypass
		/sys/devices/system/cpu/vulnerabilities/l1tf
		/sys/devices/system/cpu/vulnerabilities/mds
Date:		January 2018
Contact:	Linux kernel mailing list <linux-kernel@vger.kernel.org>
Description:	Information about CPU vulnerabilities
+305 −0
MDS - Microarchitectural Data Sampling
======================================

Microarchitectural Data Sampling is a hardware vulnerability which allows
unprivileged speculative access to data which is available in various CPU
internal buffers.

Affected processors
-------------------

This vulnerability affects a wide range of Intel processors. The
vulnerability is not present on:

   - Processors from AMD, Centaur and other non Intel vendors

   - Older processor models, where the CPU family is < 6

   - Some Atoms (Bonnell, Saltwell, Goldmont, GoldmontPlus)

   - Intel processors which have the ARCH_CAP_MDS_NO bit set in the
     IA32_ARCH_CAPABILITIES MSR.

Whether a processor is affected or not can be read out from the MDS
vulnerability file in sysfs. See :ref:`mds_sys_info`.

Not all processors are affected by all variants of MDS, but the mitigation
is identical for all of them so the kernel treats them as a single
vulnerability.

Related CVEs
------------

The following CVE entries are related to the MDS vulnerability:

   ==============  =====  ===================================================
   CVE-2018-12126  MSBDS  Microarchitectural Store Buffer Data Sampling
   CVE-2018-12130  MFBDS  Microarchitectural Fill Buffer Data Sampling
   CVE-2018-12127  MLPDS  Microarchitectural Load Port Data Sampling
   CVE-2019-11091  MDSUM  Microarchitectural Data Sampling Uncacheable Memory
   ==============  =====  ===================================================

Problem
-------

When performing store, load, L1 refill operations, processors write data
into temporary microarchitectural structures (buffers). The data in the
buffer can be forwarded to load operations as an optimization.

Under certain conditions, usually a fault/assist caused by a load
operation, data unrelated to the load memory address can be speculatively
forwarded from the buffers. Because the load operation causes a fault or
assist and its result will be discarded, the forwarded data will not cause
incorrect program execution or state changes. But a malicious operation
may be able to forward this speculative data to a disclosure gadget, which
in turn allows the value to be inferred via a cache side channel attack.

Because the buffers are potentially shared between Hyper-Threads, cross
Hyper-Thread attacks are possible.

Deeper technical information is available in the MDS specific x86
architecture section: :ref:`Documentation/x86/mds.rst <mds>`.


Attack scenarios
----------------

Attacks against the MDS vulnerabilities can be mounted from malicious
non-privileged user space applications running on hosts or guests.
Malicious guest OSes can obviously mount attacks as well.

Contrary to other speculation based vulnerabilities, the MDS vulnerability
does not allow the attacker to control the memory target address. As a
consequence the attacks are purely sampling based, but as demonstrated with
the TLBleed attack, samples can be postprocessed successfully.

Web-Browsers
^^^^^^^^^^^^

  It's unclear whether attacks through web browsers are possible at
  all. Exploitation through JavaScript is considered very unlikely,
  but other widely used web technologies like WebAssembly could possibly be
  abused.


.. _mds_sys_info:

MDS system information
-----------------------

The Linux kernel provides a sysfs interface to enumerate the current MDS
status of the system: whether the system is vulnerable, and which
mitigations are active. The relevant sysfs file is:

/sys/devices/system/cpu/vulnerabilities/mds

The possible values in this file are:

  .. list-table::

     * - 'Not affected'
       - The processor is not vulnerable
     * - 'Vulnerable'
       - The processor is vulnerable, but no mitigation enabled
     * - 'Vulnerable: Clear CPU buffers attempted, no microcode'
       - The processor is vulnerable but microcode is not updated.

         The mitigation is enabled on a best effort basis. See :ref:`vmwerv`
     * - 'Mitigation: Clear CPU buffers'
       - The processor is vulnerable and the CPU buffer clearing mitigation is
         enabled.

If the processor is vulnerable then the following information is appended
to the above information:

    ========================  ============================================
    'SMT vulnerable'          SMT is enabled
    'SMT mitigated'           SMT is enabled and mitigated
    'SMT disabled'            SMT is disabled
    'SMT Host state unknown'  Kernel runs in a VM, Host SMT state unknown
    ========================  ============================================
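
The two tables above can be read programmatically. The following is a minimal
Python sketch that splits the sysfs value into its state and its SMT
information. It assumes the SMT information is appended after a ``"; "``
separator, which the text above does not specify, and the helper names
``parse_mds_status`` and ``read_mds_status`` are hypothetical.

```python
from pathlib import Path

# Path taken from the documentation above.
MDS_SYSFS = Path("/sys/devices/system/cpu/vulnerabilities/mds")


def parse_mds_status(text):
    """Split an mds sysfs value into (state, smt_info).

    Assumption: the SMT information follows the state after "; ".
    smt_info is None when nothing is appended (e.g. 'Not affected').
    """
    state, _, smt = text.strip().partition("; ")
    return state, (smt or None)


def read_mds_status(path=MDS_SYSFS):
    """Return the parsed status, or None if the file cannot be read."""
    try:
        return parse_mds_status(path.read_text())
    except OSError:
        # Kernel without MDS reporting, or not a Linux system.
        return None


if __name__ == "__main__":
    print(read_mds_status())
```

On a kernel without this file the sketch returns ``None`` rather than raising,
so it can be used in a generic health check.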

.. _vmwerv:

Best effort mitigation mode
^^^^^^^^^^^^^^^^^^^^^^^^^^^

  If the processor is vulnerable, but the availability of the microcode based
  mitigation mechanism is not advertised via CPUID the kernel selects a best
  effort mitigation mode.  This mode invokes the mitigation instructions
  without a guarantee that they clear the CPU buffers.

  This is done to address virtualization scenarios where the host has the
  microcode update applied, but the hypervisor is not yet updated to expose
  the CPUID to the guest. If the host has updated microcode, the protection
  takes effect; otherwise a few CPU cycles are wasted pointlessly.

  The state in the mds sysfs file reflects this situation accordingly.


Mitigation mechanism
-------------------------

The kernel detects the affected CPUs and the presence of the microcode
which is required.

If a CPU is affected and the microcode is available, then the kernel
enables the mitigation by default. The mitigation can be controlled at boot
time via a kernel command line option. See
:ref:`mds_mitigation_control_command_line`.

.. _cpu_buffer_clear:

CPU buffer clearing
^^^^^^^^^^^^^^^^^^^

  The mitigation for MDS clears the affected CPU buffers on return to user
  space and when entering a guest.

  If SMT is enabled it also clears the buffers on idle entry when the CPU
  is only affected by MSBDS and not any other MDS variant, because the
  other variants cannot be protected against cross Hyper-Thread attacks.

  For CPUs which are only affected by MSBDS the user space, guest and idle
  transition mitigations are sufficient and SMT is not affected.
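
The rules in this subsection can be summarised as a small decision function.
This is only an illustrative Python sketch of the documented behaviour, not
the kernel implementation; the function name and the string labels for the
transition points are hypothetical.

```python
def buffer_clear_points(mitigation_enabled, smt_enabled, msbds_only):
    """Return the set of transitions on which CPU buffers are cleared.

    mitigation_enabled: the MDS mitigation is active
    smt_enabled:        SMT (Hyper-Threading) is enabled
    msbds_only:         the CPU is affected by MSBDS and no other variant
    """
    points = set()
    if not mitigation_enabled:
        return points
    # Always cleared when the mitigation is active.
    points.update({"return_to_user", "guest_entry"})
    # Idle entry is only cleared for MSBDS-only CPUs with SMT enabled.
    if smt_enabled and msbds_only:
        points.add("idle_entry")
    return points
```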

.. _virt_mechanism:

Virtualization mitigation
^^^^^^^^^^^^^^^^^^^^^^^^^

  The protection for host to guest transition depends on the L1TF
  vulnerability of the CPU:

  - CPU is affected by L1TF:

    If the L1D flush mitigation is enabled and up to date microcode is
    available, the L1D flush mitigation is automatically protecting the
    guest transition.

    If the L1D flush mitigation is disabled then the MDS mitigation is
    invoked explicitly when the host MDS mitigation is enabled.

    For details on L1TF and virtualization see:
    :ref:`Documentation/hw-vuln/l1tf.rst <mitigation_control_kvm>`.

  - CPU is not affected by L1TF:

    CPU buffers are flushed before entering the guest when the host MDS
    mitigation is enabled.

  The resulting MDS protection matrix for the host to guest transition:

  ============ ===== ============= ============ =================
   L1TF         MDS   VMX-L1FLUSH   Host MDS     MDS-State

   Don't care   No    Don't care    N/A          Not affected

   Yes          Yes   Disabled      Off          Vulnerable

   Yes          Yes   Disabled      Full         Mitigated

   Yes          Yes   Enabled       Don't care   Mitigated

   No           Yes   N/A           Off          Vulnerable

   No           Yes   N/A           Full         Mitigated
  ============ ===== ============= ============ =================

  This only covers the host to guest transition, i.e. prevents leakage from
  host to guest, but does not protect the guest internally. Guests need to
  have their own protections.
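
The protection matrix above can be expressed as a lookup function. The
following Python sketch encodes the six rows of the table; the function and
parameter names are hypothetical, and "Host MDS" is reduced to a boolean
(``True`` for Full, ``False`` for Off).

```python
def guest_transition_state(l1tf, mds, vmx_l1flush_enabled, host_mds_full):
    """Return the MDS state for the host to guest transition.

    l1tf:                CPU is affected by L1TF
    mds:                 CPU is affected by MDS
    vmx_l1flush_enabled: the L1D flush mitigation is enabled
                         (ignored when the CPU is not affected by L1TF)
    host_mds_full:       host MDS mitigation is 'full' rather than 'off'
    """
    if not mds:
        return "Not affected"
    if l1tf and vmx_l1flush_enabled:
        # The L1D flush already protects the guest transition.
        return "Mitigated"
    # Otherwise the result depends only on the host MDS mitigation.
    return "Mitigated" if host_mds_full else "Vulnerable"
```

As the text notes, this covers only leakage from host to guest; it says
nothing about protection inside the guest.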

.. _xeon_phi:

XEON PHI specific considerations
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

  The XEON PHI processor family is affected by MSBDS, which can be exploited
  cross Hyper-Threads when entering idle states. Some XEON PHI variants allow
  the use of MWAIT in user space (Ring 3), which opens a potential attack
  vector for malicious user space. The exposure can be disabled on the kernel
  command line with the 'ring3mwait=disable' command line option.

  XEON PHI is not affected by the other MDS variants and MSBDS is mitigated
  before the CPU enters an idle state. As XEON PHI is not affected by L1TF
  either, disabling SMT is not required for full protection.

.. _mds_smt_control:

SMT control
^^^^^^^^^^^

  All MDS variants except MSBDS can be attacked cross Hyper-Threads. That
  means on CPUs which are affected by MFBDS or MLPDS it is necessary to
  disable SMT for full protection. These are most of the affected CPUs; the
  exception is XEON PHI, see :ref:`xeon_phi`.

  Disabling SMT can have a significant performance impact, but the impact
  depends on the type of workloads.

  See the relevant chapter in the L1TF mitigation documentation for details:
  :ref:`Documentation/hw-vuln/l1tf.rst <smt_control>`.
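
As a rough illustration of the rule above, the following Python sketch decides
whether SMT has to be disabled for full protection, given the set of MDS
variants a CPU is affected by. The function name is hypothetical and the
variant names are the abbreviations from the CVE table.

```python
def smt_disable_needed(variants):
    """Return True if full protection requires disabling SMT.

    variants: iterable of MDS variant abbreviations affecting the CPU,
              e.g. {"MSBDS", "MFBDS"}.
    Every variant except MSBDS can be attacked cross Hyper-Threads.
    """
    return any(variant != "MSBDS" for variant in variants)
```

For an MSBDS-only CPU such as XEON PHI the sketch returns ``False``, matching
the exception described above.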


.. _mds_mitigation_control_command_line:

Mitigation control on the kernel command line
---------------------------------------------

The kernel command line allows control of the MDS mitigations at boot
time with the option "mds=". The valid arguments for this option are:

  ============  =============================================================
  full		If the CPU is vulnerable, enable all available mitigations
		for the MDS vulnerability, CPU buffer clearing on exit to
		userspace and when entering a VM. Idle transitions are
		protected as well if SMT is enabled.

		It does not automatically disable SMT.

  off		Disables MDS mitigations completely.

  ============  =============================================================

Not specifying this option is equivalent to "mds=full".
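
To make the default explicit, here is a minimal Python sketch that derives the
effective "mds=" mode from a command line string such as the contents of
/proc/cmdline. It assumes that the last valid occurrence wins and that
unrecognised values are ignored; neither is stated above. The function name is
hypothetical, and the interaction with other options is not modelled.

```python
VALID_MDS_MODES = ("full", "off")


def mds_mode(cmdline):
    """Return the effective mds= mode for a kernel command line string."""
    mode = "full"  # not specifying the option is equivalent to mds=full
    for token in cmdline.split():
        if token.startswith("mds="):
            value = token[len("mds="):]
            # Assumption: unknown values leave the current mode unchanged.
            if value in VALID_MDS_MODES:
                mode = value
    return mode


if __name__ == "__main__":
    try:
        with open("/proc/cmdline") as f:
            print(mds_mode(f.read()))
    except OSError:
        print("no /proc/cmdline available")
```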


Mitigation selection guide
--------------------------

1. Trusted userspace
^^^^^^^^^^^^^^^^^^^^

   If all userspace applications are from a trusted source and do not
   execute untrusted code which is supplied externally, then the mitigation
   can be disabled.


2. Virtualization with trusted guests
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

   The same considerations as for trusted user space above apply.

3. Virtualization with untrusted guests
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

   The protection depends on the state of the L1TF mitigations.
   See :ref:`virt_mechanism`.

   If the MDS mitigation is enabled and SMT is disabled, guest to host and
   guest to guest attacks are prevented.

.. _mds_default_mitigations:

Default mitigations
-------------------

  The kernel default mitigations for vulnerable processors are:

  - Enable CPU buffer clearing

  The kernel does not by default enforce the disabling of SMT, which leaves
  SMT systems vulnerable when running untrusted code. The same rationale as
  for L1TF applies.
  See :ref:`Documentation/hw-vuln/l1tf.rst <default_mitigations>`.
+107 −3
@@ -2086,6 +2086,30 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
			Format: <first>,<last>
			Specifies range of consoles to be captured by the MDA.

	mds=		[X86,INTEL]
			Control mitigation for the Micro-architectural Data
			Sampling (MDS) vulnerability.

			Certain CPUs are vulnerable to an exploit against CPU
			internal buffers which can forward information to a
			disclosure gadget under certain conditions.

			In vulnerable processors, the speculatively
			forwarded data can be used in a cache side channel
			attack, to access data to which the attacker does
			not have direct access.

			This parameter controls the MDS mitigation. The
			options are:

			full    - Enable MDS mitigation on vulnerable CPUs
			off     - Unconditionally disable MDS mitigation

			Not specifying this option is equivalent to
			mds=full.

			For details see: Documentation/hw-vuln/mds.rst

	mem=nn[KMG]	[KNL,BOOT] Force usage of a specific amount of memory
			Amount of memory to be used when the kernel is not able
			to see the whole system memory or for test.
@@ -2200,6 +2224,30 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
			in the "bleeding edge" mini2440 support kernel at
			http://repo.or.cz/w/linux-2.6/mini2440.git

	mitigations=
			[X86] Control optional mitigations for CPU
			vulnerabilities.  This is a set of curated,
			arch-independent options, each of which is an
			aggregation of existing arch-specific options.

			off
				Disable all optional CPU mitigations.  This
				improves system performance, but it may also
				expose users to several CPU vulnerabilities.
				Equivalent to: nopti [X86]
					       nospectre_v2 [X86]
					       spectre_v2_user=off [X86]
					       spec_store_bypass_disable=off [X86]
					       mds=off [X86]

			auto (default)
				Mitigate all CPU vulnerabilities, but leave SMT
				enabled, even if it's vulnerable.  This is for
				users who don't want to be surprised by SMT
				getting disabled across kernel upgrades, or who
				have other ways of avoiding SMT-based attacks.
				Equivalent to: (default behavior)
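
The aggregation described for "mitigations=" can be sketched as a simple
expansion table. The Python below only mirrors the equivalences listed in this
hunk for x86; the names are hypothetical and this is not how the kernel
implements the option.

```python
# Options listed above as equivalent to mitigations=off on x86.
MITIGATIONS_OFF_EQUIVALENT = (
    "nopti",
    "nospectre_v2",
    "spectre_v2_user=off",
    "spec_store_bypass_disable=off",
    "mds=off",
)


def expand_mitigations(value="auto"):
    """Return the arch-specific options equivalent to mitigations=<value>."""
    if value == "off":
        return list(MITIGATIONS_OFF_EQUIVALENT)
    if value == "auto":
        # Default behavior: no extra options are implied.
        return []
    raise ValueError(f"value not described in this hunk: {value!r}")
```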

	mminit_loglevel=
			[KNL] When CONFIG_DEBUG_MEMORY_INIT is set, this
			parameter allows control of the logging verbosity for
@@ -2520,7 +2568,11 @@ bytes respectively. Such letter suffixes can also be entirely omitted.

	nohugeiomap	[KNL,x86] Disable kernel huge I/O mappings.

-	nospectre_v2	[X86] Disable all mitigations for the Spectre variant 2
+	nospectre_v1	[PPC] Disable mitigations for Spectre Variant 1 (bounds
+			check bypass). With this option data leaks are possible
+			in the system.
+
+	nospectre_v2	[X86,PPC_FSL_BOOK3E] Disable all mitigations for the Spectre variant 2
			(indirect branch prediction) vulnerability. System may
			allow data leaks with this option, which is equivalent
			to spectre_v2=off.
@@ -3679,9 +3731,13 @@ bytes respectively. Such letter suffixes can also be entirely omitted.

	spectre_v2=	[X86] Control mitigation of Spectre variant 2
			(indirect branch speculation) vulnerability.
			The default operation protects the kernel from
			user space attacks.

-			on   - unconditionally enable
-			off  - unconditionally disable
+			on   - unconditionally enable, implies
+			       spectre_v2_user=on
+			off  - unconditionally disable, implies
+			       spectre_v2_user=off
			auto - kernel detects whether your CPU model is
			       vulnerable

@@ -3691,6 +3747,12 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
			CONFIG_RETPOLINE configuration option, and the
			compiler with which the kernel was built.

			Selecting 'on' will also enable the mitigation
			against user space to user space task attacks.

			Selecting 'off' will disable both the kernel and
			the user space protections.

			Specific mitigations can also be selected manually:

			retpoline	  - replace indirect branches
@@ -3700,6 +3762,48 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
			Not specifying this option is equivalent to
			spectre_v2=auto.

	spectre_v2_user=
			[X86] Control mitigation of Spectre variant 2
			(indirect branch speculation) vulnerability between
			user space tasks.

			on	- Unconditionally enable mitigations. Is
				  enforced by spectre_v2=on

			off     - Unconditionally disable mitigations. Is
				  enforced by spectre_v2=off

			prctl   - Indirect branch speculation is enabled,
				  but mitigation can be enabled via prctl
				  per thread.  The mitigation control state
				  is inherited on fork.

			prctl,ibpb
				- Like "prctl" above, but only STIBP is
				  controlled per thread. IBPB is issued
				  always when switching between different user
				  space processes.

			seccomp
				- Same as "prctl" above, but all seccomp
				  threads will enable the mitigation unless
				  they explicitly opt out.

			seccomp,ibpb
				- Like "seccomp" above, but only STIBP is
				  controlled per thread. IBPB is issued
				  always when switching between different
				  user space processes.

			auto    - Kernel selects the mitigation depending on
				  the available CPU features and vulnerability.

			Default mitigation:
			If CONFIG_SECCOMP=y then "seccomp", otherwise "prctl"

			Not specifying this option is equivalent to
			spectre_v2_user=auto.
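
The interaction between spectre_v2= and spectre_v2_user= described above can be sketched as a small resolution function. The name effective_spectre_v2_user is hypothetical; a NULL argument stands for an option that was not given on the command line. The sketch only resolves the option string and does not model how "auto" picks a concrete mitigation.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: returns the spectre_v2_user= setting that is in
 * effect once spectre_v2= has been taken into account.
 */
static const char *effective_spectre_v2_user(const char *spectre_v2,
					     const char *spectre_v2_user)
{
	/* spectre_v2=on and spectre_v2=off override the user setting. */
	if (spectre_v2 && strcmp(spectre_v2, "on") == 0)
		return "on";
	if (spectre_v2 && strcmp(spectre_v2, "off") == 0)
		return "off";

	/* Otherwise the explicit value applies; the default is "auto". */
	return spectre_v2_user ? spectre_v2_user : "auto";
}
```

For example, effective_spectre_v2_user("off", "prctl") yields "off", because spectre_v2=off enforces spectre_v2_user=off.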

	spec_store_bypass_disable=
			[HW] Control Speculative Store Bypass (SSB) Disable mitigation
			(Speculative Store Bypass vulnerability)
+1 −0
@@ -387,6 +387,7 @@ tcp_min_rtt_wlen - INTEGER
	minimum RTT when it is moved to a longer path (e.g., due to traffic
	engineering). A longer window makes the filter more resistant to RTT
	inflations such as transient congestion. The unit is seconds.
	Possible values: 0 - 86400 (1 day)
	Default: 300

tcp_moderate_rcvbuf - BOOLEAN
+9 −0
@@ -92,3 +92,12 @@ Speculation misfeature controls
   * prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_STORE_BYPASS, PR_SPEC_ENABLE, 0, 0);
   * prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_STORE_BYPASS, PR_SPEC_DISABLE, 0, 0);
   * prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_STORE_BYPASS, PR_SPEC_FORCE_DISABLE, 0, 0);

- PR_SPEC_INDIRECT_BRANCH: Indirect Branch Speculation in User Processes
                        (Mitigate Spectre V2 style attacks against user processes)

  Invocations:
   * prctl(PR_GET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH, 0, 0, 0);
   * prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH, PR_SPEC_ENABLE, 0, 0);
   * prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH, PR_SPEC_DISABLE, 0, 0);
   * prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH, PR_SPEC_FORCE_DISABLE, 0, 0);
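
A minimal userspace sketch of the PR_GET_SPECULATION_CTRL invocation listed above. The helper names are hypothetical. The fallback #defines are only used when older userspace headers lack the constants; their values and the bit meanings in the comments are taken from <linux/prctl.h> as I understand it and should be checked against the headers of the target kernel.

```c
#include <sys/prctl.h>

/* Fallbacks for older headers (assumed to match <linux/prctl.h>). */
#ifndef PR_GET_SPECULATION_CTRL
#define PR_GET_SPECULATION_CTRL	52
#endif
#ifndef PR_SPEC_INDIRECT_BRANCH
#define PR_SPEC_INDIRECT_BRANCH	1
#endif
#ifndef PR_SPEC_PRCTL
#define PR_SPEC_PRCTL		(1UL << 0)
#define PR_SPEC_ENABLE		(1UL << 1)
#define PR_SPEC_DISABLE		(1UL << 2)
#define PR_SPEC_FORCE_DISABLE	(1UL << 3)
#endif

/* Returns the state bits for the calling thread, or -1 with errno set. */
static int indirect_branch_spec_state(void)
{
	return prctl(PR_GET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH,
		     0, 0, 0);
}

/* Hypothetical helper: turns the returned bits into a short label. */
static const char *describe_spec_state(int state)
{
	if (state < 0)
		return "unavailable";
	if (state & PR_SPEC_FORCE_DISABLE)
		return "speculation force-disabled";
	if (state & PR_SPEC_DISABLE)
		return "speculation disabled";
	if (state & PR_SPEC_ENABLE)
		return "speculation enabled";
	if (state & PR_SPEC_PRCTL)
		return "controllable per task";
	return "not affected";
}
```

A caller could print describe_spec_state(indirect_branch_spec_state()) to see the current setting. On kernels or CPUs without this control the prctl call fails and the label is "unavailable".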