Donate to e Foundation | Murena handsets with /e/OS | Own a part of Murena! Learn more

Commit 070370f0 authored by Blagovest Kolenichev's avatar Blagovest Kolenichev
Browse files

Merge android-4.14.108 (4344de2f) into msm-4.14



* refs/heads/tmp-4344de2f:
  Linux 4.14.108
  s390/setup: fix boot crash for machine without EDAT-1
  KVM: nVMX: Ignore limit checks on VMX instructions using flat segments
  KVM: nVMX: Apply addr size mask to effective address for VMX instructions
  KVM: nVMX: Sign extend displacements of VMX instr's mem operands
  KVM: x86/mmu: Do not cache MMIO accesses while memslots are in flux
  KVM: x86/mmu: Detect MMIO generation wrap in any address space
  KVM: Call kvm_arch_memslots_updated() before updating memslots
  drm/radeon/evergreen_cs: fix missing break in switch statement
  media: imx: csi: Stop upstream before disabling IDMA channel
  media: imx: csi: Disable CSI immediately after last EOF
  media: vimc: Add vimc-streamer for stream control
  media: uvcvideo: Avoid NULL pointer dereference at the end of streaming
  media: imx: prpencvf: Stop upstream before disabling IDMA channel
  rcu: Do RCU GP kthread self-wakeup from softirq and interrupt
  tpm: Unify the send callback behaviour
  tpm/tpm_crb: Avoid unaligned reads in crb_recv()
  md: Fix failed allocation of md_register_thread
  perf intel-pt: Fix divide by zero when TSC is not available
  perf intel-pt: Fix overlap calculation for padding
  perf auxtrace: Define auxtrace record alignment
  perf intel-pt: Fix CYC timestamp calculation after OVF
  x86/unwind/orc: Fix ORC unwind table alignment
  bcache: never writeback a discard operation
  PM / wakeup: Rework wakeup source timer cancellation
  NFSv4.1: Reinitialise sequence results before retransmitting a request
  nfsd: fix wrong check in write_v4_end_grace()
  nfsd: fix memory corruption caused by readdir
  NFS: Don't recoalesce on error in nfs_pageio_complete_mirror()
  NFS: Fix an I/O request leakage in nfs_do_recoalesce
  NFS: Fix I/O request leakages
  cpcap-charger: generate events for userspace
  dm integrity: limit the rate of error messages
  dm: fix to_sector() for 32bit
  arm64: KVM: Fix architecturally invalid reset value for FPEXC32_EL2
  arm64: debug: Ensure debug handlers check triggering exception level
  arm64: Fix HCR.TGE status for NMI contexts
  ARM: s3c24xx: Fix boolean expressions in osiris_dvs_notify
  powerpc/traps: Fix the message printed when stack overflows
  powerpc/traps: fix recoverability of machine check handling on book3s/32
  powerpc/hugetlb: Don't do runtime allocation of 16G pages in LPAR configuration
  powerpc/ptrace: Simplify vr_get/set() to avoid GCC warning
  powerpc: Fix 32-bit KVM-PR lockup and host crash with MacOS guest
  powerpc/83xx: Also save/restore SPRG4-7 during suspend
  powerpc/powernv: Make opal log only readable by root
  powerpc/wii: properly disable use of BATs when requested.
  powerpc/32: Clear on-stack exception marker upon exception return
  security/selinux: fix SECURITY_LSM_NATIVE_LABELS on reused superblock
  jbd2: fix compile warning when using JBUFFER_TRACE
  jbd2: clear dirty flag when revoking a buffer from an older transaction
  serial: 8250_pci: Have ACCES cards that use the four port Pericom PI7C9X7954 chip use the pci_pericom_setup()
  serial: 8250_pci: Fix number of ports for ACCES serial cards
  serial: 8250_of: assume reg-shift of 2 for mrvl,mmp-uart
  serial: uartps: Fix stuck ISR if RX disabled with non-empty FIFO
  drm/i915: Relax mmap VMA check
  crypto: arm64/aes-neonbs - fix returning final keystream block
  i2c: tegra: fix maximum transfer size
  parport_pc: fix find_superio io compare code, should use equal test.
  intel_th: Don't reference unassigned outputs
  device property: Fix the length used in PROPERTY_ENTRY_STRING()
  kernel/sysctl.c: add missing range check in do_proc_dointvec_minmax_conv
  mm/vmalloc: fix size check for remap_vmalloc_range_partial()
  mm: hwpoison: fix thp split handing in soft_offline_in_use_page()
  nfit: acpi_nfit_ctl(): Check out_obj->type in the right place
  usb: chipidea: tegra: Fix missed ci_hdrc_remove_device()
  clk: ingenic: Fix doc of ingenic_cgu_div_info
  clk: ingenic: Fix round_rate misbehaving with non-integer dividers
  clk: clk-twl6040: Fix imprecise external abort for pdmclk
  clk: uniphier: Fix update register for CPU-gear
  ext2: Fix underflow in ext2_max_size()
  cxl: Wrap iterations over afu slices inside 'afu_list_lock'
  IB/hfi1: Close race condition on user context disable and close
  ext4: fix crash during online resizing
  ext4: add mask of ext4 flags to swap
  cpufreq: pxa2xx: remove incorrect __init annotation
  cpufreq: tegra124: add missing of_node_put()
  x86/kprobes: Prohibit probing on optprobe template code
  irqchip/gic-v3-its: Avoid parsing _indirect_ twice for Device table
  libertas_tf: don't set URB_ZERO_PACKET on IN USB transfer
  crypto: pcbc - remove bogus memcpy()s with src == dest
  Btrfs: fix corruption reading shared and compressed extents after hole punching
  btrfs: ensure that a DUP or RAID1 block group has exactly two stripes
  Btrfs: setup a nofs context for memory allocation at __btrfs_set_acl
  m68k: Add -ffreestanding to CFLAGS
  splice: don't merge into linked buffers
  fs/devpts: always delete dcache dentry-s in dput()
  scsi: target/iscsi: Avoid iscsit_release_commands_from_conn() deadlock
  scsi: sd: Optimal I/O size should be a multiple of physical block size
  scsi: aacraid: Fix performance issue on logical drives
  scsi: virtio_scsi: don't send sc payload with tmfs
  s390/virtio: handle find on invalid queue gracefully
  s390/setup: fix early warning messages
  clocksource/drivers/exynos_mct: Clear timer interrupt when shutdown
  clocksource/drivers/exynos_mct: Move one-shot check from tick clear to ISR
  regulator: s2mpa01: Fix step values for some LDOs
  regulator: max77620: Initialize values for DT properties
  regulator: s2mps11: Fix steps for buck7, buck8 and LDO35
  spi: pxa2xx: Setup maximum supported DMA transfer length
  spi: ti-qspi: Fix mmap read when more than one CS in use
  mmc: sdhci-esdhc-imx: fix HS400 timing issue
  ACPI / device_sysfs: Avoid OF modalias creation for removed device
  xen: fix dom0 boot on huge systems
  tracing: Do not free iter->trace in fail path of tracing_open_pipe()
  tracing: Use strncpy instead of memcpy for string keys in hist triggers
  CIFS: Fix read after write for files with read caching
  CIFS: Do not reset lease state to NONE on lease break
  crypto: arm64/aes-ccm - fix bugs in non-NEON fallback routine
  crypto: arm64/aes-ccm - fix logical bug in AAD MAC handling
  crypto: testmgr - skip crc32c context test for ahash algorithms
  crypto: hash - set CRYPTO_TFM_NEED_KEY if ->setkey() fails
  crypto: arm64/crct10dif - revert to C code for short inputs
  crypto: arm/crct10dif - revert to C code for short inputs
  fix cgroup_do_mount() handling of failure exits
  libnvdimm: Fix altmap reservation size calculation
  libnvdimm/pmem: Honor force_raw for legacy pmem regions
  libnvdimm, pfn: Fix over-trim in trim_pfn_device()
  libnvdimm/label: Clear 'updating' flag after label-set update
  stm class: Prevent division by zero
  media: videobuf2-v4l2: drop WARN_ON in vb2_warn_zero_bytesused()
  tmpfs: fix uninitialized return value in shmem_link
  net: set static variable an initial value in atl2_probe()
  nfp: bpf: fix ALU32 high bits clearance bug
  nfp: bpf: fix code-gen bug on BPF_ALU | BPF_XOR | BPF_K
  net: thunderx: make CFG_DONE message to run through generic send-ack sequence
  mac80211_hwsim: propagate genlmsg_reply return code
  phonet: fix building with clang
  ARCv2: support manual regfile save on interrupts
  ARC: uacces: remove lp_start, lp_end from clobber list
  ARCv2: lib: memcpy: fix doing prefetchw outside of buffer
  ixgbe: fix older devices that do not support IXGBE_MRQC_L3L4TXSWEN
  tmpfs: fix link accounting when a tmpfile is linked in
  net: marvell: mvneta: fix DMA debug warning
  arm64: Relax GIC version check during early boot
  qed: Fix iWARP syn packet mac address validation.
  ASoC: topology: free created components in tplg load error
  mailbox: bcm-flexrm-mailbox: Fix FlexRM ring flush timeout issue
  net: mv643xx_eth: disable clk on error path in mv643xx_eth_shared_probe()
  qmi_wwan: apply SET_DTR quirk to Sierra WP7607
  pinctrl: meson: meson8b: fix the sdxc_a data 1..3 pins
  net: systemport: Fix reception of BPDUs
  scsi: libiscsi: Fix race between iscsi_xmit_task and iscsi_complete_task
  keys: Fix dependency loop between construction record and auth key
  assoc_array: Fix shortcut creation
  af_key: unconditionally clone on broadcast
  ARM: 8824/1: fix a migrating irq bug when hotplug cpu
  esp: Skip TX bytes accounting when sending from a request socket
  clk: sunxi: A31: Fix wrong AHB gate number
  clk: sunxi-ng: v3s: Fix TCON reset de-assert bit
  Input: st-keyscan - fix potential zalloc NULL dereference
  auxdisplay: ht16k33: fix potential user-after-free on module unload
  i2c: bcm2835: Clear current buffer pointers and counts after a transfer
  i2c: cadence: Fix the hold bit setting
  net: hns: Fix object reference leaks in hns_dsaf_roce_reset()
  mm: page_alloc: fix ref bias in page_frag_alloc() for 1-byte allocs
  Revert "mm: use early_pfn_to_nid in page_ext_init"
  mm/gup: fix gup_pmd_range() for dax
  NFS: Don't use page_file_mapping after removing the page
  floppy: check_events callback should not return a negative number
  ipvs: fix dependency on nf_defrag_ipv6
  mac80211: Fix Tx aggregation session tear down with ITXQs
  Input: matrix_keypad - use flush_delayed_work()
  Input: ps2-gpio - flush TX work when closing port
  Input: cap11xx - switch to using set_brightness_blocking()
  ARM: OMAP2+: fix lack of timer interrupts on CPU1 after hotplug
  KVM: arm/arm64: Reset the VCPU without preemption and vcpu state loaded
  ASoC: rsnd: fixup rsnd_ssi_master_clk_start() user count check
  ASoC: dapm: fix out-of-bounds accesses to DAPM lookup tables
  ARM: OMAP2+: Variable "reg" in function omap4_dsi_mux_pads() could be uninitialized
  Input: pwm-vibra - stop regulator after disabling pwm, not before
  Input: pwm-vibra - prevent unbalanced regulator
  s390/dasd: fix using offset into zero size array error
  gpu: ipu-v3: Fix CSI offsets for imx53
  drm/imx: imx-ldb: add missing of_node_puts
  gpu: ipu-v3: Fix i.MX51 CSI control registers offset
  drm/imx: ignore plane updates on disabled crtcs
  crypto: rockchip - update new iv to device in multiple operations
  crypto: rockchip - fix scatterlist nents error
  crypto: ahash - fix another early termination in hash walk
  crypto: caam - fixed handling of sg list
  stm class: Fix an endless loop in channel allocation
  iio: adc: exynos-adc: Fix NULL pointer exception on unbind
  ASoC: fsl_esai: fix register setting issue in RIGHT_J mode
  9p/net: fix memory leak in p9_client_create
  9p: use inode->i_lock to protect i_size_write() under 32-bit
  FROMLIST: psi: introduce psi monitor
  FROMLIST: refactor header includes to allow kthread.h inclusion in psi_types.h
  FROMLIST: psi: track changed states
  FROMLIST: psi: split update_stats into parts
  FROMLIST: psi: rename psi fields in preparation for psi trigger addition
  FROMLIST: psi: make psi_enable static
  FROMLIST: psi: introduce state_mask to represent stalled psi states
  ANDROID: cuttlefish_defconfig: Enable CONFIG_INPUT_MOUSEDEV
  ANDROID: cuttlefish_defconfig: Enable CONFIG_PSI
  BACKPORT: kernel: cgroup: add poll file operation
  BACKPORT: fs: kernfs: add poll file operation
  UPSTREAM: psi: avoid divide-by-zero crash inside virtual machines
  UPSTREAM: psi: clarify the Kconfig text for the default-disable option
  UPSTREAM: psi: fix aggregation idle shut-off
  UPSTREAM: psi: fix reference to kernel commandline enable
  UPSTREAM: psi: make disabling/enabling easier for vendor kernels
  UPSTREAM: kernel/sched/psi.c: simplify cgroup_move_task()
  BACKPORT: psi: cgroup support
  UPSTREAM: psi: pressure stall information for CPU, memory, and IO
  UPSTREAM: sched: introduce this_rq_lock_irq()
  UPSTREAM: sched: sched.h: make rq locking and clock functions available in stats.h
  UPSTREAM: sched: loadavg: make calc_load_n() public
  BACKPORT: sched: loadavg: consolidate LOAD_INT, LOAD_FRAC, CALC_LOAD
  UPSTREAM: delayacct: track delays from thrashing cache pages
  UPSTREAM: mm: workingset: tell cache transitions from workingset thrashing
  sched/fair: fix energy compute when a cluster is only a cpu core in multi-cluster system

Conflicts:
	arch/arm/kernel/irq.c
	drivers/scsi/sd.c
	include/linux/sched.h
	include/uapi/linux/taskstats.h
	kernel/sched/Makefile
	sound/soc/soc-dapm.c

Change-Id: I12ebb57a34da9101ee19458d7e1f96ecc769c39a
Signed-off-by: default avatarBlagovest Kolenichev <bkolenichev@codeaurora.org>
parents d9f6c79d 4344de2f
Loading
Loading
Loading
Loading
+180 −0
Original line number Diff line number Diff line
================================
PSI - Pressure Stall Information
================================

:Date: April, 2018
:Author: Johannes Weiner <hannes@cmpxchg.org>

When CPU, memory or IO devices are contended, workloads experience
latency spikes, throughput losses, and run the risk of OOM kills.

Without an accurate measure of such contention, users are forced to
either play it safe and under-utilize their hardware resources, or
roll the dice and frequently suffer the disruptions resulting from
excessive overcommit.

The psi feature identifies and quantifies the disruptions caused by
such resource crunches and the time impact it has on complex workloads
or even entire systems.

Having an accurate measure of productivity losses caused by resource
scarcity aids users in sizing workloads to hardware--or provisioning
hardware according to workload demand.

As psi aggregates this information in realtime, systems can be managed
dynamically using techniques such as load shedding, migrating jobs to
other systems or data centers, or strategically pausing or killing low
priority or restartable batch jobs.

This allows maximizing hardware utilization without sacrificing
workload health or risking major disruptions such as OOM kills.

Pressure interface
==================

Pressure information for each resource is exported through the
respective file in /proc/pressure/ -- cpu, memory, and io.

The format for CPU is as such:

some avg10=0.00 avg60=0.00 avg300=0.00 total=0

and for memory and IO:

some avg10=0.00 avg60=0.00 avg300=0.00 total=0
full avg10=0.00 avg60=0.00 avg300=0.00 total=0

The "some" line indicates the share of time in which at least some
tasks are stalled on a given resource.

The "full" line indicates the share of time in which all non-idle
tasks are stalled on a given resource simultaneously. In this state
actual CPU cycles are going to waste, and a workload that spends
extended time in this state is considered to be thrashing. This has
severe impact on performance, and it's useful to distinguish this
situation from a state where some tasks are stalled but the CPU is
still doing productive work. As such, time spent in this subset of the
stall state is tracked separately and exported in the "full" averages.

The ratios are tracked as recent trends over ten, sixty, and three
hundred second windows, which gives insight into short term events as
well as medium and long term trends. The total absolute stall time is
tracked and exported as well, to allow detection of latency spikes
which wouldn't necessarily make a dent in the time averages, or to
average trends over custom time frames.

Monitoring for pressure thresholds
==================================

Users can register triggers and use poll() to be woken up when resource
pressure exceeds certain thresholds.

A trigger describes the maximum cumulative stall time over a specific
time window, e.g. 100ms of total stall time within any 500ms window to
generate a wakeup event.

To register a trigger user has to open psi interface file under
/proc/pressure/ representing the resource to be monitored and write the
desired threshold and time window. The open file descriptor should be
used to wait for trigger events using select(), poll() or epoll().
The following format is used:

<some|full> <stall amount in us> <time window in us>

For example writing "some 150000 1000000" into /proc/pressure/memory
would add 150ms threshold for partial memory stall measured within
1sec time window. Writing "full 50000 1000000" into /proc/pressure/io
would add 50ms threshold for full io stall measured within 1sec time window.

Triggers can be set on more than one psi metric and more than one trigger
for the same psi metric can be specified. However for each trigger a separate
file descriptor is required to be able to poll it separately from others,
therefore for each trigger a separate open() syscall should be made even
when opening the same psi interface file.

Monitors activate only when system enters stall state for the monitored
psi metric and deactivates upon exit from the stall state. While system is
in the stall state psi signal growth is monitored at a rate of 10 times per
tracking window.

The kernel accepts window sizes ranging from 500ms to 10s, therefore min
monitoring update interval is 50ms and max is 1s. Min limit is set to
prevent overly frequent polling. Max limit is chosen as a high enough number
after which monitors are most likely not needed and psi averages can be used
instead.

When activated, psi monitor stays active for at least the duration of one
tracking window to avoid repeated activations/deactivations when system is
bouncing in and out of the stall state.

Notifications to the userspace are rate-limited to one per tracking window.

The trigger will de-register when the file descriptor used to define the
trigger  is closed.

Userspace monitor usage example
===============================

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <poll.h>
#include <string.h>
#include <unistd.h>

/*
 * Monitor memory partial stall with 1s tracking window size
 * and 150ms threshold.
 */
int main() {
	const char trig[] = "some 150000 1000000";
	struct pollfd fds;
	int n;

	fds.fd = open("/proc/pressure/memory", O_RDWR | O_NONBLOCK);
	if (fds.fd < 0) {
		printf("/proc/pressure/memory open error: %s\n",
			strerror(errno));
		return 1;
	}
	fds.events = POLLPRI;

	if (write(fds.fd, trig, strlen(trig) + 1) < 0) {
		printf("/proc/pressure/memory write error: %s\n",
			strerror(errno));
		return 1;
	}

	printf("waiting for events...\n");
	while (1) {
		n = poll(&fds, 1, -1);
		if (n < 0) {
			printf("poll error: %s\n", strerror(errno));
			return 1;
		}
		if (fds.revents & POLLERR) {
			printf("got POLLERR, event source is gone\n");
			return 0;
		}
		if (fds.revents & POLLPRI) {
			printf("event triggered!\n");
		} else {
			printf("unknown event received: 0x%x\n", fds.revents);
			return 1;
		}
	}

	return 0;
}

Cgroup2 interface
=================

In a system with a CONFIG_CGROUP=y kernel and the cgroup2 filesystem
mounted, pressure stall information is also tracked for tasks grouped
into cgroups. Each subdirectory in the cgroupfs mountpoint contains
cpu.pressure, memory.pressure, and io.pressure files; the format is
the same as the /proc/pressure/ files.

Per-cgroup psi monitors can be specified and used the same way as
system-wide ones.
+4 −0
Original line number Diff line number Diff line
@@ -3331,6 +3331,10 @@
			before loading.
			See Documentation/blockdev/ramdisk.txt.

	psi=		[KNL] Enable or disable pressure stall information
			tracking.
			Format: <bool>

	psmouse.proto=	[HW,MOUSE] Highest PS2 mouse protocol extension to
			probe for; one of (bare|imps|exps|lifebook|any).
	psmouse.rate=	[HW,MOUSE] Set desired mouse report rate, in reports
+18 −0
Original line number Diff line number Diff line
@@ -958,6 +958,12 @@ All time durations are in microseconds.
	$PERIOD duration.  If only one number is written, $MAX is
	updated.

  cpu.pressure
	A read-only nested-key file which exists on non-root cgroups.

	Shows pressure stall information for CPU. See
	Documentation/accounting/psi.txt for details.


Memory
------
@@ -1194,6 +1200,12 @@ PAGE_SIZE multiple when read back.
	Swap usage hard limit.  If a cgroup's swap usage reaches this
	limit, anonymous meomry of the cgroup will not be swapped out.

  memory.pressure
	A read-only nested-key file which exists on non-root cgroups.

	Shows pressure stall information for memory. See
	Documentation/accounting/psi.txt for details.


Usage Guidelines
~~~~~~~~~~~~~~~~
@@ -1329,6 +1341,12 @@ IO Interface Files

	  8:16 rbps=2097152 wbps=max riops=max wiops=max

  io.pressure
	A read-only nested-key file which exists on non-root cgroups.

	Shows pressure stall information for IO. See
	Documentation/accounting/psi.txt for details.


Writeback
~~~~~~~~~
+1 −1
Original line number Diff line number Diff line
# SPDX-License-Identifier: GPL-2.0
VERSION = 4
PATCHLEVEL = 14
SUBLEVEL = 107
SUBLEVEL = 108
EXTRAVERSION =
NAME = Petit Gorille

+8 −0
Original line number Diff line number Diff line
@@ -417,6 +417,14 @@ config ARC_HAS_ACCL_REGS
	  (also referred to as r58:r59). These can also be used by gcc as GPR so
	  kernel needs to save/restore per process

config ARC_IRQ_NO_AUTOSAVE
	bool "Disable hardware autosave regfile on interrupts"
	default n
	help
	  On HS cores, taken interrupt auto saves the regfile on stack.
	  This is programmable and can be optionally disabled in which case
	  software INTERRUPT_PROLOGUE/EPILGUE do the needed work

endif	# ISA_ARCV2

endmenu   # "ARC CPU Configuration"
Loading