Commit 5e2d059b authored by Linus Torvalds
Pull powerpc updates from Michael Ellerman:
 "Notable changes:

   - A fix for a bug in our page table fragment allocator, where a page
     table page could be freed and reallocated for something else while
     still in use, leading to memory corruption etc. The fix reuses
     pt_mm in struct page (x86 only) for a powerpc only refcount.

   - Fixes to our pkey support. Several are user-visible changes, but
     bring us into line with x86 behaviour and/or fix outright bugs.
     Thanks to Florian Weimer for reporting many of these.

   - A series to improve the hvc driver & related OPAL console code,
     which have been seen to cause hardlockups at times. The hvc driver
     changes in particular have been in linux-next for about a month.

   - Increase our MAX_PHYSMEM_BITS to 128TB when SPARSEMEM_VMEMMAP=y.

   - Remove Power8 DD1 and Power9 DD1 support; neither chip should be in
     use anywhere other than as a paperweight.

   - An optimised memcmp implementation using Power7-or-later VMX
     instructions.

   - Support for barrier_nospec on some NXP CPUs.

   - Support for flushing the count cache on context switch on some IBM
     CPUs (controlled by firmware), as a Spectre v2 mitigation.

   - A series to enhance the information we print on unhandled signals
     to bring it into line with other arches, including showing the
     offending VMA and dumping the instructions around the fault.

  Thanks to: Aaro Koskinen, Akshay Adiga, Alastair D'Silva, Alexey
  Kardashevskiy, Alexey Spirkov, Alistair Popple, Andrew Donnellan,
  Aneesh Kumar K.V, Anju T Sudhakar, Arnd Bergmann, Bartosz Golaszewski,
  Benjamin Herrenschmidt, Bharat Bhushan, Bjoern Noetel, Boqun Feng,
  Breno Leitao, Bryant G. Ly, Camelia Groza, Christophe Leroy, Christoph
  Hellwig, Cyril Bur, Dan Carpenter, Daniel Klamt, Darren Stevens, Dave
  Young, David Gibson, Diana Craciun, Finn Thain, Florian Weimer,
  Frederic Barrat, Gautham R. Shenoy, Geert Uytterhoeven, Geoff Levand,
  Guenter Roeck, Gustavo Romero, Haren Myneni, Hari Bathini, Joel
  Stanley, Jonathan Neuschäfer, Kees Cook, Madhavan Srinivasan, Mahesh
  Salgaonkar, Markus Elfring, Mathieu Malaterre, Mauro S. M. Rodrigues,
  Michael Hanselmann, Michael Neuling, Michael Schmitz, Mukesh Ojha,
  Murilo Opsfelder Araujo, Nicholas Piggin, Parth Y Shah, Paul
  Mackerras, Paul Menzel, Ram Pai, Randy Dunlap, Rashmica Gupta, Reza
  Arbab, Rodrigo R. Galvao, Russell Currey, Sam Bobroff, Scott Wood,
  Shilpasri G Bhat, Simon Guo, Souptick Joarder, Stan Johnson, Thiago
  Jung Bauermann, Tyrel Datwyler, Vaibhav Jain, Vasant Hegde, Venkat
  Rao, zhong jiang"

* tag 'powerpc-4.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (234 commits)
  powerpc/mm/book3s/radix: Add mapping statistics
  powerpc/uaccess: Enable get_user(u64, *p) on 32-bit
  powerpc/mm/hash: Remove unnecessary do { } while(0) loop
  powerpc/64s: move machine check SLB flushing to mm/slb.c
  powerpc/powernv/idle: Fix build error
  powerpc/mm/tlbflush: update the mmu_gather page size while iterating address range
  powerpc/mm: remove warning about ‘type’ being set
  powerpc/32: Include setup.h header file to fix warnings
  powerpc: Move `path` variable inside DEBUG_PROM
  powerpc/powermac: Make some functions static
  powerpc/powermac: Remove variable x that's never read
  cxl: remove a dead branch
  powerpc/powermac: Add missing include of header pmac.h
  powerpc/kexec: Use common error handling code in setup_new_fdt()
  powerpc/xmon: Add address lookup for percpu symbols
  powerpc/mm: remove huge_pte_offset_and_shift() prototype
  powerpc/lib: Use patch_site to patch copy_32 functions once cache is enabled
  powerpc/pseries: Fix endianness while restoring of r3 in MCE handler.
  powerpc/fadump: merge adjacent memory ranges to reduce PT_LOAD segements
  powerpc/fadump: handle crash memory ranges array index overflow
  ...
parents d1907752 a2dc009a
+5 −4
@@ -13,10 +13,11 @@ Contact: linuxppc-dev@lists.ozlabs.org
 Description:	Write an integer containing the size in bytes of the memory
 		you want removed from each NUMA node to this file - it must be
 		aligned to the memblock size. This amount of RAM will be removed
-		from the kernel mappings and the following debugfs files will be
-		created. This can only be successfully done once per boot. Once
-		memory is successfully removed from each node, the following
-		files are created.
+		from each NUMA node in the kernel mappings and the following
+		debugfs files will be created. Once memory is successfully
+		removed from each node, the following files are created. To
+		re-add memory to the kernel, echo 0 into this file (it will be
+		automatically onlined).

 What:		/sys/kernel/debug/powerpc/memtrace/<node-id>
 Date:		Aug 2017
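
The alignment requirement and the "echo 0" re-add path described in this hunk can be sketched as follows. This is a minimal illustration, not part of the commit: the `enable` file name and the use of `/sys/devices/system/memory/block_size_bytes` as the memblock size are assumptions (neither path appears in the hunk), and the helper names are hypothetical.

```python
from pathlib import Path

# Assumed paths: neither is given in the hunk above.
MEMTRACE_ENABLE = Path("/sys/kernel/debug/powerpc/memtrace/enable")
BLOCK_SIZE_FILE = Path("/sys/devices/system/memory/block_size_bytes")


def read_memblock_size(path=BLOCK_SIZE_FILE):
    """Return the memory block size in bytes (the file holds a hex string)."""
    return int(Path(path).read_text().strip(), 16)


def align_down(size, block):
    """Round size down to a multiple of block, since the write must be aligned."""
    if block <= 0:
        raise ValueError("block size must be positive")
    return (size // block) * block


def set_memtrace_size(size, enable_file=MEMTRACE_ENABLE):
    """Write the per-node size in bytes; writing 0 re-adds the memory."""
    Path(enable_file).write_text(f"{size}\n")
```

A caller running as root might use `set_memtrace_size(align_down(wanted, read_memblock_size()))` to remove memory and `set_memtrace_size(0)` to hand it back.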
+4 −0
@@ -2784,6 +2784,10 @@
			nosmt=force: Force disable SMT, cannot be undone
				     via the sysfs control file.

	nospectre_v1	[PPC] Disable mitigations for Spectre Variant 1 (bounds
			check bypass). With this option data leaks are possible
			in the system.

	nospectre_v2	[X86] Disable all mitigations for the Spectre variant 2
			(indirect branch prediction) vulnerability. System may
			allow data leaks with this option, which is equivalent
+41 −2
@@ -33,9 +33,48 @@ fanX_input Measured RPM value.
 fanX_min		Threshold RPM for alert generation.
 fanX_fault		0: No fail condition
 			1: Failing fan
+
 tempX_input		Measured ambient temperature.
 tempX_max		Threshold ambient temperature for alert generation.
-inX_input		Measured power supply voltage
+tempX_highest		Historical maximum temperature
+tempX_lowest		Historical minimum temperature
+tempX_enable		Enable/disable all temperature sensors belonging to the
+			sub-group. In POWER9, this attribute corresponds to
+			each OCC. Using this attribute each OCC can be asked to
+			disable/enable all of its temperature sensors.
+			1: Enable
+			0: Disable
+
+inX_input		Measured power supply voltage (millivolt)
 inX_fault		0: No fail condition.
 			1: Failing power supply.
-power1_input		System power consumption (microWatt)
+inX_highest		Historical maximum voltage
+inX_lowest		Historical minimum voltage
+inX_enable		Enable/disable all voltage sensors belonging to the
+			sub-group. In POWER9, this attribute corresponds to
+			each OCC. Using this attribute each OCC can be asked to
+			disable/enable all of its voltage sensors.
+			1: Enable
+			0: Disable
+
+powerX_input		Power consumption (microWatt)
+powerX_input_highest	Historical maximum power
+powerX_input_lowest	Historical minimum power
+powerX_enable		Enable/disable all power sensors belonging to the
+			sub-group. In POWER9, this attribute corresponds to
+			each OCC. Using this attribute each OCC can be asked to
+			disable/enable all of its power sensors.
+			1: Enable
+			0: Disable
+
+currX_input		Measured current (milliampere)
+currX_highest		Historical maximum current
+currX_lowest		Historical minimum current
+currX_enable		Enable/disable all current sensors belonging to the
+			sub-group. In POWER9, this attribute corresponds to
+			each OCC. Using this attribute each OCC can be asked to
+			disable/enable all of its current sensors.
+			1: Enable
+			0: Disable
+
+energyX_input		Cumulative energy (microJoule)
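
As an illustration of the units listed in this hunk, the sketch below reads the `*_input` attributes from a hwmon directory and converts them to volts, amperes, watts and joules. It is not part of the commit. The directory passed in (for example `/sys/class/hwmon/hwmon0`) is an assumption, the function name is hypothetical, and temperature is left out because the hunk does not state its unit.

```python
import re
from pathlib import Path

# Divisors from raw sysfs integers to base units, per the attribute list.
DIVISOR = {
    "in": 1000,         # millivolt   -> volt
    "curr": 1000,       # milliampere -> ampere
    "power": 1000000,   # microWatt   -> watt
    "energy": 1000000,  # microJoule  -> joule
}

_INPUT = re.compile(r"^(in|curr|power|energy)(\d+)_input$")


def read_inputs(hwmon_dir):
    """Map each matching <kind>X_input file name to its value in base units."""
    readings = {}
    for entry in sorted(Path(hwmon_dir).iterdir()):
        match = _INPUT.match(entry.name)
        if match is None:
            continue  # skip fan, temp, *_highest, *_enable and so on
        raw = int(entry.read_text().strip())
        readings[entry.name] = raw / DIVISOR[match.group(1)]
    return readings
```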
+58 −0
DAWR issues on POWER9
============================

On POWER9 the DAWR can cause a checkstop if it points to cache
inhibited (CI) memory. Currently Linux has no way to distinguish CI
memory when configuring the DAWR, so (for now) the DAWR is disabled by
this commit:

    commit 9654153158d3e0684a1bdb76dbababdb7111d5a0
    Author: Michael Neuling <mikey@neuling.org>
    Date:   Tue Mar 27 15:37:24 2018 +1100
    powerpc: Disable DAWR in the base POWER9 CPU features

Technical Details:
============================

DAWR has 5 different ways of being set.
1) ptrace
2) h_set_mode(DAWR)
3) h_set_dabr()
4) kvmppc_set_one_reg()
5) xmon

For ptrace, we now advertise zero breakpoints on POWER9 via the
PPC_PTRACE_GETHWDBGINFO call. This results in GDB falling back to
software emulation of the watchpoint (which is slow).

h_set_mode(DAWR) and h_set_dabr() will now return an error to the
guest on a POWER9 host. Current Linux guests ignore this error, so
they will silently not get the DAWR.

kvmppc_set_one_reg() will store the value in the vcpu but won't
actually set it on POWER9 hardware. This is done so we don't break
migration from POWER8 to POWER9, at the cost of silently losing the
DAWR on the migration.

For xmon, the 'bd' command will return an error on P9.

Consequences for users
============================

For GDB watchpoints (i.e. the 'watch' command) on POWER9 bare metal, GDB
will accept the command. Unfortunately, since there is no hardware
support for the watchpoint, GDB will software emulate the watchpoint,
making it run very slowly.

The same will also be true for any guests started on a POWER9
host. The watchpoint will fail and GDB will fall back to software
emulation.

If a guest is started on a POWER8 host, GDB will accept the watchpoint
and configure the hardware to use the DAWR. This will run at full
speed since it uses the hardware DAWR rather than software emulation.
Unfortunately, if this guest is migrated to a POWER9 host, the watchpoint
will be lost on the POWER9. Loads and stores to the watchpoint locations
will not be trapped in GDB. The watchpoint is remembered, so if the guest
is migrated back to the POWER8 host, it will start working again.
+44 −0
@@ -198,3 +198,47 @@ presented). The transaction cannot then be continued and will take the failure
handler route.  Furthermore, the transactional 2nd register state will be
inaccessible.  GDB can currently be used on programs using TM, but not sensibly
in parts within transactions.

POWER9
======

TM on POWER9 has issues with storing the complete register state. This
is described in this commit:

    commit 4bb3c7a0208fc13ca70598efd109901a7cd45ae7
    Author: Paul Mackerras <paulus@ozlabs.org>
    Date:   Wed Mar 21 21:32:01 2018 +1100
    KVM: PPC: Book3S HV: Work around transactional memory bugs in POWER9

To account for this, different POWER9 chips have TM enabled in
different ways.

On POWER9N DD2.01 and below, TM is disabled, i.e.
HWCAP2[PPC_FEATURE2_HTM] is not set.

On POWER9N DD2.1 TM is configured by firmware to always abort a
transaction when tm suspend occurs. So tsuspend will cause a
transaction to be aborted and rolled back. Kernel exceptions will also
cause the transaction to be aborted and rolled back and the exception
will not occur. If userspace constructs a sigcontext that enables TM
suspend, the sigcontext will be rejected by the kernel. This mode is
advertised to users with HWCAP2[PPC_FEATURE2_HTM_NO_SUSPEND] set.
HWCAP2[PPC_FEATURE2_HTM] is not set in this mode.

On POWER9N DD2.2 and above, KVM and POWERVM emulate TM for guests (as
described in commit 4bb3c7a0208f), hence TM is enabled for guests,
i.e. HWCAP2[PPC_FEATURE2_HTM] is set for guest userspace. Guests that
make heavy use of TM suspend (tsuspend or kernel suspend) will trap
into the hypervisor and hence will suffer a performance
degradation. Host userspace has TM disabled,
i.e. HWCAP2[PPC_FEATURE2_HTM] is not set (although we may enable it
at some point in the future if we bring the emulation into host
userspace context switching).

POWER9C DD1.2 and above are only available with POWERVM and hence
Linux only runs as a guest. On these systems TM is emulated like on
POWER9N DD2.2.

Guest migration from POWER8 to POWER9 will work with POWER9N DD2.2 and
POWER9C DD1.2. Since earlier POWER9 processors don't support TM
emulation, migration from POWER8 to POWER9 is not supported there.
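
A userspace program could tell the three TM modes above apart by inspecting HWCAP2. The following is a rough sketch only and not part of the commit: the bit values for `PPC_FEATURE2_HTM` and `PPC_FEATURE2_HTM_NO_SUSPEND` are assumed from the kernel's powerpc uapi headers and should be checked against them, and the function names and returned mode strings are hypothetical.

```python
import struct

AT_HWCAP2 = 26  # auxiliary vector key, as defined in <elf.h>

# Assumed bit values; verify against the powerpc uapi cputable.h.
PPC_FEATURE2_HTM = 0x40000000
PPC_FEATURE2_HTM_NO_SUSPEND = 0x00080000


def read_hwcap2(auxv_path="/proc/self/auxv"):
    """Return the AT_HWCAP2 value from an auxv dump, or 0 if it is absent."""
    with open(auxv_path, "rb") as f:
        data = f.read()
    pair = struct.Struct("LL")  # (key, value) as native unsigned longs
    usable = len(data) - len(data) % pair.size
    for key, value in pair.iter_unpack(data[:usable]):
        if key == AT_HWCAP2:
            return value
    return 0


def tm_mode(hwcap2):
    """Classify TM availability from the HWCAP2 bits described above."""
    if hwcap2 & PPC_FEATURE2_HTM:
        return "full"        # e.g. a guest on POWER9N DD2.2 or later
    if hwcap2 & PPC_FEATURE2_HTM_NO_SUSPEND:
        return "no-suspend"  # tsuspend aborts the transaction
    return "disabled"
```

Calling `tm_mode(read_hwcap2())` lets a program decide at startup whether to use transactions at all, and whether it must avoid tsuspend.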