Commit 772a9aca authored by Ingo Molnar

Merge tag 'pr-20150114-x86-entry' of git://git.kernel.org/pub/scm/linux/kernel/git/luto/linux into x86/asm

Pull x86/entry enhancements from Andy Lutomirski:

" This is my accumulated x86 entry work, part 1, for 3.20.  The meat
  of this is an IST rework.  When an IST exception interrupts user
  space, we will handle it on the per-thread kernel stack instead of
  on the IST stack.  This sounds messy, but it actually simplifies the
  IST entry/exit code, because it eliminates some ugly games we used
  to play in order to handle rescheduling, signal delivery, etc on the
  way out of an IST exception.

  The IST rework introduces proper context tracking to IST exception
  handlers.  I haven't seen any bug reports, but the old code could
  have incorrectly treated an IST exception handler as an RCU extended
  quiescent state.

  The memory failure change (included in this pull request with
  Borislav and Tony's permission) eliminates a bunch of code that
  is no longer needed now that user memory failure handlers are
  called in process context.

  Finally, this includes a few of Denys' uncontroversial and Obviously
  Correct (tm) cleanups.

  The IST and memory failure changes have been in -next for a while.

  LKML references:

  IST rework:
  http://lkml.kernel.org/r/cover.1416604491.git.luto@amacapital.net

  Memory failure change:
  http://lkml.kernel.org/r/54ab2ffa301102cd6e@agluck-desk.sc.intel.com

  Denys' cleanups:
  http://lkml.kernel.org/r/1420927210-19738-1-git-send-email-dvlasenk@redhat.com


"

This tree semantically depends on and is based on the following RCU commit:

  734d1680 ("rcu: Make rcu_nmi_enter() handle nesting")

... and for that reason won't be pushed upstream before the RCU bits hit Linus's tree.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
parents 41ca5d4e f6f64681
+12 −6
@@ -78,9 +78,6 @@ The expensive (paranoid) way is to read back the MSR_GS_BASE value
 	xorl %ebx,%ebx
 1:	ret
 
-and the whole paranoid non-paranoid macro complexity is about whether
-to suffer that RDMSR cost.
-
 If we are at an interrupt or user-trap/gate-alike boundary then we can
 use the faster check: the stack will be a reliable indicator of
 whether SWAPGS was already done: if we see that we are a secondary
@@ -93,6 +90,15 @@ which might have triggered right after a normal entry wrote CS to the
 stack but before we executed SWAPGS, then the only safe way to check
 for GS is the slower method: the RDMSR.
 
-So we try only to mark those entry methods 'paranoid' that absolutely
-need the more expensive check for the GS base - and we generate all
-'normal' entry points with the regular (faster) entry macros.
+Therefore, super-atomic entries (except NMI, which is handled separately)
+must use idtentry with paranoid=1 to handle gsbase correctly.  This
+triggers three main behavior changes:
+
+ - Interrupt entry will use the slower gsbase check.
+ - Interrupt entry from user mode will switch off the IST stack.
+ - Interrupt exit to kernel mode will not attempt to reschedule.
+
+We try to only use IST entries and the paranoid entry code for vectors
+that absolutely need the more expensive check for the GS base - and we
+generate all 'normal' entry points with the regular (faster) paranoid=0
+variant.
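
The difference between the two gsbase checks described in this hunk can be sketched in C. This is an illustrative model only, not kernel code: struct entry_state, its fields, and the needs_swapgs_* helpers are hypothetical names, and the sign-bit test assumes that a kernel gsbase is an upper-half (kernel) address while a user gsbase is not.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of what an entry stub can inspect. */
struct entry_state {
	uint16_t saved_cs;  /* CS selector saved on the stack at entry */
	uint64_t gs_base;   /* stand-in for the value RDMSR would read from MSR_GS_BASE */
};

/* Fast check: trust the saved CS. A non-zero privilege level in its
 * low two bits means we came from user mode, so SWAPGS is still needed. */
static bool needs_swapgs_fast(const struct entry_state *s)
{
	return (s->saved_cs & 3) != 0;
}

/* Paranoid check: ignore the stack and look at gsbase itself.
 * Assumption: a kernel gsbase has its top bit set. */
static bool needs_swapgs_paranoid(const struct entry_state *s)
{
	return (s->gs_base >> 63) == 0;
}

/* Mirrors the paranoid=0 / paranoid=1 choice made per entry point. */
static bool needs_swapgs(const struct entry_state *s, bool paranoid)
{
	return paranoid ? needs_swapgs_paranoid(s) : needs_swapgs_fast(s);
}
```

In this model the two checks disagree only in the race the text describes: the saved CS already says "kernel" while gsbase is still the user value. There the fast check would wrongly skip SWAPGS, which is why such entries need the paranoid variant.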
+5 −3
@@ -40,9 +40,11 @@ An IST is selected by a non-zero value in the IST field of an
 interrupt-gate descriptor.  When an interrupt occurs and the hardware
 loads such a descriptor, the hardware automatically sets the new stack
 pointer based on the IST value, then invokes the interrupt handler.  If
-software wants to allow nested IST interrupts then the handler must
-adjust the IST values on entry to and exit from the interrupt handler.
-(This is occasionally done, e.g. for debug exceptions.)
+the interrupt came from user mode, then the interrupt handler prologue
+will switch back to the per-thread stack.  If software wants to allow
+nested IST interrupts then the handler must adjust the IST values on
+entry to and exit from the interrupt handler.  (This is occasionally
+done, e.g. for debug exceptions.)
 
 Events with different IST codes (i.e. with different stacks) can be
 nested.  For example, a debug interrupt can safely be interrupted by an
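
The stack selection described in this hunk can be modelled in a few lines of C. This is a simplified sketch under stated assumptions, not the kernel's implementation: struct cpu_model, thread_stack_top and handler_stack() are hypothetical names, and only IST entries (a non-zero IST index) are modelled.

```c
#include <stdbool.h>
#include <stdint.h>

#define NUM_IST 7  /* the x86-64 TSS holds IST1 through IST7 */

/* Hypothetical per-CPU view of the stacks involved. */
struct cpu_model {
	uint64_t ist[NUM_IST];      /* stack pointers for IST1..IST7 */
	uint64_t thread_stack_top;  /* top of the current task's kernel stack */
};

/* Returns the stack the handler body runs on.
 * Precondition: 1 <= ist_index <= NUM_IST. */
static uint64_t handler_stack(const struct cpu_model *cpu,
			      unsigned ist_index, bool from_user)
{
	/* Hardware step: a non-zero IST field selects the IST stack. */
	uint64_t sp = cpu->ist[ist_index - 1];

	/* Prologue step added by this series: an entry from user mode
	 * moves back to the per-thread stack. */
	if (from_user)
		sp = cpu->thread_stack_top;

	return sp;
}
```

The point of the model is that only entries from kernel mode stay on the IST stack, so code that can sleep, reschedule or deliver signals on the way out runs on an ordinary per-thread stack.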
+2 −2
@@ -179,8 +179,8 @@ sysenter_dispatch:
 sysexit_from_sys_call:
 	andl    $~TS_COMPAT,TI_status+THREAD_INFO(%rsp,RIP-ARGOFFSET)
 	/* clear IF, that popfq doesn't enable interrupts early */
-	andl  $~0x200,EFLAGS-R11(%rsp) 
-	movl	RIP-R11(%rsp),%edx		/* User %eip */
+	andl	$~0x200,EFLAGS-ARGOFFSET(%rsp)
+	movl	RIP-ARGOFFSET(%rsp),%edx		/* User %eip */
 	CFI_REGISTER rip,rdx
 	RESTORE_ARGS 0,24,0,0,0,0
 	xorq	%r8,%r8
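
For readers less used to the assembly, the "clear IF" step in this hunk is a single bit mask on the saved flags word. A minimal C equivalent follows; EFLAGS_IF and clear_if() are names chosen for this sketch. Since ARGOFFSET is defined as R11 in the header touched by this same commit, replacing EFLAGS-R11 with EFLAGS-ARGOFFSET leaves the computed offset unchanged.

```c
#include <stdint.h>

#define EFLAGS_IF 0x200u  /* bit 9 of EFLAGS: interrupt enable flag */

/* Clear IF in a saved flags value so that restoring it with popfq
 * does not re-enable interrupts early. All other bits are preserved. */
static uint64_t clear_if(uint64_t saved_eflags)
{
	return saved_eflags & ~(uint64_t)EFLAGS_IF;
}
```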
+0 −1
@@ -83,7 +83,6 @@ For 32-bit we have the following conventions - kernel is built with
 #define SS		160
 
 #define ARGOFFSET	R11
-#define SWFRAME		ORIG_RAX
 
 	.macro SAVE_ARGS addskip=0, save_rcx=1, save_r891011=1, rax_enosys=0
 	subq  $9*8+\addskip, %rsp
+0 −1
@@ -190,7 +190,6 @@ enum mcp_flags {
 void machine_check_poll(enum mcp_flags flags, mce_banks_t *b);
 
 int mce_notify_irq(void);
-void mce_notify_process(void);
 
 DECLARE_PER_CPU(struct mce, injectm);
 