
Commit fca515fb authored by Tony Luck

Pull pvops into release branch

parents 2b04be7e 4d58bbcc
+137 −0
Paravirt_ops on IA64
====================
                          21 May 2008, Isaku Yamahata <yamahata@valinux.co.jp>


Introduction
------------
This document aims to ease maintenance of paravirt_ops/IA64 and to
encourage people to use it.

paravirt_ops (pv_ops for short) is the virtualization support mechanism
of the Linux kernel on x86. Several approaches to virtualization support
were proposed, and paravirt_ops was the one adopted.
IA64 now also has several virtualization technologies, such as
kvm/IA64, xen/IA64 and many academic IA64 hypervisors, so it makes
sense to add a generic virtualization infrastructure to Linux/IA64.


What is paravirt_ops?
---------------------
paravirt_ops was developed on x86 to provide virtualization support
through an API rather than an ABI. It allows each hypervisor to
override, at the API level, the operations that matter to it, and it
allows a single kernel binary to run on all supported execution
environments, including the native machine.
Essentially, paravirt_ops is a set of function pointers representing
operations that correspond to low level sensitive instructions and to
high level functionality in various areas. One significant difference
from an ordinary function pointer table is that it allows optimization
by binary patching, because some of these operations are very
performance sensitive and the indirect call overhead is not negligible.
With binary patching, an indirect C function call can be transformed
into a direct C function call or into in-place execution, eliminating
the overhead.

Thus, the operations of paravirt_ops fall into three categories.
- simple indirect call
  These operations correspond to high level functionality, so the
  overhead of an indirect call isn't very important.

- indirect call which allows optimization with binary patch
  Usually these operations correspond to low level instructions. They
  are called frequently and are performance critical, so the overhead
  is very important.

- a set of macros for hand written assembly code
  Hand written assembly code (.S files) also needs paravirtualization
  because it includes sensitive instructions or because some of its
  code paths are very performance critical.
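
As a rough illustration of the first two categories, the following
user-space C sketch models an ops table with native defaults that a
hypervisor backend overrides at run time. All names (demo_cpu_ops,
demo_install_cpu_ops, the demo_hv_* backend) are hypothetical and are
not the kernel's real structures; binary patching is not modelled, only
the indirect call that patching would later remove.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified ops table; not the kernel's real layout. */
struct demo_cpu_ops {
	unsigned long (*get_psr_i)(void);          /* read interrupt-enable bit */
	void (*set_psr_i)(unsigned long enabled);  /* write interrupt-enable bit */
};

/* "Native" backend: touches the (simulated) hardware bit directly. */
static unsigned long demo_native_psr_i;
static unsigned long demo_native_get_psr_i(void) { return demo_native_psr_i; }
static void demo_native_set_psr_i(unsigned long enabled) { demo_native_psr_i = enabled; }

/* "Hypervisor" backend: uses a variable standing in for shared state. */
static unsigned long demo_shared_psr_i;
static unsigned long demo_hv_accesses;
static unsigned long demo_hv_get_psr_i(void) { demo_hv_accesses++; return demo_shared_psr_i; }
static void demo_hv_set_psr_i(unsigned long enabled) { demo_hv_accesses++; demo_shared_psr_i = enabled; }

const struct demo_cpu_ops demo_hv_cpu_ops = {
	.get_psr_i = demo_hv_get_psr_i,
	.set_psr_i = demo_hv_set_psr_i,
};

/* The active table starts out native, so one binary runs everywhere. */
static struct demo_cpu_ops demo_active_cpu_ops = {
	.get_psr_i = demo_native_get_psr_i,
	.set_psr_i = demo_native_set_psr_i,
};

/* A hypervisor overrides the operations early during boot. */
void demo_install_cpu_ops(const struct demo_cpu_ops *ops)
{
	assert(ops != NULL && ops->get_psr_i != NULL && ops->set_psr_i != NULL);
	demo_active_cpu_ops = *ops;
}

/* Callers never name a backend; they always go through the table. */
void demo_irq_enable(void) { demo_active_cpu_ops.set_psr_i(1); }
void demo_irq_disable(void) { demo_active_cpu_ops.set_psr_i(0); }
unsigned long demo_irqs_enabled(void) { return demo_active_cpu_ops.get_psr_i(); }
```

Every call above pays for one indirect call. That cost is acceptable
for the first category and is what binary patching is meant to remove
for the second.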


The relation to the IA64 machine vector
---------------------------------------
Linux/IA64 has the IA64 machine vector functionality, which allows the
kernel to switch implementations (e.g. initialization, ipi, dma api...)
depending on the platform it is running on.
Some implementations can be replaced very easily by defining a new
machine vector, so another approach to virtualization support would be
to enhance the machine vector functionality.
But the paravirt_ops approach was taken because
- virtualization support needs wider coverage than the machine vector
  provides, e.g. low level instruction paravirtualization. It must be
  initialized very early, before platform detection.

- virtualization support needs more functionality like binary patch.
  The calling overhead is probably not very large compared to the
  emulation overhead of virtualization. In the native case, however,
  the overhead should be eliminated completely.
  A single kernel binary should run in every environment including
  native, and the overhead of paravirt_ops in the native environment
  should be as small as possible.

- for full virtualization technology, e.g. KVM/IA64 or
  Xen/IA64 HVM domain, the result would be
  (the emulated platform machine vector. probably dig) + (pv_ops).
  This means that the virtualization support layer should be under
  the machine vector layer.

It might be better to move some function pointers from paravirt_ops to
the machine vector. In fact, the Xen domU case uses both pv_ops and the
machine vector.


IA64 paravirt_ops
-----------------
This section describes the concrete paravirt_ops.
Because of the architectural differences between ia64 and x86, the
resulting set of functions is very different from x86 pv_ops.

- C function pointer tables
They are not very performance critical, so a simple C indirect
function call is acceptable. The following structures are defined at
this moment. For details see linux/include/asm-ia64/paravirt.h
  - struct pv_info
    This structure describes the execution environment.
  - struct pv_init_ops
    This structure describes the various initialization hooks.
  - struct pv_iosapic_ops
    This structure describes hooks to iosapic operations.
  - struct pv_irq_ops
    This structure describes hooks to irq related operations.
  - struct pv_time_op
    This structure describes hooks to steal time accounting.

- a set of indirect calls which need optimization
Currently this class of functions corresponds to a subset of IA64
intrinsics. Optimization with binary patch isn't implemented yet.
struct pv_cpu_op is defined. For details see
linux/include/asm-ia64/paravirt_privop.h
Mostly they correspond to ia64 intrinsics 1-to-1.
Caveat: They are currently defined as C indirect function pointers, but
in order to support binary patch optimization, they will be
reimplemented using GCC extended inline assembly code.

- a set of macros for hand written assembly code (.S files)
For maintainability, the approach taken for .S files is to keep a single
source and compile it multiple times with different macro definitions.
Each pv_ops instance must define those macros in order to compile.
The important thing here is that sensitive, but non-privileged
instructions must be paravirtualized and that some privileged
instructions also need paravirtualization for reasonable performance.
Developers who modify .S files must be aware of that. A simple checker
is currently implemented to detect paravirtualization breakage, but it
doesn't cover all cases.

Sometimes this set of macros is called pv_cpu_asm_op, but there is no
corresponding structure in the source code.
Those macros mostly correspond 1:1 to a subset of privileged
instructions. See linux/include/asm-ia64/native/inst.h.
Some functions written in assembly also need to be overridden, so
each pv_ops instance has to define some macros for them. Again see
linux/include/asm-ia64/native/inst.h.
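
The single-source, multiple-compile approach can be sketched in plain C
with the preprocessor. This is only an analogy for the .S files: the
macro names (DEMO_PARAVIRT_XEN, DEMO_RSM_PSR_I, DEMO_ENTRY) and the
struct are hypothetical, and real pv_ops instances supply their own
definitions in their own headers.

```c
#include <assert.h>
#include <stddef.h>

/* Simulated CPU state; hypothetical, for illustration only. */
struct demo_cpu {
	int psr_i;       /* real interrupt-enable bit */
	int virt_psr_i;  /* virtual bit a hypervisor would track instead */
};

/*
 * Each pv_ops instance provides its own macro definitions.  Building
 * this file with -DDEMO_PARAVIRT_XEN selects the second set and a
 * different symbol name; the default build is the native instance.
 */
#ifdef DEMO_PARAVIRT_XEN
#define DEMO_RSM_PSR_I(cpu)	((cpu)->virt_psr_i = 0)
#define DEMO_ENTRY(name)	demo_xen_##name
#else
#define DEMO_RSM_PSR_I(cpu)	((cpu)->psr_i = 0)
#define DEMO_ENTRY(name)	demo_native_##name
#endif

/*
 * Shared body, written once.  It never uses the raw "instruction"
 * directly, only the macro, so every instance gets a correct variant.
 */
void DEMO_ENTRY(leave_kernel)(struct demo_cpu *cpu)
{
	assert(cpu != NULL);
	DEMO_RSM_PSR_I(cpu);	/* disable interrupts */
}
```

Compiled without extra flags this yields demo_native_leave_kernel(),
which clears only psr_i. The same source built with -DDEMO_PARAVIRT_XEN
would yield demo_xen_leave_kernel(), which clears only virt_psr_i.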


Those structures must be initialized very early, before start_kernel.
They are probably initialized in head.S using multiple entry points or
some other trick.
For the native case implementation see linux/arch/ia64/kernel/paravirt.c.
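
The head.S change in this commit selects a setup hook by indexing a
table with hypervisor_type, after a bounds check and a NULL check. The
C sketch below mirrors that logic under simplifying assumptions: the
demo_* names are hypothetical, and the real code runs in assembly
before any C environment exists.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for PARAVIRT_HYPERVISOR_TYPE_xxx. */
enum demo_hypervisor_type {
	DEMO_HYPERVISOR_TYPE_DEFAULT = 0,
	DEMO_HYPERVISOR_TYPE_XEN = 1,
	DEMO_HYPERVISOR_TYPE_COUNT
};

typedef void (*demo_setup_hook_t)(void);

static int demo_xen_setup_calls;
static void demo_xen_setup_hook(void) { demo_xen_setup_calls++; }

/* Must be in the same order as the hypervisor type values. */
static const demo_setup_hook_t demo_setup_hooks[] = {
	[DEMO_HYPERVISOR_TYPE_DEFAULT] = NULL,	/* nothing to do natively */
	[DEMO_HYPERVISOR_TYPE_XEN] = demo_xen_setup_hook,
};

#define DEMO_NUM_HOOKS (sizeof(demo_setup_hooks) / sizeof(demo_setup_hooks[0]))

/* Returns 1 if a hook was called, 0 otherwise. */
int demo_call_setup_hook(unsigned long type)
{
	demo_setup_hook_t hook;

	/* The table must cover every known hypervisor type. */
	assert(DEMO_NUM_HOOKS == DEMO_HYPERVISOR_TYPE_COUNT);

	if (type >= DEMO_NUM_HOOKS)	/* array size check */
		return 0;
	hook = demo_setup_hooks[type];
	if (hook == NULL)		/* no actual branch to NULL */
		return 0;
	hook();
	return 1;
}
```

With the default type no hook runs, so the native boot path is
unchanged. An out-of-range type is ignored rather than followed.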
+6 −0
@@ -100,3 +100,9 @@ define archhelp
  echo '  boot		- Build vmlinux and bootloader for Ski simulator'
  echo '* unwcheck	- Check vmlinux for invalid unwind info'
endef

archprepare: make_nr_irqs_h FORCE
PHONY += make_nr_irqs_h FORCE

make_nr_irqs_h: FORCE
	$(Q)$(MAKE) $(build)=arch/ia64/kernel include/asm-ia64/nr-irqs.h
+44 −0
@@ -36,6 +36,8 @@ obj-$(CONFIG_PCI_MSI) += msi_ia64.o
mca_recovery-y			+= mca_drv.o mca_drv_asm.o
obj-$(CONFIG_IA64_MC_ERR_INJECT)+= err_inject.o

obj-$(CONFIG_PARAVIRT)		+= paravirt.o paravirtentry.o

obj-$(CONFIG_IA64_ESI)		+= esi.o
ifneq ($(CONFIG_IA64_ESI),)
obj-y				+= esi_stub.o	# must be in kernel proper
@@ -70,3 +72,45 @@ $(obj)/gate-syms.o: $(obj)/gate.lds $(obj)/gate.o FORCE
# We must build gate.so before we can assemble it.
# Note: kbuild does not track this dependency due to usage of .incbin
$(obj)/gate-data.o: $(obj)/gate.so

# Calculate NR_IRQ = max(IA64_NATIVE_NR_IRQS, XEN_NR_IRQS, ...) based on config
define sed-y
	"/^->/{s:^->\([^ ]*\) [\$$#]*\([^ ]*\) \(.*\):#define \1 \2 /* \3 */:; s:->::; p;}"
endef
quiet_cmd_nr_irqs = GEN     $@
define cmd_nr_irqs
	(set -e; \
	 echo "#ifndef __ASM_NR_IRQS_H__"; \
	 echo "#define __ASM_NR_IRQS_H__"; \
	 echo "/*"; \
	 echo " * DO NOT MODIFY."; \
	 echo " *"; \
	 echo " * This file was generated by Kbuild"; \
	 echo " *"; \
	 echo " */"; \
	 echo ""; \
	 sed -ne $(sed-y) $<; \
	 echo ""; \
	 echo "#endif" ) > $@
endef

# We use internal kbuild rules to avoid the "is up to date" message from make
arch/$(SRCARCH)/kernel/nr-irqs.s: $(srctree)/arch/$(SRCARCH)/kernel/nr-irqs.c \
				$(wildcard $(srctree)/include/asm-ia64/*/irq.h)
	$(Q)mkdir -p $(dir $@)
	$(call if_changed_dep,cc_s_c)

include/asm-ia64/nr-irqs.h: arch/$(SRCARCH)/kernel/nr-irqs.s
	$(Q)mkdir -p $(dir $@)
	$(call cmd,nr_irqs)

clean-files += $(objtree)/include/asm-ia64/nr-irqs.h

#
# native ivt.S and entry.S
#
ASM_PARAVIRT_OBJS = ivt.o entry.o
define paravirtualized_native
AFLAGS_$(1) += -D__IA64_ASM_PARAVIRTUALIZED_NATIVE
endef
$(foreach obj,$(ASM_PARAVIRT_OBJS),$(eval $(call paravirtualized_native,$(obj))))
+72 −43
@@ -22,6 +22,11 @@
 * Patrick O'Rourke	<orourke@missioncriticallinux.com>
 * 11/07/2000
 */
/*
 * Copyright (c) 2008 Isaku Yamahata <yamahata at valinux co jp>
 *                    VA Linux Systems Japan K.K.
 *                    pv_ops.
 */
/*
 * Global (preserved) predicate usage on syscall entry/exit path:
 *
@@ -45,6 +50,7 @@

#include "minstate.h"

#ifdef __IA64_ASM_PARAVIRTUALIZED_NATIVE
	/*
	 * execve() is special because in case of success, we need to
	 * setup a null register window frame.
@@ -173,6 +179,7 @@ GLOBAL_ENTRY(sys_clone)
	mov rp=loc0
	br.ret.sptk.many rp
END(sys_clone)
#endif /* __IA64_ASM_PARAVIRTUALIZED_NATIVE */

/*
 * prev_task <- ia64_switch_to(struct task_struct *next)
@@ -180,7 +187,7 @@ END(sys_clone)
 *	called.  The code starting at .map relies on this.  The rest of the code
 *	doesn't care about the interrupt masking status.
 */
GLOBAL_ENTRY(ia64_switch_to)
GLOBAL_ENTRY(__paravirt_switch_to)
	.prologue
	alloc r16=ar.pfs,1,0,0,0
	DO_SAVE_SWITCH_STACK
@@ -204,7 +211,7 @@ GLOBAL_ENTRY(ia64_switch_to)
	;;
.done:
	ld8 sp=[r21]			// load kernel stack pointer of new task
	mov IA64_KR(CURRENT)=in0	// update "current" application register
	MOV_TO_KR(CURRENT, in0, r8, r9)		// update "current" application register
	mov r8=r13			// return pointer to previously running task
	mov r13=in0			// set "current" pointer
	;;
@@ -216,26 +223,25 @@ GLOBAL_ENTRY(ia64_switch_to)
	br.ret.sptk.many rp		// boogie on out in new context

.map:
	rsm psr.ic			// interrupts (psr.i) are already disabled here
	RSM_PSR_IC(r25)			// interrupts (psr.i) are already disabled here
	movl r25=PAGE_KERNEL
	;;
	srlz.d
	or r23=r25,r20			// construct PA | page properties
	mov r25=IA64_GRANULE_SHIFT<<2
	;;
	mov cr.itir=r25
	mov cr.ifa=in0			// VA of next task...
	MOV_TO_ITIR(p0, r25, r8)
	MOV_TO_IFA(in0, r8)		// VA of next task...
	;;
	mov r25=IA64_TR_CURRENT_STACK
	mov IA64_KR(CURRENT_STACK)=r26	// remember last page we mapped...
	MOV_TO_KR(CURRENT_STACK, r26, r8, r9)	// remember last page we mapped...
	;;
	itr.d dtr[r25]=r23		// wire in new mapping...
	ssm psr.ic			// reenable the psr.ic bit
	;;
	srlz.d
	SSM_PSR_IC_AND_SRLZ_D(r8, r9)	// reenable the psr.ic bit
	br.cond.sptk .done
END(ia64_switch_to)
END(__paravirt_switch_to)

#ifdef __IA64_ASM_PARAVIRTUALIZED_NATIVE
/*
 * Note that interrupts are enabled during save_switch_stack and load_switch_stack.  This
 * means that we may get an interrupt with "sp" pointing to the new kernel stack while
@@ -375,7 +381,7 @@ END(save_switch_stack)
 *	- b7 holds address to return to
 *	- must not touch r8-r11
 */
ENTRY(load_switch_stack)
GLOBAL_ENTRY(load_switch_stack)
	.prologue
	.altrp b7

@@ -571,7 +577,7 @@ GLOBAL_ENTRY(ia64_trace_syscall)
.ret3:
(pUStk)	cmp.eq.unc p6,p0=r0,r0			// p6 <- pUStk
(pUStk)	rsm psr.i				// disable interrupts
	br.cond.sptk .work_pending_syscall_end
	br.cond.sptk ia64_work_pending_syscall_end

strace_error:
	ld8 r3=[r2]				// load pt_regs.r8
@@ -636,8 +642,17 @@ GLOBAL_ENTRY(ia64_ret_from_syscall)
	adds r2=PT(R8)+16,sp			// r2 = &pt_regs.r8
	mov r10=r0				// clear error indication in r10
(p7)	br.cond.spnt handle_syscall_error	// handle potential syscall failure
#ifdef CONFIG_PARAVIRT
	;;
	br.cond.sptk.few ia64_leave_syscall
	;;
#endif /* CONFIG_PARAVIRT */
END(ia64_ret_from_syscall)
#ifndef CONFIG_PARAVIRT
	// fall through
#endif
#endif /* __IA64_ASM_PARAVIRTUALIZED_NATIVE */

/*
 * ia64_leave_syscall(): Same as ia64_leave_kernel, except that it doesn't
 *	need to switch to bank 0 and doesn't restore the scratch registers.
@@ -682,7 +697,7 @@ END(ia64_ret_from_syscall)
 *	      ar.csd: cleared
 *	      ar.ssd: cleared
 */
ENTRY(ia64_leave_syscall)
GLOBAL_ENTRY(__paravirt_leave_syscall)
	PT_REGS_UNWIND_INFO(0)
	/*
	 * work.need_resched etc. mustn't get changed by this CPU before it returns to
@@ -692,11 +707,11 @@ ENTRY(ia64_leave_syscall)
	 * extra work.  We always check for extra work when returning to user-level.
	 * With CONFIG_PREEMPT, we also check for extra work when the preempt_count
	 * is 0.  After extra work processing has been completed, execution
	 * resumes at .work_processed_syscall with p6 set to 1 if the extra-work-check
	 * resumes at ia64_work_processed_syscall with p6 set to 1 if the extra-work-check
	 * needs to be redone.
	 */
#ifdef CONFIG_PREEMPT
	rsm psr.i				// disable interrupts
	RSM_PSR_I(p0, r2, r18)			// disable interrupts
	cmp.eq pLvSys,p0=r0,r0			// pLvSys=1: leave from syscall
(pKStk) adds r20=TI_PRE_COUNT+IA64_TASK_SIZE,r13
	;;
@@ -706,11 +721,12 @@ ENTRY(ia64_leave_syscall)
	;;
	cmp.eq p6,p0=r21,r0		// p6 <- pUStk || (preempt_count == 0)
#else /* !CONFIG_PREEMPT */
(pUStk)	rsm psr.i
	RSM_PSR_I(pUStk, r2, r18)
	cmp.eq pLvSys,p0=r0,r0		// pLvSys=1: leave from syscall
(pUStk)	cmp.eq.unc p6,p0=r0,r0		// p6 <- pUStk
#endif
.work_processed_syscall:
.global __paravirt_work_processed_syscall;
__paravirt_work_processed_syscall:
#ifdef CONFIG_VIRT_CPU_ACCOUNTING
	adds r2=PT(LOADRS)+16,r12
(pUStk)	mov.m r22=ar.itc			// fetch time at leave
@@ -744,7 +760,7 @@ ENTRY(ia64_leave_syscall)
(pNonSys) break 0		//      bug check: we shouldn't be here if pNonSys is TRUE!
	;;
	invala			// M0|1 invalidate ALAT
	rsm psr.i | psr.ic	// M2   turn off interrupts and interruption collection
	RSM_PSR_I_IC(r28, r29, r30)	// M2   turn off interrupts and interruption collection
	cmp.eq p9,p0=r0,r0	// A    set p9 to indicate that we should restore cr.ifs

	ld8 r29=[r2],16		// M0|1 load cr.ipsr
@@ -765,7 +781,7 @@ ENTRY(ia64_leave_syscall)
	;;
#endif
	ld8 r26=[r2],PT(B0)-PT(AR_PFS)	// M0|1 load ar.pfs
(pKStk)	mov r22=psr			// M2   read PSR now that interrupts are disabled
	MOV_FROM_PSR(pKStk, r22, r21)	// M2   read PSR now that interrupts are disabled
	nop 0
	;;
	ld8 r21=[r2],PT(AR_RNAT)-PT(B0) // M0|1 load b0
@@ -798,7 +814,7 @@ ENTRY(ia64_leave_syscall)

	srlz.d				// M0   ensure interruption collection is off (for cover)
	shr.u r18=r19,16		// I0|1 get byte size of existing "dirty" partition
	cover				// B    add current frame into dirty partition & set cr.ifs
	COVER				// B    add current frame into dirty partition & set cr.ifs
	;;
#ifdef CONFIG_VIRT_CPU_ACCOUNTING
	mov r19=ar.bsp			// M2   get new backing store pointer
@@ -823,8 +839,9 @@ ENTRY(ia64_leave_syscall)
	mov.m ar.ssd=r0			// M2   clear ar.ssd
	mov f11=f0			// F    clear f11
	br.cond.sptk.many rbs_switch	// B
END(ia64_leave_syscall)
END(__paravirt_leave_syscall)

#ifdef __IA64_ASM_PARAVIRTUALIZED_NATIVE
#ifdef CONFIG_IA32_SUPPORT
GLOBAL_ENTRY(ia64_ret_from_ia32_execve)
	PT_REGS_UNWIND_INFO(0)
@@ -835,10 +852,20 @@ GLOBAL_ENTRY(ia64_ret_from_ia32_execve)
	st8.spill [r2]=r8	// store return value in slot for r8 and set unat bit
	.mem.offset 8,0
	st8.spill [r3]=r0	// clear error indication in slot for r10 and set unat bit
#ifdef CONFIG_PARAVIRT
	;;
	// don't fall through, ia64_leave_kernel may be #define'd
	br.cond.sptk.few ia64_leave_kernel
	;;
#endif /* CONFIG_PARAVIRT */
END(ia64_ret_from_ia32_execve)
#ifndef CONFIG_PARAVIRT
	// fall through
#endif
#endif /* CONFIG_IA32_SUPPORT */
GLOBAL_ENTRY(ia64_leave_kernel)
#endif /* __IA64_ASM_PARAVIRTUALIZED_NATIVE */

GLOBAL_ENTRY(__paravirt_leave_kernel)
	PT_REGS_UNWIND_INFO(0)
	/*
	 * work.need_resched etc. mustn't get changed by this CPU before it returns to
@@ -852,7 +879,7 @@ GLOBAL_ENTRY(ia64_leave_kernel)
	 * needs to be redone.
	 */
#ifdef CONFIG_PREEMPT
	rsm psr.i				// disable interrupts
	RSM_PSR_I(p0, r17, r31)			// disable interrupts
	cmp.eq p0,pLvSys=r0,r0			// pLvSys=0: leave from kernel
(pKStk)	adds r20=TI_PRE_COUNT+IA64_TASK_SIZE,r13
	;;
@@ -862,7 +889,7 @@ GLOBAL_ENTRY(ia64_leave_kernel)
	;;
	cmp.eq p6,p0=r21,r0		// p6 <- pUStk || (preempt_count == 0)
#else
(pUStk)	rsm psr.i
	RSM_PSR_I(pUStk, r17, r31)
	cmp.eq p0,pLvSys=r0,r0		// pLvSys=0: leave from kernel
(pUStk)	cmp.eq.unc p6,p0=r0,r0		// p6 <- pUStk
#endif
@@ -910,7 +937,7 @@ GLOBAL_ENTRY(ia64_leave_kernel)
	mov ar.csd=r30
	mov ar.ssd=r31
	;;
	rsm psr.i | psr.ic	// initiate turning off of interrupt and interruption collection
	RSM_PSR_I_IC(r23, r22, r25)	// initiate turning off of interrupt and interruption collection
	invala			// invalidate ALAT
	;;
	ld8.fill r22=[r2],24
@@ -942,7 +969,7 @@ GLOBAL_ENTRY(ia64_leave_kernel)
	mov ar.ccv=r15
	;;
	ldf.fill f11=[r2]
	bsw.0			// switch back to bank 0 (no stop bit required beforehand...)
	BSW_0(r2, r3, r15)	// switch back to bank 0 (no stop bit required beforehand...)
	;;
(pUStk)	mov r18=IA64_KR(CURRENT)// M2 (12 cycle read latency)
	adds r16=PT(CR_IPSR)+16,r12
@@ -950,12 +977,12 @@ GLOBAL_ENTRY(ia64_leave_kernel)

#ifdef CONFIG_VIRT_CPU_ACCOUNTING
	.pred.rel.mutex pUStk,pKStk
(pKStk)	mov r22=psr		// M2 read PSR now that interrupts are disabled
	MOV_FROM_PSR(pKStk, r22, r29)	// M2 read PSR now that interrupts are disabled
(pUStk)	mov.m r22=ar.itc	// M  fetch time at leave
	nop.i 0
	;;
#else
(pKStk)	mov r22=psr		// M2 read PSR now that interrupts are disabled
	MOV_FROM_PSR(pKStk, r22, r29)	// M2 read PSR now that interrupts are disabled
	nop.i 0
	nop.i 0
	;;
@@ -1027,7 +1054,7 @@ GLOBAL_ENTRY(ia64_leave_kernel)
	 * NOTE: alloc, loadrs, and cover can't be predicated.
	 */
(pNonSys) br.cond.dpnt dont_preserve_current_frame
	cover				// add current frame into dirty partition and set cr.ifs
	COVER				// add current frame into dirty partition and set cr.ifs
	;;
	mov r19=ar.bsp			// get new backing store pointer
rbs_switch:
@@ -1130,16 +1157,16 @@ skip_rbs_switch:
(pKStk)	dep r29=r22,r29,21,1	// I0 update ipsr.pp with psr.pp
(pLvSys)mov r16=r0		// A  clear r16 for leave_syscall, no-op otherwise
	;;
	mov cr.ipsr=r29		// M2
	MOV_TO_IPSR(p0, r29, r25)	// M2
	mov ar.pfs=r26		// I0
(pLvSys)mov r17=r0		// A  clear r17 for leave_syscall, no-op otherwise

(p9)	mov cr.ifs=r30		// M2
	MOV_TO_IFS(p9, r30, r25)// M2
	mov b0=r21		// I0
(pLvSys)mov r18=r0		// A  clear r18 for leave_syscall, no-op otherwise

	mov ar.fpsr=r20		// M2
	mov cr.iip=r28		// M2
	MOV_TO_IIP(r28, r25)	// M2
	nop 0
	;;
(pUStk)	mov ar.rnat=r24		// M2 must happen with RSE in lazy mode
@@ -1148,7 +1175,7 @@ skip_rbs_switch:

	mov ar.rsc=r27		// M2
	mov pr=r31,-1		// I0
	rfi			// B
	RFI			// B

	/*
	 * On entry:
@@ -1174,35 +1201,36 @@ skip_rbs_switch:
	;;
(pKStk) st4 [r20]=r21
#endif
	ssm psr.i		// enable interrupts
	SSM_PSR_I(p0, p6, r2)	// enable interrupts
	br.call.spnt.many rp=schedule
.ret9:	cmp.eq p6,p0=r0,r0	// p6 <- 1 (re-check)
	rsm psr.i		// disable interrupts
	RSM_PSR_I(p0, r2, r20)	// disable interrupts
	;;
#ifdef CONFIG_PREEMPT
(pKStk)	adds r20=TI_PRE_COUNT+IA64_TASK_SIZE,r13
	;;
(pKStk)	st4 [r20]=r0		// preempt_count() <- 0
#endif
(pLvSys)br.cond.sptk.few  .work_pending_syscall_end
(pLvSys)br.cond.sptk.few  __paravirt_pending_syscall_end
	br.cond.sptk.many .work_processed_kernel

.notify:
(pUStk)	br.call.spnt.many rp=notify_resume_user
.ret10:	cmp.ne p6,p0=r0,r0	// p6 <- 0 (don't re-check)
(pLvSys)br.cond.sptk.few  .work_pending_syscall_end
(pLvSys)br.cond.sptk.few  __paravirt_pending_syscall_end
	br.cond.sptk.many .work_processed_kernel

.work_pending_syscall_end:
.global __paravirt_pending_syscall_end;
__paravirt_pending_syscall_end:
	adds r2=PT(R8)+16,r12
	adds r3=PT(R10)+16,r12
	;;
	ld8 r8=[r2]
	ld8 r10=[r3]
	br.cond.sptk.many .work_processed_syscall

END(ia64_leave_kernel)
	br.cond.sptk.many __paravirt_work_processed_syscall_target
END(__paravirt_leave_kernel)

#ifdef __IA64_ASM_PARAVIRTUALIZED_NATIVE
ENTRY(handle_syscall_error)
	/*
	 * Some system calls (e.g., ptrace, mmap) can return arbitrary values which could
@@ -1244,7 +1272,7 @@ END(ia64_invoke_schedule_tail)
	 * We declare 8 input registers so the system call args get preserved,
	 * in case we need to restart a system call.
	 */
ENTRY(notify_resume_user)
GLOBAL_ENTRY(notify_resume_user)
	.prologue ASM_UNW_PRLG_RP|ASM_UNW_PRLG_PFS, ASM_UNW_PRLG_GRSAVE(8)
	alloc loc1=ar.pfs,8,2,3,0 // preserve all eight input regs in case of syscall restart!
	mov r9=ar.unat
@@ -1306,7 +1334,7 @@ ENTRY(sys_rt_sigreturn)
	adds sp=16,sp
	;;
	ld8 r9=[sp]				// load new ar.unat
	mov.sptk b7=r8,ia64_leave_kernel
	mov.sptk b7=r8,ia64_native_leave_kernel
	;;
	mov ar.unat=r9
	br.many b7
@@ -1665,3 +1693,4 @@ sys_call_table:
	data8 sys_timerfd_gettime

	.org sys_call_table + 8*NR_syscalls	// guard against failures to increase NR_syscalls
#endif /* __IA64_ASM_PARAVIRTUALIZED_NATIVE */
+41 −0
@@ -26,11 +26,14 @@
#include <asm/mmu_context.h>
#include <asm/asm-offsets.h>
#include <asm/pal.h>
#include <asm/paravirt.h>
#include <asm/pgtable.h>
#include <asm/processor.h>
#include <asm/ptrace.h>
#include <asm/system.h>
#include <asm/mca_asm.h>
#include <linux/init.h>
#include <linux/linkage.h>

#ifdef CONFIG_HOTPLUG_CPU
#define SAL_PSR_BITS_TO_SET				\
@@ -367,6 +370,44 @@ start_ap:
	;;
(isBP)	st8 [r2]=r28		// save the address of the boot param area passed by the bootloader

#ifdef CONFIG_PARAVIRT

	movl r14=hypervisor_setup_hooks
	movl r15=hypervisor_type
	mov r16=num_hypervisor_hooks
	;;
	ld8 r2=[r15]
	;;
	cmp.ltu p7,p0=r2,r16	// array size check
	shladd r8=r2,3,r14
	;;
(p7)	ld8 r9=[r8]
	;;
(p7)	mov b1=r9
(p7)	cmp.ne.unc p7,p0=r9,r0	// no actual branch to NULL
	;;
(p7)	br.call.sptk.many rp=b1

	__INITDATA

default_setup_hook = 0		// Currently nothing needs to be done.

	.weak xen_setup_hook

	.global hypervisor_type
hypervisor_type:
	data8		PARAVIRT_HYPERVISOR_TYPE_DEFAULT

	// must have the same order with PARAVIRT_HYPERVISOR_TYPE_xxx

hypervisor_setup_hooks:
	data8		default_setup_hook
	data8		xen_setup_hook
num_hypervisor_hooks = (. - hypervisor_setup_hooks) / 8
	.previous

#endif

#ifdef CONFIG_SMP
(isAP)	br.call.sptk.many rp=start_secondary
.ret0: