Documentation/ia64/paravirt_ops.txt  (new file, 0 → 100644, +137 −0)

Paravirt_ops on IA64
====================
21 May 2008, Isaku Yamahata <yamahata@valinux.co.jp>


Introduction
------------
The aim of this documentation is to help with maintainability and to
encourage people to use paravirt_ops/IA64.

paravirt_ops (pv_ops for short) is a framework for virtualization support
in the Linux kernel on x86. Several approaches to virtualization support
were proposed, and paravirt_ops emerged as the winner. Meanwhile, several
IA64 virtualization technologies now exist, such as kvm/IA64 and xen/IA64,
along with many academic IA64 hypervisors, so it makes sense to add a
generic virtualization infrastructure to Linux/IA64.


What is paravirt_ops?
---------------------
paravirt_ops was developed on x86 as virtualization support via an API,
not an ABI. It allows each hypervisor to override, at the API level, the
operations which are important for hypervisors, and it allows a single
kernel binary to run in all supported execution environments, including
on a native machine.

Essentially, paravirt_ops is a set of function pointers which represent
operations corresponding to low-level sensitive instructions and to
high-level functionality in various areas. One significant difference
from the usual function pointer table is that it allows optimization via
binary patching: some of these operations are very performance sensitive,
and the indirect-call overhead is not negligible. With binary patching,
an indirect C function call can be transformed into a direct C function
call, or into in-place execution, to eliminate that overhead.

Thus, the operations of paravirt_ops are classified into three categories:

- simple indirect call
  These operations correspond to high-level functionality, so the
  overhead of an indirect call is not very important.

- indirect call which allows optimization with binary patch
  Usually these operations correspond to low-level instructions.
  They are called frequently and are performance critical, so the
  overhead matters a great deal.

- a set of macros for hand-written assembly code
  Hand-written assembly code (.S files) also needs paravirtualization,
  because it contains sensitive instructions, and some of its code paths
  are very performance critical.


The relation to the IA64 machine vector
---------------------------------------
Linux/IA64 has the IA64 machine vector functionality, which allows the
kernel to switch implementations (e.g. initialization, ipi, dma api...)
depending on the platform it is executing on. We can replace some
implementations very easily by defining a new machine vector, so another
approach to virtualization support would be to enhance the machine vector
functionality. The paravirt_ops approach was taken instead because:

- Virtualization support needs wider coverage than the machine vector
  provides, e.g. low-level instruction paravirtualization. It must be
  initialized very early, before platform detection.

- Virtualization support needs more functionality, such as binary
  patching. The calling overhead is probably not very large compared to
  the emulation overhead of virtualization; in the native case, however,
  the overhead should be eliminated completely. A single kernel binary
  should run in every environment, including native, and the overhead of
  paravirt_ops in the native environment should be as small as possible.

- For full virtualization technologies, e.g. KVM/IA64 or a Xen/IA64 HVM
  domain, the result would be (the emulated platform machine vector,
  probably dig) + (pv_ops). This means that the virtualization support
  layer should sit below the machine vector layer.

Possibly it might be better to move some function pointers from
paravirt_ops to the machine vector. In fact, the Xen domU case utilizes
both pv_ops and the machine vector.


IA64 paravirt_ops
-----------------
In this section, the concrete paravirt_ops will be discussed.
Because of the architectural differences between ia64 and x86, the
resulting set of functions is very different from the x86 pv_ops.

- C function pointer tables
  These are not very performance critical, so a simple C indirect
  function call is acceptable. The following structures are defined at
  this moment; for details see linux/include/asm-ia64/paravirt.h.

  - struct pv_info
    This structure describes the execution environment.
  - struct pv_init_ops
    This structure describes the various initialization hooks.
  - struct pv_iosapic_ops
    This structure describes hooks into iosapic operations.
  - struct pv_irq_ops
    This structure describes hooks into irq-related operations.
  - struct pv_time_ops
    This structure describes hooks into steal-time accounting.

- a set of indirect calls which need optimization
  Currently this class of functions corresponds to a subset of the IA64
  intrinsics. At this moment the optimization with binary patching is
  not implemented yet. struct pv_cpu_ops is defined; for details see
  linux/include/asm-ia64/paravirt_privop.h. Mostly these correspond
  1-to-1 to the ia64 intrinsics.
  Caveat: they are currently defined as C indirect function pointers,
  but in order to support binary patch optimization, they will be
  changed to use GCC extended inline assembly code.

- a set of macros for hand-written assembly code (.S files)
  For maintainability, the approach taken for .S files is to keep a
  single source file and compile it multiple times with different macro
  definitions. Each pv_ops instance must define those macros in order to
  compile. The important thing here is that sensitive but non-privileged
  instructions must be paravirtualized, and that some privileged
  instructions also need paravirtualization for reasonable performance.
  Developers who modify .S files must be aware of that. At this moment
  an easy checker is implemented to detect paravirtualization breakage,
  but it does not cover all the cases.

  Sometimes this set of macros is called pv_cpu_asm_ops, but there is no
  corresponding structure in the source code.
  Those macros mostly correspond 1:1 to a subset of the privileged
  instructions; see linux/include/asm-ia64/native/inst.h. Some functions
  written in assembly also need to be overridden, so each pv_ops
  instance has to define some additional macros. Again, see
  linux/include/asm-ia64/native/inst.h.

Those structures must be initialized very early, before start_kernel,
probably in head.S using multiple entry points or some other trick. For
the native-case implementation, see linux/arch/ia64/kernel/paravirt.c.


arch/ia64/Makefile  (+6 −0)

@@ -100,3 +100,9 @@ define archhelp
   echo '  boot		- Build vmlinux and bootloader for Ski simulator'
   echo '* unwcheck	- Check vmlinux for invalid unwind info'
 endef
+
+archprepare: make_nr_irqs_h FORCE
+PHONY += make_nr_irqs_h FORCE
+
+make_nr_irqs_h: FORCE
+	$(Q)$(MAKE) $(build)=arch/ia64/kernel include/asm-ia64/nr-irqs.h


arch/ia64/kernel/Makefile  (+44 −0)

@@ -36,6 +36,8 @@ obj-$(CONFIG_PCI_MSI)	+= msi_ia64.o
 mca_recovery-y			+= mca_drv.o mca_drv_asm.o
 obj-$(CONFIG_IA64_MC_ERR_INJECT)+= err_inject.o
 
+obj-$(CONFIG_PARAVIRT)		+= paravirt.o paravirtentry.o
+
 obj-$(CONFIG_IA64_ESI)		+= esi.o
 ifneq ($(CONFIG_IA64_ESI),)
 obj-y				+= esi_stub.o	# must be in kernel proper
@@ -70,3 +72,45 @@ $(obj)/gate-syms.o: $(obj)/gate.lds $(obj)/gate.o FORCE
 # We must build gate.so before we can assemble it.
 # Note: kbuild does not track this dependency due to usage of .incbin
 $(obj)/gate-data.o: $(obj)/gate.so
+
+# Calculate NR_IRQ = max(IA64_NATIVE_NR_IRQS, XEN_NR_IRQS, ...)
+# based on config
+define sed-y
+	"/^->/{s:^->\([^ ]*\) [\$$#]*\([^ ]*\) \(.*\):#define \1 \2 /* \3 */:; s:->::; p;}"
+endef
+quiet_cmd_nr_irqs = GEN     $@
+define cmd_nr_irqs
+	(set -e; \
+	 echo "#ifndef __ASM_NR_IRQS_H__"; \
+	 echo "#define __ASM_NR_IRQS_H__"; \
+	 echo "/*"; \
+	 echo " * DO NOT MODIFY."; \
+	 echo " *"; \
+	 echo " * This file was generated by Kbuild"; \
+	 echo " *"; \
+	 echo " */"; \
+	 echo ""; \
+	 sed -ne $(sed-y) $<; \
+	 echo ""; \
+	 echo "#endif" ) > $@
+endef
+
+# We use internal kbuild rules to avoid the "is up to date" message from make
+arch/$(SRCARCH)/kernel/nr-irqs.s: $(srctree)/arch/$(SRCARCH)/kernel/nr-irqs.c \
+			$(wildcard $(srctree)/include/asm-ia64/*/irq.h)
+	$(Q)mkdir -p $(dir $@)
+	$(call if_changed_dep,cc_s_c)
+
+include/asm-ia64/nr-irqs.h: arch/$(SRCARCH)/kernel/nr-irqs.s
+	$(Q)mkdir -p $(dir $@)
+	$(call cmd,nr_irqs)
+clean-files += $(objtree)/include/asm-ia64/nr-irqs.h
+
+#
+# native ivt.S and entry.S
+#
+ASM_PARAVIRT_OBJS = ivt.o entry.o
+define paravirtualized_native
+AFLAGS_$(1) += -D__IA64_ASM_PARAVIRTUALIZED_NATIVE
+endef
+$(foreach obj,$(ASM_PARAVIRT_OBJS),$(eval $(call paravirtualized_native,$(obj))))


arch/ia64/kernel/entry.S  (+72 −43)

@@ -22,6 +22,11 @@
  *	Patrick O'Rourke	<orourke@missioncriticallinux.com>
  *	11/07/2000
  */
+/*
+ * Copyright (c) 2008 Isaku Yamahata <yamahata at valinux co jp>
+ *                    VA Linux Systems Japan K.K.
+ *                    pv_ops.
+ */
 /*
  * Global (preserved) predicate usage on syscall entry/exit path:
  *
@@ -45,6 +50,7 @@
 
 #include "minstate.h"
 
+#ifdef __IA64_ASM_PARAVIRTUALIZED_NATIVE
 	/*
 	 * execve() is special because in case of success, we need to
 	 * setup a null register window frame.
@@ -173,6 +179,7 @@ GLOBAL_ENTRY(sys_clone)
 	mov rp=loc0
 	br.ret.sptk.many rp
 END(sys_clone)
+#endif /* __IA64_ASM_PARAVIRTUALIZED_NATIVE */
 
 /*
  * prev_task <- ia64_switch_to(struct task_struct *next)
@@ -180,7 +187,7 @@ END(sys_clone)
  * called.  The code starting at .map relies on this.
 * The rest of the code
 * doesn't care about the interrupt masking status.
 */
-GLOBAL_ENTRY(ia64_switch_to)
+GLOBAL_ENTRY(__paravirt_switch_to)
 	.prologue
 	alloc r16=ar.pfs,1,0,0,0
 	DO_SAVE_SWITCH_STACK
@@ -204,7 +211,7 @@ GLOBAL_ENTRY(ia64_switch_to)
 	;;
 .done:
 	ld8 sp=[r21]			// load kernel stack pointer of new task
-	mov IA64_KR(CURRENT)=in0	// update "current" application register
+	MOV_TO_KR(CURRENT, in0, r8, r9)	// update "current" application register
 	mov r8=r13			// return pointer to previously running task
 	mov r13=in0			// set "current" pointer
 	;;
@@ -216,26 +223,25 @@ GLOBAL_ENTRY(ia64_switch_to)
 	br.ret.sptk.many rp		// boogie on out in new context
 
 .map:
-	rsm psr.ic			// interrupts (psr.i) are already disabled here
+	RSM_PSR_IC(r25)			// interrupts (psr.i) are already disabled here
 	movl r25=PAGE_KERNEL
 	;;
 	srlz.d
 	or r23=r25,r20			// construct PA | page properties
 	mov r25=IA64_GRANULE_SHIFT<<2
 	;;
-	mov cr.itir=r25
-	mov cr.ifa=in0			// VA of next task...
+	MOV_TO_ITIR(p0, r25, r8)
+	MOV_TO_IFA(in0, r8)		// VA of next task...
 	;;
 	mov r25=IA64_TR_CURRENT_STACK
-	mov IA64_KR(CURRENT_STACK)=r26	// remember last page we mapped...
+	MOV_TO_KR(CURRENT_STACK, r26, r8, r9)	// remember last page we mapped...
 	;;
 	itr.d dtr[r25]=r23		// wire in new mapping...
-	ssm psr.ic			// reenable the psr.ic bit
-	;;
-	srlz.d
+	SSM_PSR_IC_AND_SRLZ_D(r8, r9)	// reenable the psr.ic bit
 	br.cond.sptk .done
-END(ia64_switch_to)
+END(__paravirt_switch_to)
 
+#ifdef __IA64_ASM_PARAVIRTUALIZED_NATIVE
 /*
  * Note that interrupts are enabled during save_switch_stack and load_switch_stack.
 * This
 * means that we may get an interrupt with "sp" pointing to the new kernel stack while
@@ -375,7 +381,7 @@ END(save_switch_stack)
  *	- b7 holds address to return to
  *	- must not touch r8-r11
  */
-ENTRY(load_switch_stack)
+GLOBAL_ENTRY(load_switch_stack)
 	.prologue
 	.altrp b7
@@ -571,7 +577,7 @@ GLOBAL_ENTRY(ia64_trace_syscall)
 .ret3:
 (pUStk)	cmp.eq.unc p6,p0=r0,r0			// p6 <- pUStk
 (pUStk)	rsm psr.i				// disable interrupts
-	br.cond.sptk .work_pending_syscall_end
+	br.cond.sptk ia64_work_pending_syscall_end
 
 strace_error:
 	ld8 r3=[r2]				// load pt_regs.r8
@@ -636,8 +642,17 @@ GLOBAL_ENTRY(ia64_ret_from_syscall)
 	adds r2=PT(R8)+16,sp			// r2 = &pt_regs.r8
 	mov r10=r0				// clear error indication in r10
 (p7)	br.cond.spnt handle_syscall_error	// handle potential syscall failure
+#ifdef CONFIG_PARAVIRT
+	;;
+	br.cond.sptk.few ia64_leave_syscall
+	;;
+#endif /* CONFIG_PARAVIRT */
 END(ia64_ret_from_syscall)
+#ifndef CONFIG_PARAVIRT
 	// fall through
+#endif
+#endif /* __IA64_ASM_PARAVIRTUALIZED_NATIVE */
+
 /*
  * ia64_leave_syscall(): Same as ia64_leave_kernel, except that it doesn't
  * need to switch to bank 0 and doesn't restore the scratch registers.
@@ -682,7 +697,7 @@ END(ia64_ret_from_syscall)
  *	      ar.csd: cleared
  *	      ar.ssd: cleared
  */
-ENTRY(ia64_leave_syscall)
+GLOBAL_ENTRY(__paravirt_leave_syscall)
 	PT_REGS_UNWIND_INFO(0)
 	/*
 	 * work.need_resched etc. mustn't get changed by this CPU before it returns to
@@ -692,11 +707,11 @@ ENTRY(ia64_leave_syscall)
 	 * extra work.  We always check for extra work when returning to user-level.
 	 * With CONFIG_PREEMPT, we also check for extra work when the preempt_count
 	 * is 0.  After extra work processing has been completed, execution
-	 * resumes at .work_processed_syscall with p6 set to 1 if the extra-work-check
+	 * resumes at ia64_work_processed_syscall with p6 set to 1 if the extra-work-check
 	 * needs to be redone.
 	 */
 #ifdef CONFIG_PREEMPT
-	rsm psr.i				// disable interrupts
+	RSM_PSR_I(p0, r2, r18)			// disable interrupts
 	cmp.eq pLvSys,p0=r0,r0			// pLvSys=1: leave from syscall
 (pKStk) adds r20=TI_PRE_COUNT+IA64_TASK_SIZE,r13
 	;;
@@ -706,11 +721,12 @@ ENTRY(ia64_leave_syscall)
 	;;
 	cmp.eq p6,p0=r21,r0		// p6 <- pUStk || (preempt_count == 0)
 #else /* !CONFIG_PREEMPT */
-(pUStk)	rsm psr.i
+	RSM_PSR_I(pUStk, r2, r18)
 	cmp.eq pLvSys,p0=r0,r0		// pLvSys=1: leave from syscall
 (pUStk)	cmp.eq.unc p6,p0=r0,r0		// p6 <- pUStk
 #endif
-.work_processed_syscall:
+.global __paravirt_work_processed_syscall;
+__paravirt_work_processed_syscall:
 #ifdef CONFIG_VIRT_CPU_ACCOUNTING
 	adds r2=PT(LOADRS)+16,r12
 (pUStk)	mov.m r22=ar.itc			// fetch time at leave
@@ -744,7 +760,7 @@ ENTRY(ia64_leave_syscall)
 (pNonSys) break 0		// bug check: we shouldn't be here if pNonSys is TRUE!
 	;;
 	invala			// M0|1 invalidate ALAT
-	rsm psr.i | psr.ic	// M2   turn off interrupts and interruption collection
+	RSM_PSR_I_IC(r28, r29, r30)	// M2 turn off interrupts and interruption collection
 	cmp.eq p9,p0=r0,r0	// A    set p9 to indicate that we should restore cr.ifs
 	ld8 r29=[r2],16		// M0|1 load cr.ipsr
@@ -765,7 +781,7 @@
 	;;
 #endif
 	ld8 r26=[r2],PT(B0)-PT(AR_PFS)	// M0|1 load ar.pfs
-(pKStk)	mov r22=psr			// M2   read PSR now that interrupts are disabled
+	MOV_FROM_PSR(pKStk, r22, r21)	// M2   read PSR now that interrupts are disabled
 	nop 0
 	;;
 	ld8 r21=[r2],PT(AR_RNAT)-PT(B0)	// M0|1 load b0
@@ -798,7 +814,7 @@
 	srlz.d			// M0   ensure interruption collection is off (for cover)
 	shr.u r18=r19,16	// I0|1 get byte size of existing "dirty" partition
-	cover			// B    add current frame into dirty partition & set cr.ifs
+	COVER			// B    add current frame into dirty partition & set cr.ifs
 	;;
 #ifdef CONFIG_VIRT_CPU_ACCOUNTING
 	mov r19=ar.bsp			// M2   get new backing store pointer
@@ -823,8 +839,9 @@
 	mov.m ar.ssd=r0		// M2   clear ar.ssd
 	mov f11=f0		// F    clear f11
 	br.cond.sptk.many rbs_switch	// B
-END(ia64_leave_syscall)
+END(__paravirt_leave_syscall)
 
+#ifdef __IA64_ASM_PARAVIRTUALIZED_NATIVE
 #ifdef CONFIG_IA32_SUPPORT
 GLOBAL_ENTRY(ia64_ret_from_ia32_execve)
 	PT_REGS_UNWIND_INFO(0)
@@ -835,10 +852,20 @@ GLOBAL_ENTRY(ia64_ret_from_ia32_execve)
 	st8.spill [r2]=r8	// store return value in slot for r8 and set unat bit
 	.mem.offset 8,0
 	st8.spill [r3]=r0	// clear error indication in slot for r10 and set unat bit
+#ifdef CONFIG_PARAVIRT
+	;;
+	// don't fall through, ia64_leave_kernel may be #define'd
+	br.cond.sptk.few ia64_leave_kernel
+	;;
+#endif /* CONFIG_PARAVIRT */
 END(ia64_ret_from_ia32_execve)
+#ifndef CONFIG_PARAVIRT
 	// fall through
+#endif
 #endif /* CONFIG_IA32_SUPPORT */
+#endif /* __IA64_ASM_PARAVIRTUALIZED_NATIVE */
 
-GLOBAL_ENTRY(ia64_leave_kernel)
+GLOBAL_ENTRY(__paravirt_leave_kernel)
 	PT_REGS_UNWIND_INFO(0)
 	/*
 	 * work.need_resched etc. mustn't get changed by this CPU before it returns to
@@ -852,7 +879,7 @@ GLOBAL_ENTRY(ia64_leave_kernel)
 	 * needs to be redone.
 	 */
 #ifdef CONFIG_PREEMPT
-	rsm psr.i		// disable interrupts
+	RSM_PSR_I(p0, r17, r31)	// disable interrupts
 	cmp.eq p0,pLvSys=r0,r0	// pLvSys=0: leave from kernel
 (pKStk)	adds r20=TI_PRE_COUNT+IA64_TASK_SIZE,r13
 	;;
@@ -862,7 +889,7 @@ GLOBAL_ENTRY(ia64_leave_kernel)
 	;;
 	cmp.eq p6,p0=r21,r0	// p6 <- pUStk || (preempt_count == 0)
 #else
-(pUStk)	rsm psr.i
+	RSM_PSR_I(pUStk, r17, r31)
 	cmp.eq p0,pLvSys=r0,r0	// pLvSys=0: leave from kernel
 (pUStk)	cmp.eq.unc p6,p0=r0,r0	// p6 <- pUStk
 #endif
@@ -910,7 +937,7 @@ GLOBAL_ENTRY(ia64_leave_kernel)
 	mov ar.csd=r30
 	mov ar.ssd=r31
 	;;
-	rsm psr.i | psr.ic	// initiate turning off of interrupt and interruption collection
+	RSM_PSR_I_IC(r23, r22, r25)	// initiate turning off of interrupt and interruption collection
 	invala			// invalidate ALAT
 	;;
 	ld8.fill r22=[r2],24
@@ -942,7 +969,7 @@ GLOBAL_ENTRY(ia64_leave_kernel)
 	mov ar.ccv=r15
 	;;
 	ldf.fill f11=[r2]
-	bsw.0			// switch back to bank 0 (no stop bit required beforehand...)
+	BSW_0(r2, r3, r15)	// switch back to bank 0 (no stop bit required beforehand...)
 	;;
 (pUStk)	mov r18=IA64_KR(CURRENT)// M2 (12 cycle read latency)
 	adds r16=PT(CR_IPSR)+16,r12
@@ -950,12 +977,12 @@
 #ifdef CONFIG_VIRT_CPU_ACCOUNTING
 	.pred.rel.mutex pUStk,pKStk
-(pKStk)	mov r22=psr		// M2 read PSR now that interrupts are disabled
+	MOV_FROM_PSR(pKStk, r22, r29)	// M2 read PSR now that interrupts are disabled
 (pUStk)	mov.m r22=ar.itc	// M  fetch time at leave
 	nop.i 0
 	;;
 #else
-(pKStk)	mov r22=psr		// M2 read PSR now that interrupts are disabled
+	MOV_FROM_PSR(pKStk, r22, r29)	// M2 read PSR now that interrupts are disabled
 	nop.i 0
 	nop.i 0
 	;;
@@ -1027,7 +1054,7 @@ GLOBAL_ENTRY(ia64_leave_kernel)
 	 * NOTE: alloc, loadrs, and cover can't be predicated.
 	 */
 (pNonSys) br.cond.dpnt dont_preserve_current_frame
-	cover			// add current frame into dirty partition and set cr.ifs
+	COVER			// add current frame into dirty partition and set cr.ifs
 	;;
 	mov r19=ar.bsp		// get new backing store pointer
 rbs_switch:
@@ -1130,16 +1157,16 @@ skip_rbs_switch:
 (pKStk)	dep r29=r22,r29,21,1	// I0 update ipsr.pp with psr.pp
 (pLvSys)mov r16=r0		// A  clear r16 for leave_syscall, no-op otherwise
 	;;
-	mov cr.ipsr=r29		// M2
+	MOV_TO_IPSR(p0, r29, r25)	// M2
 	mov ar.pfs=r26		// I0
 (pLvSys)mov r17=r0		// A  clear r17 for leave_syscall, no-op otherwise
-(p9)	mov cr.ifs=r30		// M2
+	MOV_TO_IFS(p9, r30, r25)	// M2
 	mov b0=r21		// I0
 (pLvSys)mov r18=r0		// A  clear r18 for leave_syscall, no-op otherwise
 	mov ar.fpsr=r20		// M2
-	mov cr.iip=r28		// M2
+	MOV_TO_IIP(r28, r25)	// M2
 	nop 0
 	;;
 (pUStk)	mov ar.rnat=r24		// M2 must happen with RSE in lazy mode
@@ -1148,7 +1175,7 @@ skip_rbs_switch:
 	mov ar.rsc=r27		// M2
 	mov pr=r31,-1		// I0
-	rfi			// B
+	RFI			// B
 
 	/*
 	 * On entry:
@@ -1174,35 +1201,36 @@ skip_rbs_switch:
 	;;
 (pKStk)	st4 [r20]=r21
 #endif
-	ssm psr.i		// enable interrupts
+	SSM_PSR_I(p0, p6, r2)	// enable interrupts
 	br.call.spnt.many rp=schedule
 .ret9:	cmp.eq p6,p0=r0,r0	// p6 <- 1 (re-check)
-	rsm psr.i		// disable interrupts
+	RSM_PSR_I(p0, r2, r20)	// disable interrupts
 	;;
 #ifdef CONFIG_PREEMPT
 (pKStk)	adds r20=TI_PRE_COUNT+IA64_TASK_SIZE,r13
 	;;
 (pKStk)	st4 [r20]=r0		// preempt_count() <- 0
 #endif
-(pLvSys)br.cond.sptk.few .work_pending_syscall_end
+(pLvSys)br.cond.sptk.few __paravirt_pending_syscall_end
 	br.cond.sptk.many .work_processed_kernel
 
 .notify:
 (pUStk)	br.call.spnt.many rp=notify_resume_user
 .ret10:	cmp.ne p6,p0=r0,r0	// p6 <- 0 (don't re-check)
-(pLvSys)br.cond.sptk.few .work_pending_syscall_end
+(pLvSys)br.cond.sptk.few __paravirt_pending_syscall_end
 	br.cond.sptk.many .work_processed_kernel
 
-.work_pending_syscall_end:
+.global __paravirt_pending_syscall_end;
+__paravirt_pending_syscall_end:
 	adds r2=PT(R8)+16,r12
 	adds r3=PT(R10)+16,r12
 	;;
 	ld8 r8=[r2]
 	ld8 r10=[r3]
-	br.cond.sptk.many .work_processed_syscall
-END(ia64_leave_kernel)
+	br.cond.sptk.many __paravirt_work_processed_syscall_target
+END(__paravirt_leave_kernel)
 
+#ifdef __IA64_ASM_PARAVIRTUALIZED_NATIVE
 ENTRY(handle_syscall_error)
 	/*
 	 * Some system calls (e.g., ptrace, mmap) can return arbitrary values which could
@@ -1244,7 +1272,7 @@ END(ia64_invoke_schedule_tail)
 	 * We declare 8 input registers so the system call args get preserved,
 	 * in case we need to restart a system call.
 	 */
-ENTRY(notify_resume_user)
+GLOBAL_ENTRY(notify_resume_user)
 	.prologue ASM_UNW_PRLG_RP|ASM_UNW_PRLG_PFS, ASM_UNW_PRLG_GRSAVE(8)
 	alloc loc1=ar.pfs,8,2,3,0 // preserve all eight input regs in case of syscall restart!
 	mov r9=ar.unat
@@ -1306,7 +1334,7 @@ ENTRY(sys_rt_sigreturn)
 	adds sp=16,sp
 	;;
 	ld8 r9=[sp]				// load new ar.unat
-	mov.sptk b7=r8,ia64_leave_kernel
+	mov.sptk b7=r8,ia64_native_leave_kernel
 	;;
 	mov ar.unat=r9
 	br.many b7
@@ -1665,3 +1693,4 @@ sys_call_table:
 	data8 sys_timerfd_gettime
 	.org sys_call_table + 8*NR_syscalls	// guard against failures to increase NR_syscalls
+#endif /* __IA64_ASM_PARAVIRTUALIZED_NATIVE */


arch/ia64/kernel/head.S  (+41 −0)

@@ -26,11 +26,14 @@
 #include <asm/mmu_context.h>
 #include <asm/asm-offsets.h>
 #include <asm/pal.h>
+#include <asm/paravirt.h>
 #include <asm/pgtable.h>
 #include <asm/processor.h>
 #include <asm/ptrace.h>
 #include <asm/system.h>
 #include <asm/mca_asm.h>
+#include <linux/init.h>
+#include <linux/linkage.h>
 
 #ifdef CONFIG_HOTPLUG_CPU
 #define SAL_PSR_BITS_TO_SET				\
@@ -367,6 +370,44 @@ start_ap:
 	;;
 (isBP)	st8 [r2]=r28		// save the address of the boot param area passed by the bootloader
 
+#ifdef CONFIG_PARAVIRT
+	movl r14=hypervisor_setup_hooks
+	movl r15=hypervisor_type
+	mov r16=num_hypervisor_hooks
+	;;
+	ld8 r2=[r15]
+	;;
+	cmp.ltu p7,p0=r2,r16	// array size check
+	shladd r8=r2,3,r14
+	;;
+(p7)	ld8 r9=[r8]
+	;;
+(p7)	mov b1=r9
+(p7)	cmp.ne.unc p7,p0=r9,r0	// no actual branch to NULL
+	;;
+(p7)	br.call.sptk.many rp=b1
+
+	__INITDATA
+
+default_setup_hook = 0		// Currently nothing needs to be done.
+
+	.weak xen_setup_hook
+
+	.global hypervisor_type
+hypervisor_type:
+	data8 PARAVIRT_HYPERVISOR_TYPE_DEFAULT
+
+	// must have the same order with PARAVIRT_HYPERVISOR_TYPE_xxx
+hypervisor_setup_hooks:
+	data8 default_setup_hook
+	data8 xen_setup_hook
+num_hypervisor_hooks = (. - hypervisor_setup_hooks) / 8
+	.previous
+#endif
+
 #ifdef CONFIG_SMP
 (isAP)	br.call.sptk.many rp=start_secondary
 .ret0:
Documentation/ia64/paravirt_ops.txt 0 → 100644 +137 −0 Original line number Diff line number Diff line Paravirt_ops on IA64 ==================== 21 May 2008, Isaku Yamahata <yamahata@valinux.co.jp> Introduction ------------ The aim of this documentation is to help with maintainability and/or to encourage people to use paravirt_ops/IA64. paravirt_ops (pv_ops in short) is a way for virtualization support of Linux kernel on x86. Several ways for virtualization support were proposed, paravirt_ops is the winner. On the other hand, now there are also several IA64 virtualization technologies like kvm/IA64, xen/IA64 and many other academic IA64 hypervisors so that it is good to add generic virtualization infrastructure on Linux/IA64. What is paravirt_ops? --------------------- It has been developed on x86 as virtualization support via API, not ABI. It allows each hypervisor to override operations which are important for hypervisors at API level. And it allows a single kernel binary to run on all supported execution environments including native machine. Essentially paravirt_ops is a set of function pointers which represent operations corresponding to low level sensitive instructions and high level functionalities in various area. But one significant difference from usual function pointer table is that it allows optimization with binary patch. It is because some of these operations are very performance sensitive and indirect call overhead is not negligible. With binary patch, indirect C function call can be transformed into direct C function call or in-place execution to eliminate the overhead. Thus, operations of paravirt_ops are classified into three categories. - simple indirect call These operations correspond to high level functionality so that the overhead of indirect call isn't very important. - indirect call which allows optimization with binary patch Usually these operations correspond to low level instructions. They are called frequently and performance critical. 
So the overhead is very important. - a set of macros for hand written assembly code Hand written assembly codes (.S files) also need paravirtualization because they include sensitive instructions or some of code paths in them are very performance critical. The relation to the IA64 machine vector --------------------------------------- Linux/IA64 has the IA64 machine vector functionality which allows the kernel to switch implementations (e.g. initialization, ipi, dma api...) depending on executing platform. We can replace some implementations very easily defining a new machine vector. Thus another approach for virtualization support would be enhancing the machine vector functionality. But paravirt_ops approach was taken because - virtualization support needs wider support than machine vector does. e.g. low level instruction paravirtualization. It must be initialized very early before platform detection. - virtualization support needs more functionality like binary patch. Probably the calling overhead might not be very large compared to the emulation overhead of virtualization. However in the native case, the overhead should be eliminated completely. A single kernel binary should run on each environment including native, and the overhead of paravirt_ops on native environment should be as small as possible. - for full virtualization technology, e.g. KVM/IA64 or Xen/IA64 HVM domain, the result would be (the emulated platform machine vector. probably dig) + (pv_ops). This means that the virtualization support layer should be under the machine vector layer. Possibly it might be better to move some function pointers from paravirt_ops to machine vector. In fact, Xen domU case utilizes both pv_ops and machine vector. IA64 paravirt_ops ----------------- In this section, the concrete paravirt_ops will be discussed. Because of the architecture difference between ia64 and x86, the resulting set of functions is very different from x86 pv_ops. 
- C function pointer tables They are not very performance critical so that simple C indirect function call is acceptable. The following structures are defined at this moment. For details see linux/include/asm-ia64/paravirt.h - struct pv_info This structure describes the execution environment. - struct pv_init_ops This structure describes the various initialization hooks. - struct pv_iosapic_ops This structure describes hooks to iosapic operations. - struct pv_irq_ops This structure describes hooks to irq related operations - struct pv_time_op This structure describes hooks to steal time accounting. - a set of indirect calls which need optimization Currently this class of functions correspond to a subset of IA64 intrinsics. At this moment the optimization with binary patch isn't implemented yet. struct pv_cpu_op is defined. For details see linux/include/asm-ia64/paravirt_privop.h Mostly they correspond to ia64 intrinsics 1-to-1. Caveat: Now they are defined as C indirect function pointers, but in order to support binary patch optimization, they will be changed using GCC extended inline assembly code. - a set of macros for hand written assembly code (.S files) For maintenance purpose, the taken approach for .S files is single source code and compile multiple times with different macros definitions. Each pv_ops instance must define those macros to compile. The important thing here is that sensitive, but non-privileged instructions must be paravirtualized and that some privileged instructions also need paravirtualization for reasonable performance. Developers who modify .S files must be aware of that. At this moment an easy checker is implemented to detect paravirtualization breakage. But it doesn't cover all the cases. Sometimes this set of macros is called pv_cpu_asm_op. But there is no corresponding structure in the source code. Those macros mostly 1:1 correspond to a subset of privileged instructions. See linux/include/asm-ia64/native/inst.h. 
And some functions written in assembly also need to be overrided so that each pv_ops instance have to define some macros. Again see linux/include/asm-ia64/native/inst.h. Those structures must be initialized very early before start_kernel. Probably initialized in head.S using multi entry point or some other trick. For native case implementation see linux/arch/ia64/kernel/paravirt.c.
arch/ia64/Makefile +6 −0 Original line number Diff line number Diff line Loading @@ -100,3 +100,9 @@ define archhelp echo ' boot - Build vmlinux and bootloader for Ski simulator' echo '* unwcheck - Check vmlinux for invalid unwind info' endef archprepare: make_nr_irqs_h FORCE PHONY += make_nr_irqs_h FORCE make_nr_irqs_h: FORCE $(Q)$(MAKE) $(build)=arch/ia64/kernel include/asm-ia64/nr-irqs.h
arch/ia64/kernel/Makefile +44 −0 Original line number Diff line number Diff line Loading @@ -36,6 +36,8 @@ obj-$(CONFIG_PCI_MSI) += msi_ia64.o mca_recovery-y += mca_drv.o mca_drv_asm.o obj-$(CONFIG_IA64_MC_ERR_INJECT)+= err_inject.o obj-$(CONFIG_PARAVIRT) += paravirt.o paravirtentry.o obj-$(CONFIG_IA64_ESI) += esi.o ifneq ($(CONFIG_IA64_ESI),) obj-y += esi_stub.o # must be in kernel proper Loading Loading @@ -70,3 +72,45 @@ $(obj)/gate-syms.o: $(obj)/gate.lds $(obj)/gate.o FORCE # We must build gate.so before we can assemble it. # Note: kbuild does not track this dependency due to usage of .incbin $(obj)/gate-data.o: $(obj)/gate.so # Calculate NR_IRQ = max(IA64_NATIVE_NR_IRQS, XEN_NR_IRQS, ...) based on config define sed-y "/^->/{s:^->\([^ ]*\) [\$$#]*\([^ ]*\) \(.*\):#define \1 \2 /* \3 */:; s:->::; p;}" endef quiet_cmd_nr_irqs = GEN $@ define cmd_nr_irqs (set -e; \ echo "#ifndef __ASM_NR_IRQS_H__"; \ echo "#define __ASM_NR_IRQS_H__"; \ echo "/*"; \ echo " * DO NOT MODIFY."; \ echo " *"; \ echo " * This file was generated by Kbuild"; \ echo " *"; \ echo " */"; \ echo ""; \ sed -ne $(sed-y) $<; \ echo ""; \ echo "#endif" ) > $@ endef # We use internal kbuild rules to avoid the "is up to date" message from make arch/$(SRCARCH)/kernel/nr-irqs.s: $(srctree)/arch/$(SRCARCH)/kernel/nr-irqs.c \ $(wildcard $(srctree)/include/asm-ia64/*/irq.h) $(Q)mkdir -p $(dir $@) $(call if_changed_dep,cc_s_c) include/asm-ia64/nr-irqs.h: arch/$(SRCARCH)/kernel/nr-irqs.s $(Q)mkdir -p $(dir $@) $(call cmd,nr_irqs) clean-files += $(objtree)/include/asm-ia64/nr-irqs.h # # native ivt.S and entry.S # ASM_PARAVIRT_OBJS = ivt.o entry.o define paravirtualized_native AFLAGS_$(1) += -D__IA64_ASM_PARAVIRTUALIZED_NATIVE endef $(foreach obj,$(ASM_PARAVIRT_OBJS),$(eval $(call paravirtualized_native,$(obj))))
arch/ia64/kernel/entry.S  (+72 −43)

@@ -22,6 +22,11 @@
  *	Patrick O'Rourke	<orourke@missioncriticallinux.com>
  *	11/07/2000
  */
+/*
+ * Copyright (c) 2008 Isaku Yamahata <yamahata at valinux co jp>
+ *                    VA Linux Systems Japan K.K.
+ *                    pv_ops.
+ */
 /*
  * Global (preserved) predicate usage on syscall entry/exit path:
  *
@@ -45,6 +50,7 @@
 #include "minstate.h"

+#ifdef __IA64_ASM_PARAVIRTUALIZED_NATIVE
 /*
  * execve() is special because in case of success, we need to
  * setup a null register window frame.
@@ -173,6 +179,7 @@ GLOBAL_ENTRY(sys_clone)
 	mov rp=loc0
 	br.ret.sptk.many rp
 END(sys_clone)
+#endif /* __IA64_ASM_PARAVIRTUALIZED_NATIVE */

 /*
  * prev_task <- ia64_switch_to(struct task_struct *next)
@@ -180,7 +187,7 @@ END(sys_clone)
  * called.  The code starting at .map relies on this.  The rest of the code
  * doesn't care about the interrupt masking status.
  */
-GLOBAL_ENTRY(ia64_switch_to)
+GLOBAL_ENTRY(__paravirt_switch_to)
 	.prologue
 	alloc r16=ar.pfs,1,0,0,0
 	DO_SAVE_SWITCH_STACK
@@ -204,7 +211,7 @@
 	;;
 .done:
 	ld8 sp=[r21]			// load kernel stack pointer of new task
-	mov IA64_KR(CURRENT)=in0	// update "current" application register
+	MOV_TO_KR(CURRENT, in0, r8, r9)	// update "current" application register
 	mov r8=r13			// return pointer to previously running task
 	mov r13=in0			// set "current" pointer
 	;;
@@ -216,26 +223,25 @@
 	br.ret.sptk.many rp		// boogie on out in new context

 .map:
-	rsm psr.ic			// interrupts (psr.i) are already disabled here
+	RSM_PSR_IC(r25)			// interrupts (psr.i) are already disabled here
 	movl r25=PAGE_KERNEL
 	;;
 	srlz.d
 	or r23=r25,r20			// construct PA | page properties
 	mov r25=IA64_GRANULE_SHIFT<<2
 	;;
-	mov cr.itir=r25
-	mov cr.ifa=in0			// VA of next task...
+	MOV_TO_ITIR(p0, r25, r8)
+	MOV_TO_IFA(in0, r8)		// VA of next task...
 	;;
 	mov r25=IA64_TR_CURRENT_STACK
-	mov IA64_KR(CURRENT_STACK)=r26	// remember last page we mapped...
+	MOV_TO_KR(CURRENT_STACK, r26, r8, r9)	// remember last page we mapped...
 	;;
 	itr.d dtr[r25]=r23		// wire in new mapping...
-	ssm psr.ic			// reenable the psr.ic bit
-	;;
-	srlz.d
+	SSM_PSR_IC_AND_SRLZ_D(r8, r9)	// reenable the psr.ic bit
 	br.cond.sptk .done
-END(ia64_switch_to)
+END(__paravirt_switch_to)

+#ifdef __IA64_ASM_PARAVIRTUALIZED_NATIVE
 /*
  * Note that interrupts are enabled during save_switch_stack and load_switch_stack.  This
  * means that we may get an interrupt with "sp" pointing to the new kernel stack while
@@ -375,7 +381,7 @@ END(save_switch_stack)
  *	- b7 holds address to return to
  *	- must not touch r8-r11
  */
-ENTRY(load_switch_stack)
+GLOBAL_ENTRY(load_switch_stack)
 	.prologue
 	.altrp b7
@@ -571,7 +577,7 @@ GLOBAL_ENTRY(ia64_trace_syscall)
 .ret3:
 (pUStk)	cmp.eq.unc p6,p0=r0,r0			// p6 <- pUStk
 (pUStk)	rsm psr.i				// disable interrupts
-	br.cond.sptk .work_pending_syscall_end
+	br.cond.sptk ia64_work_pending_syscall_end

 strace_error:
 	ld8 r3=[r2]				// load pt_regs.r8
@@ -636,8 +642,17 @@ GLOBAL_ENTRY(ia64_ret_from_syscall)
 	adds r2=PT(R8)+16,sp			// r2 = &pt_regs.r8
 	mov r10=r0				// clear error indication in r10
 (p7)	br.cond.spnt handle_syscall_error	// handle potential syscall failure
+#ifdef CONFIG_PARAVIRT
+	;;
+	br.cond.sptk.few ia64_leave_syscall
+	;;
+#endif /* CONFIG_PARAVIRT */
 END(ia64_ret_from_syscall)
+#ifndef CONFIG_PARAVIRT
 	// fall through
+#endif
+#endif /* __IA64_ASM_PARAVIRTUALIZED_NATIVE */

 /*
  * ia64_leave_syscall(): Same as ia64_leave_kernel, except that it doesn't
  * need to switch to bank 0 and doesn't restore the scratch registers.
@@ -682,7 +697,7 @@ END(ia64_ret_from_syscall)
  *	      ar.csd: cleared
  *	      ar.ssd: cleared
  */
-ENTRY(ia64_leave_syscall)
+GLOBAL_ENTRY(__paravirt_leave_syscall)
 	PT_REGS_UNWIND_INFO(0)
 	/*
 	 * work.need_resched etc. mustn't get changed by this CPU before it returns to
@@ -692,11 +707,11 @@
 	 * extra work.  We always check for extra work when returning to user-level.
 	 * With CONFIG_PREEMPT, we also check for extra work when the preempt_count
 	 * is 0.  After extra work processing has been completed, execution
-	 * resumes at .work_processed_syscall with p6 set to 1 if the extra-work-check
+	 * resumes at ia64_work_processed_syscall with p6 set to 1 if the extra-work-check
 	 * needs to be redone.
 	 */
 #ifdef CONFIG_PREEMPT
-	rsm psr.i				// disable interrupts
+	RSM_PSR_I(p0, r2, r18)			// disable interrupts
 	cmp.eq pLvSys,p0=r0,r0			// pLvSys=1: leave from syscall
 (pKStk)	adds r20=TI_PRE_COUNT+IA64_TASK_SIZE,r13
 	;;
@@ -706,11 +721,12 @@
 	;;
 	cmp.eq p6,p0=r21,r0		// p6 <- pUStk || (preempt_count == 0)
 #else /* !CONFIG_PREEMPT */
-(pUStk)	rsm psr.i
+	RSM_PSR_I(pUStk, r2, r18)
 	cmp.eq pLvSys,p0=r0,r0		// pLvSys=1: leave from syscall
 (pUStk)	cmp.eq.unc p6,p0=r0,r0		// p6 <- pUStk
 #endif
-.work_processed_syscall:
+.global __paravirt_work_processed_syscall;
+__paravirt_work_processed_syscall:
 #ifdef CONFIG_VIRT_CPU_ACCOUNTING
 	adds r2=PT(LOADRS)+16,r12
 (pUStk)	mov.m r22=ar.itc			// fetch time at leave
@@ -744,7 +760,7 @@
 (pNonSys) break 0		// bug check: we shouldn't be here if pNonSys is TRUE!
 	;;
 	invala			// M0|1 invalidate ALAT
-	rsm psr.i | psr.ic	// M2   turn off interrupts and interruption collection
+	RSM_PSR_I_IC(r28, r29, r30)	// M2 turn off interrupts and interruption collection
 	cmp.eq p9,p0=r0,r0	// A    set p9 to indicate that we should restore cr.ifs

 	ld8 r29=[r2],16		// M0|1 load cr.ipsr
@@ -765,7 +781,7 @@
 	;;
 #endif
 	ld8 r26=[r2],PT(B0)-PT(AR_PFS)	// M0|1 load ar.pfs
-(pKStk)	mov r22=psr			// M2   read PSR now that interrupts are disabled
+	MOV_FROM_PSR(pKStk, r22, r21)	// M2   read PSR now that interrupts are disabled
 	nop 0
 	;;
 	ld8 r21=[r2],PT(AR_RNAT)-PT(B0)	// M0|1 load b0
@@ -798,7 +814,7 @@
 	srlz.d			// M0   ensure interruption collection is off (for cover)
 	shr.u r18=r19,16	// I0|1 get byte size of existing "dirty" partition
-	cover			// B    add current frame into dirty partition & set cr.ifs
+	COVER			// B    add current frame into dirty partition & set cr.ifs
 	;;
 #ifdef CONFIG_VIRT_CPU_ACCOUNTING
 	mov r19=ar.bsp		// M2   get new backing store pointer
@@ -823,8 +839,9 @@
 	mov.m ar.ssd=r0		// M2   clear ar.ssd
 	mov f11=f0		// F    clear f11
 	br.cond.sptk.many rbs_switch	// B
-END(ia64_leave_syscall)
+END(__paravirt_leave_syscall)

+#ifdef __IA64_ASM_PARAVIRTUALIZED_NATIVE
 #ifdef CONFIG_IA32_SUPPORT
 GLOBAL_ENTRY(ia64_ret_from_ia32_execve)
 	PT_REGS_UNWIND_INFO(0)
@@ -835,10 +852,20 @@ GLOBAL_ENTRY(ia64_ret_from_ia32_execve)
 	st8.spill [r2]=r8	// store return value in slot for r8 and set unat bit
 	.mem.offset 8,0
 	st8.spill [r3]=r0	// clear error indication in slot for r10 and set unat bit
+#ifdef CONFIG_PARAVIRT
+	;;
+	// don't fall through, ia64_leave_kernel may be #define'd
+	br.cond.sptk.few ia64_leave_kernel
+	;;
+#endif /* CONFIG_PARAVIRT */
 END(ia64_ret_from_ia32_execve)
+#ifndef CONFIG_PARAVIRT
 	// fall through
+#endif
 #endif /* CONFIG_IA32_SUPPORT */
-GLOBAL_ENTRY(ia64_leave_kernel)
+#endif /* __IA64_ASM_PARAVIRTUALIZED_NATIVE */
+
+GLOBAL_ENTRY(__paravirt_leave_kernel)
 	PT_REGS_UNWIND_INFO(0)
 	/*
 	 * work.need_resched etc. mustn't get changed by this CPU before it returns to
@@ -852,7 +879,7 @@
 	 * needs to be redone.
	 */
 #ifdef CONFIG_PREEMPT
-	rsm psr.i				// disable interrupts
+	RSM_PSR_I(p0, r17, r31)			// disable interrupts
 	cmp.eq p0,pLvSys=r0,r0			// pLvSys=0: leave from kernel
 (pKStk)	adds r20=TI_PRE_COUNT+IA64_TASK_SIZE,r13
 	;;
@@ -862,7 +889,7 @@
 	;;
 	cmp.eq p6,p0=r21,r0		// p6 <- pUStk || (preempt_count == 0)
 #else
-(pUStk)	rsm psr.i
+	RSM_PSR_I(pUStk, r17, r31)
 	cmp.eq p0,pLvSys=r0,r0		// pLvSys=0: leave from kernel
 (pUStk)	cmp.eq.unc p6,p0=r0,r0		// p6 <- pUStk
 #endif
@@ -910,7 +937,7 @@
 	mov ar.csd=r30
 	mov ar.ssd=r31
 	;;
-	rsm psr.i | psr.ic	// initiate turning off of interrupt and interruption collection
+	RSM_PSR_I_IC(r23, r22, r25)	// initiate turning off of interrupt and interruption collection
 	invala			// invalidate ALAT
 	;;
 	ld8.fill r22=[r2],24
@@ -942,7 +969,7 @@
 	mov ar.ccv=r15
 	;;
 	ldf.fill f11=[r2]
-	bsw.0			// switch back to bank 0 (no stop bit required beforehand...)
+	BSW_0(r2, r3, r15)	// switch back to bank 0 (no stop bit required beforehand...)
 	;;
 (pUStk)	mov r18=IA64_KR(CURRENT)// M2 (12 cycle read latency)
 	adds r16=PT(CR_IPSR)+16,r12
@@ -950,12 +977,12 @@
 #ifdef CONFIG_VIRT_CPU_ACCOUNTING
 	.pred.rel.mutex pUStk,pKStk
-(pKStk)	mov r22=psr		// M2 read PSR now that interrupts are disabled
+	MOV_FROM_PSR(pKStk, r22, r29)	// M2 read PSR now that interrupts are disabled
 (pUStk)	mov.m r22=ar.itc	// M  fetch time at leave
 	nop.i 0
 	;;
 #else
-(pKStk)	mov r22=psr		// M2 read PSR now that interrupts are disabled
+	MOV_FROM_PSR(pKStk, r22, r29)	// M2 read PSR now that interrupts are disabled
 	nop.i 0
 	nop.i 0
 	;;
@@ -1027,7 +1054,7 @@
 	 * NOTE: alloc, loadrs, and cover can't be predicated.
	 */
 (pNonSys) br.cond.dpnt dont_preserve_current_frame
-	cover			// add current frame into dirty partition and set cr.ifs
+	COVER			// add current frame into dirty partition and set cr.ifs
 	;;
 	mov r19=ar.bsp		// get new backing store pointer
 rbs_switch:
@@ -1130,16 +1157,16 @@ skip_rbs_switch:
 (pKStk)	dep r29=r22,r29,21,1	// I0 update ipsr.pp with psr.pp
 (pLvSys)mov r16=r0		// A  clear r16 for leave_syscall, no-op otherwise
 	;;
-	mov cr.ipsr=r29		// M2
+	MOV_TO_IPSR(p0, r29, r25)	// M2
 	mov ar.pfs=r26		// I0
 (pLvSys)mov r17=r0		// A  clear r17 for leave_syscall, no-op otherwise

-(p9)	mov cr.ifs=r30		// M2
+	MOV_TO_IFS(p9, r30, r25)// M2
 	mov b0=r21		// I0
 (pLvSys)mov r18=r0		// A  clear r18 for leave_syscall, no-op otherwise

 	mov ar.fpsr=r20		// M2
-	mov cr.iip=r28		// M2
+	MOV_TO_IIP(r28, r25)	// M2
 	nop 0
 	;;
 (pUStk)	mov ar.rnat=r24		// M2 must happen with RSE in lazy mode
@@ -1148,7 +1175,7 @@ skip_rbs_switch:
 	mov ar.rsc=r27		// M2
 	mov pr=r31,-1		// I0
-	rfi			// B
+	RFI			// B

 	/*
	 * On entry:
@@ -1174,35 +1201,36 @@ skip_rbs_switch:
 	;;
 (pKStk) st4 [r20]=r21
 #endif
-	ssm psr.i		// enable interrupts
+	SSM_PSR_I(p0, p6, r2)	// enable interrupts
 	br.call.spnt.many rp=schedule
 .ret9:	cmp.eq p6,p0=r0,r0	// p6 <- 1 (re-check)
-	rsm psr.i		// disable interrupts
+	RSM_PSR_I(p0, r2, r20)	// disable interrupts
 	;;
 #ifdef CONFIG_PREEMPT
 (pKStk)	adds r20=TI_PRE_COUNT+IA64_TASK_SIZE,r13
 	;;
 (pKStk)	st4 [r20]=r0		// preempt_count() <- 0
 #endif
-(pLvSys)br.cond.sptk.few .work_pending_syscall_end
+(pLvSys)br.cond.sptk.few __paravirt_pending_syscall_end
 	br.cond.sptk.many .work_processed_kernel

 .notify:
 (pUStk)	br.call.spnt.many rp=notify_resume_user
 .ret10:	cmp.ne p6,p0=r0,r0	// p6 <- 0 (don't re-check)
-(pLvSys)br.cond.sptk.few .work_pending_syscall_end
+(pLvSys)br.cond.sptk.few __paravirt_pending_syscall_end
 	br.cond.sptk.many .work_processed_kernel

-.work_pending_syscall_end:
+.global __paravirt_pending_syscall_end;
+__paravirt_pending_syscall_end:
 	adds r2=PT(R8)+16,r12
 	adds r3=PT(R10)+16,r12
 	;;
 	ld8 r8=[r2]
 	ld8 r10=[r3]
-	br.cond.sptk.many .work_processed_syscall
-END(ia64_leave_kernel)
+	br.cond.sptk.many __paravirt_work_processed_syscall_target
+END(__paravirt_leave_kernel)

+#ifdef __IA64_ASM_PARAVIRTUALIZED_NATIVE
 ENTRY(handle_syscall_error)
 	/*
	 * Some system calls (e.g., ptrace, mmap) can return arbitrary values which could
@@ -1244,7 +1272,7 @@ END(ia64_invoke_schedule_tail)
  * We declare 8 input registers so the system call args get preserved,
  * in case we need to restart a system call.
  */
-ENTRY(notify_resume_user)
+GLOBAL_ENTRY(notify_resume_user)
 	.prologue ASM_UNW_PRLG_RP|ASM_UNW_PRLG_PFS, ASM_UNW_PRLG_GRSAVE(8)
 	alloc loc1=ar.pfs,8,2,3,0 // preserve all eight input regs in case of syscall restart!
 	mov r9=ar.unat
@@ -1306,7 +1334,7 @@ ENTRY(sys_rt_sigreturn)
 	adds sp=16,sp
 	;;
 	ld8 r9=[sp]				// load new ar.unat
-	mov.sptk b7=r8,ia64_leave_kernel
+	mov.sptk b7=r8,ia64_native_leave_kernel
 	;;
 	mov ar.unat=r9
 	br.many b7
@@ -1665,3 +1693,4 @@ sys_call_table:
 	data8 sys_timerfd_gettime

 	.org sys_call_table + 8*NR_syscalls	// guard against failures to increase NR_syscalls
+#endif /* __IA64_ASM_PARAVIRTUALIZED_NATIVE */
arch/ia64/kernel/head.S  (+41 −0)

@@ -26,11 +26,14 @@
 #include <asm/mmu_context.h>
 #include <asm/asm-offsets.h>
 #include <asm/pal.h>
+#include <asm/paravirt.h>
 #include <asm/pgtable.h>
 #include <asm/processor.h>
 #include <asm/ptrace.h>
 #include <asm/system.h>
 #include <asm/mca_asm.h>
+#include <linux/init.h>
+#include <linux/linkage.h>

 #ifdef CONFIG_HOTPLUG_CPU
 #define SAL_PSR_BITS_TO_SET \
@@ -367,6 +370,44 @@ start_ap:
 	;;
 (isBP)	st8 [r2]=r28		// save the address of the boot param area passed by the bootloader

+#ifdef CONFIG_PARAVIRT
+
+	movl r14=hypervisor_setup_hooks
+	movl r15=hypervisor_type
+	mov r16=num_hypervisor_hooks
+	;;
+	ld8 r2=[r15]
+	;;
+	cmp.ltu p7,p0=r2,r16	// array size check
+	shladd r8=r2,3,r14
+	;;
+(p7)	ld8 r9=[r8]
+	;;
+(p7)	mov b1=r9
+(p7)	cmp.ne.unc p7,p0=r9,r0	// no actual branch to NULL
+	;;
+(p7)	br.call.sptk.many rp=b1
+
+	__INITDATA
+
+default_setup_hook = 0		// Currently nothing needs to be done.
+
+	.weak xen_setup_hook
+
+	.global hypervisor_type
+hypervisor_type:
+	data8 PARAVIRT_HYPERVISOR_TYPE_DEFAULT
+
+	// must have the same order with PARAVIRT_HYPERVISOR_TYPE_xxx
+hypervisor_setup_hooks:
+	data8 default_setup_hook
+	data8 xen_setup_hook
+num_hypervisor_hooks = (. - hypervisor_setup_hooks) / 8
+
+	.previous
+#endif
+
 #ifdef CONFIG_SMP
 (isAP)	br.call.sptk.many rp=start_secondary
 .ret0: