Documentation/cpu-hotplug.txt +12 −2

@@ -44,10 +44,20 @@ maxcpus=n    Restrict boot time cpus to n. Say if you have 4 cpus, using
                 maxcpus=2 will only boot 2. You can choose to bring the
                 other cpus later online, read FAQ's for more info.
 
-additional_cpus=n       [x86_64 only] use this to limit hotpluggable cpus.
-                        This option sets
+additional_cpus*=n      Use this to limit hotpluggable cpus. This option sets
                         cpu_possible_map = cpu_present_map + additional_cpus
 
+(*) Option valid only for following architectures
+- x86_64, ia64
+
+ia64 and x86_64 use the number of disabled local apics in ACPI tables MADT
+to determine the number of potentially hot-pluggable cpus. The implementation
+should only rely on this to count the # of cpus, but *MUST* not rely on the
+apicid values in those tables for disabled apics. In the event BIOS doesn't
+mark such hot-pluggable cpus as disabled entries, one could use this
+parameter "additional_cpus=x" to represent those cpus in the cpu_possible_map.
+
+
 CPU maps and such
 -----------------
 [More on cpumaps and primitive to manipulate, please check

Documentation/fujitsu/frv/kernel-ABI.txt 0 → 100644 +234 −0

                =================================
                INTERNAL KERNEL ABI FOR FR-V ARCH
                =================================

The internal FRV kernel ABI is not quite the same as the userspace ABI. A
number of the registers are used for special purposes, and the ABI is not
consistent between modules vs core, and MMU vs no-MMU.
This partly stems from the fact that FRV CPUs do not have a separate
supervisor stack pointer, and most of them do not have any scratch registers,
thus requiring at least one general purpose register to be clobbered in such
an event. Also, within the kernel core, it is possible to simply jump or call
directly between functions using a relative offset. This cannot be extended to
modules for the displacement is likely to be too far. Thus in modules the
address of a function to call must be calculated in a register and then used,
requiring two extra instructions.

This document has the following sections:

 (*) System call register ABI
 (*) CPU operating modes
 (*) Internal kernel-mode register ABI
 (*) Internal debug-mode register ABI
 (*) Virtual interrupt handling


========================
SYSTEM CALL REGISTER ABI
========================

When a system call is made, the following registers are effective:

        REGISTERS       CALL                    RETURN
        =============== ======================= =======================
        GR7             System call number      Preserved
        GR8             Syscall arg #1          Return value
        GR9-GR13        Syscall arg #2-6        Preserved


===================
CPU OPERATING MODES
===================

The FR-V CPU has three basic operating modes. In order of increasing
capability:

  (1) User mode.

      Basic userspace running mode.

  (2) Kernel mode.

      Normal kernel mode. There are many additional control registers
      available that may be accessed in this mode, in addition to all the
      stuff available to user mode. This has two submodes:

      (a) Exceptions enabled (PSR.T == 1).

          Exceptions will invoke the appropriate normal kernel mode handler.
          On entry to the handler, the PSR.T bit will be cleared.

      (b) Exceptions disabled (PSR.T == 0).

          No exceptions or interrupts may happen. Any mandatory exceptions
          will cause the CPU to halt unless the CPU is told to jump into
          debug mode instead.

  (3) Debug mode.

      No exceptions may happen in this mode.
      Memory protection and management exceptions will be flagged for later
      consideration, but the exception handler won't be invoked. Debugging
      traps such as hardware breakpoints and watchpoints will be ignored.
      This mode is entered only by debugging events obtained from the other
      two modes.

      All kernel mode registers may be accessed, plus a few extra debugging
      specific registers.


=================================
INTERNAL KERNEL-MODE REGISTER ABI
=================================

There are a number of permanent register assignments that are set up by
entry.S in the exception prologue. Note that there is a complete set of
exception prologues for each of user->kernel transition and kernel->kernel
transition. There are also user->debug and kernel->debug mode transition
prologues.

        REGISTER        FLAVOUR USE
        =============== ======= ====================================================
        GR1                     Supervisor stack pointer
        GR15                    Current thread info pointer
        GR16                    GP-Rel base register for small data
        GR28                    Current exception frame pointer (__frame)
        GR29                    Current task pointer (current)
        GR30                    Destroyed by kernel mode entry
        GR31            NOMMU   Destroyed by debug mode entry
        GR31            MMU     Destroyed by TLB miss kernel mode entry
        CCR.ICC2                Virtual interrupt disablement tracking
        CCCR.CC3                Cleared by exception prologue (atomic op emulation)
        SCR0            MMU     See mmu-layout.txt.
        SCR1            MMU     See mmu-layout.txt.
        SCR2            MMU     Save for EAR0 (destroyed by icache insns in debug mode)
        SCR3            MMU     Save for GR31 during debug exceptions
        DAMR/IAMR       NOMMU   Fixed memory protection layout.
        DAMR/IAMR       MMU     See mmu-layout.txt.

Certain registers are also used or modified across function calls:

        REGISTER        CALL                            RETURN
        =============== =============================== ===============================
        GR0             Fixed Zero                      -
        GR2             Function call frame pointer
        GR3             Special                         Preserved
        GR3-GR7         -                               Clobbered
        GR8             Function call arg #1            Return value (or clobbered)
        GR9             Function call arg #2            Return value MSW (or clobbered)
        GR10-GR13       Function call arg #3-#6         Clobbered
        GR14            -                               Clobbered
        GR15-GR16       Special                         Preserved
        GR17-GR27       -                               Preserved
        GR28-GR31       Special                         Only accessed explicitly
        LR              Return address after CALL       Clobbered
        CCR/CCCR        -                               Mostly Clobbered


================================
INTERNAL DEBUG-MODE REGISTER ABI
================================

This is the same as the kernel-mode register ABI for function calls. The
difference is that in debug-mode there's a different stack and a different
exception frame. Almost all the global registers from kernel-mode (including
the stack pointer) may be changed.

        REGISTER        FLAVOUR USE
        =============== ======= ====================================================
        GR1                     Debug stack pointer
        GR16                    GP-Rel base register for small data
        GR31                    Current debug exception frame pointer (__debug_frame)
        SCR3            MMU     Saved value of GR31

Note that debug mode is able to interfere with the kernel's emulated atomic
ops, so it must be exceedingly careful not to do any that would interact with
the main kernel in this regard. Hence the debug mode code (gdbstub) is almost
completely self-contained. The only external code used is the sprintf family
of functions.

Furthermore, break.S is so complicated because single-step mode does not
switch off on entry to an exception. That means unless manually disabled,
single-stepping will blithely go on stepping into things like interrupts. See
gdbstub.txt for more information.


==========================
VIRTUAL INTERRUPT HANDLING
==========================

Because accesses to the PSR are so slow, and to disable interrupts we have to
access it twice (once to read and once to write), we don't actually disable
interrupts at all if we don't have to. What we do instead is use the ICC2
condition code flags to note virtual disablement, such that if we then do
take an interrupt, we note the flag, really disable interrupts, set another
flag and resume execution at the point the interrupt happened. Setting
condition flags as a side effect of an arithmetic or logical instruction is
really fast. This use of the ICC2 only occurs within the kernel - it does not
affect userspace.

The flags we use are:

 (*) CCR.ICC2.Z [Zero flag]

     Set to virtually disable interrupts, clear when interrupts are virtually
     enabled. Can be modified by logical instructions without affecting the
     Carry flag.

 (*) CCR.ICC2.C [Carry flag]

     Clear to indicate hardware interrupts are really disabled, set otherwise.

What happens is this:

 (1) Normal kernel-mode operation.

        ICC2.Z is 0, ICC2.C is 1.

 (2) An interrupt occurs. The exception prologue examines ICC2.Z and
     determines that nothing needs doing. This is done simply with an
     unlikely BEQ instruction.

 (3) The interrupts are disabled (local_irq_disable):

        ICC2.Z is set to 1.

 (4) If interrupts were then re-enabled (local_irq_enable):

        ICC2.Z would be set to 0.

     A TIHI #2 instruction (trap #2 if condition HI - Z==0 && C==0) would be
     used to trap if interrupts were now virtually enabled, but physically
     disabled - which they're not, so the trap isn't taken. The kernel would
     then be back to state (1).

 (5) An interrupt occurs. The exception prologue examines ICC2.Z and
     determines that the interrupt shouldn't actually have happened. It jumps
     aside, and there disables interrupts by setting PSR.PIL to 14 and then
     it clears ICC2.C.

 (6) If interrupts were then saved and disabled again (local_irq_save):

        ICC2.Z would be shifted into the save variable and masked off
        (giving a 1).

        ICC2.Z would then be set to 1 (thus unchanged), and ICC2.C would be
        unaffected (ie: 0).

 (7) If interrupts were then restored from state (6) (local_irq_restore):

        ICC2.Z would be set to indicate the result of XOR'ing the saved
        value (ie: 1) with 1, which gives a result of 0 - thus leaving
        ICC2.Z set.

        ICC2.C would remain unaffected (ie: 0).

     A TIHI #2 instruction would be used to again assay the current state,
     but this would do nothing as Z==1.

 (8) If interrupts were then enabled (local_irq_enable):

        ICC2.Z would be cleared. ICC2.C would be left unaffected. Both
        flags would now be 0.

     A TIHI #2 instruction again issued to assay the current state would then
     trap as both Z==0 [interrupts virtually enabled] and C==0 [interrupts
     really disabled] would then be true.

 (9) The trap #2 handler would simply enable hardware interrupts (set PSR.PIL
     to 0), set ICC2.C to 1 and return.

(10) Immediately upon returning, the pending interrupt would be taken.

(11) The interrupt handler would take the path of actually processing the
     interrupt (ICC2.Z is clear, BEQ fails as per step (2)).

(12) The interrupt handler would then set ICC2.C to 1 since hardware
     interrupts are definitely enabled - or else the kernel wouldn't be here.

(13) On return from the interrupt handler, things would be back to state (1).

This trap (#2) is only available in kernel mode. In user mode it will result
in SIGILL.

Documentation/hwmon/w83627hf +4 −0

@@ -36,6 +36,10 @@ Module Parameters
   (default is 1)
   Use 'init=0' to bypass initializing the chip.
   Try this if your computer crashes when you load the module.
+* reset: int
+  (default is 0)
+  The driver used to reset the chip on load, but does no more. Use
+  'reset=1' to restore the old behavior. Report if you need to do this.
 
 Description
 -----------

Documentation/kernel-parameters.txt +5 −0

@@ -1133,6 +1133,8 @@ running once the system is up.
                                 Mechanism 1.
                 conf2           [IA-32] Force use of PCI Configuration
                                 Mechanism 2.
+                nommconf        [IA-32,X86_64] Disable use of MMCONFIG for PCI
+                                Configuration
                 nosort          [IA-32] Don't sort PCI devices according to
                                 order given by the PCI BIOS. This sorting is
                                 done to get a device order compatible with

@@ -1636,6 +1638,9 @@ running once the system is up.
                         Format:
                         <irq>,<irq_mask>,<io>,<full_duplex>,<do_sound>,<lockup_hack>[,<irq2>[,<irq3>[,<irq4>]]]
 
+        norandmaps      Don't use address space randomization
+                        Equivalent to echo 0 > /proc/sys/kernel/randomize_va_space
+
 ______________________________________________________________________
 
 Changelog:

Documentation/kprobes.txt +42 −39

@@ -136,17 +136,20 @@ Kprobes, jprobes, and return probes are implemented on the following
 architectures:
 
 - i386
-- x86_64 (AMD-64, E64MT)
+- x86_64 (AMD-64, EM64T)
 - ppc64
-- ia64 (Support for probes on certain instruction types is still in progress.)
+- ia64 (Does not support probes on instruction slot1.)
 - sparc64 (Return probes not yet implemented.)
 
 3. Configuring Kprobes
 
 When configuring the kernel using make menuconfig/xconfig/oldconfig,
-ensure that CONFIG_KPROBES is set to "y". Under "Kernel hacking",
-look for "Kprobes". You may have to enable "Kernel debugging"
-(CONFIG_DEBUG_KERNEL) before you can enable Kprobes.
+ensure that CONFIG_KPROBES is set to "y". Under "Instrumentation
+Support", look for "Kprobes".
+
+So that you can load and unload Kprobes-based instrumentation modules,
+make sure "Loadable module support" (CONFIG_MODULES) and "Module
+unloading" (CONFIG_MODULE_UNLOAD) are set to "y".
 
 You may also want to ensure that CONFIG_KALLSYMS and perhaps even
 CONFIG_KALLSYMS_ALL are set to "y", since kallsyms_lookup_name()

@@ -262,18 +265,18 @@ at any time after the probe has been registered.
 
 5. Kprobes Features and Limitations
 
-As of Linux v2.6.12, Kprobes allows multiple probes at the same
-address. Currently, however, there cannot be multiple jprobes on
-the same function at the same time.
+Kprobes allows multiple probes at the same address. Currently,
+however, there cannot be multiple jprobes on the same function at
+the same time.
 
 In general, you can install a probe anywhere in the kernel.
 In particular, you can probe interrupt handlers. Known exceptions
 are discussed in this section.
 
-For obvious reasons, it's a bad idea to install a probe in
-the code that implements Kprobes (mostly kernel/kprobes.c and
-arch/*/kernel/kprobes.c). A patch in the v2.6.13 timeframe instructs
-Kprobes to reject such requests.
+The register_*probe functions will return -EINVAL if you attempt
+to install a probe in the code that implements Kprobes (mostly
+kernel/kprobes.c and arch/*/kernel/kprobes.c, but also functions such
+as do_page_fault and notifier_call_chain).
 
 If you install a probe in an inline-able function, Kprobes makes
 no attempt to chase down all inline instances of the function and

@@ -290,18 +293,14 @@ from the accidental ones. Don't drink and probe.
 
 Kprobes makes no attempt to prevent probe handlers from stepping on
 each other -- e.g., probing printk() and then calling printk() from a
-probe handler. As of Linux v2.6.12, if a probe handler hits a probe,
-that second probe's handlers won't be run in that instance.
-
-In Linux v2.6.12 and previous versions, Kprobes' data structures are
-protected by a single lock that is held during probe registration and
-unregistration and while handlers are run. Thus, no two handlers
-can run simultaneously. To improve scalability on SMP systems,
-this restriction will probably be removed soon, in which case
-multiple handlers (or multiple instances of the same handler) may
-run concurrently on different CPUs. Code your handlers accordingly.
-
-Kprobes does not use semaphores or allocate memory except during
+probe handler. If a probe handler hits a probe, that second probe's
+handlers won't be run in that instance, and the kprobe.nmissed member
+of the second probe will be incremented.
+
+As of Linux v2.6.15-rc1, multiple handlers (or multiple instances of
+the same handler) may run concurrently on different CPUs.
+
+Kprobes does not use mutexes or allocate memory except during
 registration and unregistration.
 
 Probe handlers are run with preemption disabled. Depending on the

@@ -316,11 +315,18 @@ address instead of the real return address for kretprobed functions.
 (As far as we can tell, __builtin_return_address() is used only
 for instrumentation and error reporting.)
 
-If the number of times a function is called does not match the
-number of times it returns, registering a return probe on that
-function may produce undesirable results. We have the do_exit()
-and do_execve() cases covered. do_fork() is not an issue. We're
-unaware of other specific cases where this could be a problem.
+If the number of times a function is called does not match the number
+of times it returns, registering a return probe on that function may
+produce undesirable results. We have the do_exit() case covered.
+do_execve() and do_fork() are not an issue. We're unaware of other
+specific cases where this could be a problem.
+
+If, upon entry to or exit from a function, the CPU is running on
+a stack other than that of the current task, registering a return
+probe on that function may produce undesirable results. For this
+reason, Kprobes doesn't support return probes (or kprobes or jprobes)
+on the x86_64 version of __switch_to(); the registration functions
+return -EINVAL.
 
 6. Probe Overhead
 

@@ -347,14 +353,12 @@ k = 0.77 usec; j = 1.31; r = 1.26; kr = 1.45; jr = 1.99
 
 7. TODO
 
-a. SystemTap (http://sourceware.org/systemtap): Work in progress
-to provide a simplified programming interface for probe-based
-instrumentation.
-b. Improved SMP scalability: Currently, work is in progress to handle
-multiple kprobes in parallel.
-c. Kernel return probes for sparc64.
-d. Support for other architectures.
-e. User-space probes.
+a. SystemTap (http://sourceware.org/systemtap): Provides a simplified
+programming interface for probe-based instrumentation. Try it out.
+b. Kernel return probes for sparc64.
+c. Support for other architectures.
+d. User-space probes.
+e. Watchpoint probes (which fire on data references).
 
 8. Kprobes Example
 

@@ -411,8 +415,7 @@ int init_module(void)
 		printk("Couldn't find %s to plant kprobe\n", "do_fork");
 		return -1;
 	}
-	ret = register_kprobe(&kp);
-	if (ret < 0) {
+	if ((ret = register_kprobe(&kp) < 0)) {
 		printk("register_kprobe failed, returned %d\n", ret);
 		return -1;
 	}
SystemTap (http://sourceware.org/systemtap): Work in progress a. SystemTap (http://sourceware.org/systemtap): Provides a simplified to provide a simplified programming interface for probe-based programming interface for probe-based instrumentation. Try it out. instrumentation. b. Kernel return probes for sparc64. b. Improved SMP scalability: Currently, work is in progress to handle c. Support for other architectures. multiple kprobes in parallel. d. User-space probes. c. Kernel return probes for sparc64. e. Watchpoint probes (which fire on data references). d. Support for other architectures. e. User-space probes. 8. Kprobes Example 8. Kprobes Example Loading Loading @@ -411,8 +415,7 @@ int init_module(void) printk("Couldn't find %s to plant kprobe\n", "do_fork"); printk("Couldn't find %s to plant kprobe\n", "do_fork"); return -1; return -1; } } ret = register_kprobe(&kp); if ((ret = register_kprobe(&kp) < 0)) { if (ret < 0) { printk("register_kprobe failed, returned %d\n", ret); printk("register_kprobe failed, returned %d\n", ret); return -1; return -1; } } Loading
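The combined assign-and-test form of the registration check in the example above is sensitive to where the inner parenthesis closes, because `<` binds tighter than `=` in C. The following is a minimal user-space sketch, not kernel code. fake_register() is a hypothetical stand-in for register_kprobe() that always fails with -EINVAL, and the helper names are made up for illustration.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for register_kprobe(): always fails with -EINVAL. */
static int fake_register(void)
{
	return -EINVAL;
}

/*
 * Parenthesis closed after the comparison: ret receives the result of
 * (fake_register() < 0), which is 1, so the error code is lost.
 */
int ret_with_misplaced_paren(void)
{
	int ret;

	if ((ret = fake_register() < 0))
		return ret;
	return 0;
}

/* Parenthesis closed after the call: ret keeps the error code. */
int ret_with_correct_paren(void)
{
	int ret;

	if ((ret = fake_register()) < 0)
		return ret;
	return 0;
}

/* Illustrative self-check of the two outcomes described above. */
void check_examples(void)
{
	assert(ret_with_misplaced_paren() == 1);
	assert(ret_with_correct_paren() == -EINVAL);
}
```

With the parenthesis closed after the call, ret carries the error code that the failure message prints. With it closed after the comparison, ret can only ever be 0 or 1.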