diff options
Diffstat (limited to 'Documentation/kprobes.txt')
| -rw-r--r-- | Documentation/kprobes.txt | 711 |
1 files changed, 426 insertions, 285 deletions
diff --git a/Documentation/kprobes.txt b/Documentation/kprobes.txt index 0541fe1de70..4bbeca8483e 100644 --- a/Documentation/kprobes.txt +++ b/Documentation/kprobes.txt @@ -1,6 +1,7 @@ Title : Kernel Probes (Kprobes) Authors : Jim Keniston <jkenisto@us.ibm.com> - : Prasanna S Panchamukhi <prasanna@in.ibm.com> + : Prasanna S Panchamukhi <prasanna.panchamukhi@gmail.com> + : Masami Hiramatsu <mhiramat@redhat.com> CONTENTS @@ -14,13 +15,16 @@ CONTENTS 8. Kprobes Example 9. Jprobes Example 10. Kretprobes Example +Appendix A: The kprobes debugfs interface +Appendix B: The kprobes sysctl interface 1. Concepts: Kprobes, Jprobes, Return Probes Kprobes enables you to dynamically break into any kernel routine and collect debugging and performance information non-disruptively. You -can trap at almost any kernel code address, specifying a handler +can trap at almost any kernel code address(*), specifying a handler routine to be invoked when the breakpoint is hit. +(*: some parts of the kernel code can not be trapped, see 1.5 Blacklist) There are currently three types of probes: kprobes, jprobes, and kretprobes (also called return probes). A kprobe can be inserted @@ -36,13 +40,18 @@ registration function such as register_kprobe() specifies where the probe is to be inserted and what handler is to be called when the probe is hit. -The next three subsections explain how the different types of -probes work. They explain certain things that you'll need to -know in order to make the best use of Kprobes -- e.g., the -difference between a pre_handler and a post_handler, and how -to use the maxactive and nmissed fields of a kretprobe. But -if you're in a hurry to start using Kprobes, you can skip ahead -to section 2. +There are also register_/unregister_*probes() functions for batch +registration/unregistration of a group of *probes. These functions +can speed up unregistration process when you have to unregister +a lot of probes at once. + +The next four subsections explain how the different types of +probes work and how jump optimization works. They explain certain +things that you'll need to know in order to make the best use of +Kprobes -- e.g., the difference between a pre_handler and +a post_handler, and how to use the maxactive and nmissed fields of +a kretprobe. But if you're in a hurry to start using Kprobes, you +can skip ahead to section 2. 1.1 How Does a Kprobe Work? @@ -91,11 +100,12 @@ handler has run. Up to MAX_STACK_SIZE bytes are copied -- e.g., 64 bytes on i386. Note that the probed function's args may be passed on the stack -or in registers (e.g., for x86_64 or for an i386 fastcall function). -The jprobe will work in either case, so long as the handler's -prototype matches that of the probed function. +or in registers. The jprobe will work in either case, so long as the +handler's prototype matches that of the probed function. + +1.3 Return Probes -1.3 How Does a Return Probe Work? +1.3.1 How Does a Return Probe Work? When you call register_kretprobe(), Kprobes establishes a kprobe at the entry to the function. When the probed function is called and this @@ -106,9 +116,9 @@ At boot time, Kprobes registers a kprobe at the trampoline. When the probed function executes its return instruction, control passes to the trampoline and that probe is hit. Kprobes' trampoline -handler calls the user-specified handler associated with the kretprobe, -then sets the saved instruction pointer to the saved return address, -and that's where execution resumes upon return from the trap. +handler calls the user-specified return handler associated with the +kretprobe, then sets the saved instruction pointer to the saved return +address, and that's where execution resumes upon return from the trap. While the probed function is executing, its return address is stored in an object of type kretprobe_instance. Before calling @@ -130,27 +140,180 @@ zero when the return probe is registered, and is incremented every time the probed function is entered but there is no kretprobe_instance object available for establishing the return probe. +1.3.2 Kretprobe entry-handler + +Kretprobes also provides an optional user-specified handler which runs +on function entry. This handler is specified by setting the entry_handler +field of the kretprobe struct. Whenever the kprobe placed by kretprobe at the +function entry is hit, the user-defined entry_handler, if any, is invoked. +If the entry_handler returns 0 (success) then a corresponding return handler +is guaranteed to be called upon function return. If the entry_handler +returns a non-zero error then Kprobes leaves the return address as is, and +the kretprobe has no further effect for that particular function instance. + +Multiple entry and return handler invocations are matched using the unique +kretprobe_instance object associated with them. Additionally, a user +may also specify per return-instance private data to be part of each +kretprobe_instance object. This is especially useful when sharing private +data between corresponding user entry and return handlers. The size of each +private data object can be specified at kretprobe registration time by +setting the data_size field of the kretprobe struct. This data can be +accessed through the data field of each kretprobe_instance object. + +In case probed function is entered but there is no kretprobe_instance +object available, then in addition to incrementing the nmissed count, +the user entry_handler invocation is also skipped. + +1.4 How Does Jump Optimization Work? + +If your kernel is built with CONFIG_OPTPROBES=y (currently this flag +is automatically set 'y' on x86/x86-64, non-preemptive kernel) and +the "debug.kprobes_optimization" kernel parameter is set to 1 (see +sysctl(8)), Kprobes tries to reduce probe-hit overhead by using a jump +instruction instead of a breakpoint instruction at each probepoint. + +1.4.1 Init a Kprobe + +When a probe is registered, before attempting this optimization, +Kprobes inserts an ordinary, breakpoint-based kprobe at the specified +address. So, even if it's not possible to optimize this particular +probepoint, there'll be a probe there. + +1.4.2 Safety Check + +Before optimizing a probe, Kprobes performs the following safety checks: + +- Kprobes verifies that the region that will be replaced by the jump +instruction (the "optimized region") lies entirely within one function. +(A jump instruction is multiple bytes, and so may overlay multiple +instructions.) + +- Kprobes analyzes the entire function and verifies that there is no +jump into the optimized region. Specifically: + - the function contains no indirect jump; + - the function contains no instruction that causes an exception (since + the fixup code triggered by the exception could jump back into the + optimized region -- Kprobes checks the exception tables to verify this); + and + - there is no near jump to the optimized region (other than to the first + byte). + +- For each instruction in the optimized region, Kprobes verifies that +the instruction can be executed out of line. + +1.4.3 Preparing Detour Buffer + +Next, Kprobes prepares a "detour" buffer, which contains the following +instruction sequence: +- code to push the CPU's registers (emulating a breakpoint trap) +- a call to the trampoline code which calls user's probe handlers. +- code to restore registers +- the instructions from the optimized region +- a jump back to the original execution path. + +1.4.4 Pre-optimization + +After preparing the detour buffer, Kprobes verifies that none of the +following situations exist: +- The probe has either a break_handler (i.e., it's a jprobe) or a +post_handler. +- Other instructions in the optimized region are probed. +- The probe is disabled. +In any of the above cases, Kprobes won't start optimizing the probe. +Since these are temporary situations, Kprobes tries to start +optimizing it again if the situation is changed. + +If the kprobe can be optimized, Kprobes enqueues the kprobe to an +optimizing list, and kicks the kprobe-optimizer workqueue to optimize +it. If the to-be-optimized probepoint is hit before being optimized, +Kprobes returns control to the original instruction path by setting +the CPU's instruction pointer to the copied code in the detour buffer +-- thus at least avoiding the single-step. + +1.4.5 Optimization + +The Kprobe-optimizer doesn't insert the jump instruction immediately; +rather, it calls synchronize_sched() for safety first, because it's +possible for a CPU to be interrupted in the middle of executing the +optimized region(*). As you know, synchronize_sched() can ensure +that all interruptions that were active when synchronize_sched() +was called are done, but only if CONFIG_PREEMPT=n. So, this version +of kprobe optimization supports only kernels with CONFIG_PREEMPT=n.(**) + +After that, the Kprobe-optimizer calls stop_machine() to replace +the optimized region with a jump instruction to the detour buffer, +using text_poke_smp(). + +1.4.6 Unoptimization + +When an optimized kprobe is unregistered, disabled, or blocked by +another kprobe, it will be unoptimized. If this happens before +the optimization is complete, the kprobe is just dequeued from the +optimized list. If the optimization has been done, the jump is +replaced with the original code (except for an int3 breakpoint in +the first byte) by using text_poke_smp(). + +(*)Please imagine that the 2nd instruction is interrupted and then +the optimizer replaces the 2nd instruction with the jump *address* +while the interrupt handler is running. When the interrupt +returns to original address, there is no valid instruction, +and it causes an unexpected result. + +(**)This optimization-safety checking may be replaced with the +stop-machine method that ksplice uses for supporting a CONFIG_PREEMPT=y +kernel. + +NOTE for geeks: +The jump optimization changes the kprobe's pre_handler behavior. +Without optimization, the pre_handler can change the kernel's execution +path by changing regs->ip and returning 1. However, when the probe +is optimized, that modification is ignored. Thus, if you want to +tweak the kernel's execution path, you need to suppress optimization, +using one of the following techniques: +- Specify an empty function for the kprobe's post_handler or break_handler. + or +- Execute 'sysctl -w debug.kprobes_optimization=n' + +1.5 Blacklist + +Kprobes can probe most of the kernel except itself. This means +that there are some functions where kprobes cannot probe. Probing +(trapping) such functions can cause a recursive trap (e.g. double +fault) or the nested probe handler may never be called. +Kprobes manages such functions as a blacklist. +If you want to add a function into the blacklist, you just need +to (1) include linux/kprobes.h and (2) use NOKPROBE_SYMBOL() macro +to specify a blacklisted function. +Kprobes checks the given probe address against the blacklist and +rejects registering it, if the given address is in the blacklist. + 2. Architectures Supported Kprobes, jprobes, and return probes are implemented on the following architectures: -- i386 -- x86_64 (AMD-64, E64MT) +- i386 (Supports jump optimization) +- x86_64 (AMD-64, EM64T) (Supports jump optimization) - ppc64 -- ia64 (Support for probes on certain instruction types is still in progress.) +- ia64 (Does not support probes on instruction slot1.) - sparc64 (Return probes not yet implemented.) +- arm +- ppc +- mips 3. Configuring Kprobes When configuring the kernel using make menuconfig/xconfig/oldconfig, -ensure that CONFIG_KPROBES is set to "y". Under "Kernel hacking", -look for "Kprobes". You may have to enable "Kernel debugging" -(CONFIG_DEBUG_KERNEL) before you can enable Kprobes. +ensure that CONFIG_KPROBES is set to "y". Under "Instrumentation +Support", look for "Kprobes". + +So that you can load and unload Kprobes-based instrumentation modules, +make sure "Loadable module support" (CONFIG_MODULES) and "Module +unloading" (CONFIG_MODULE_UNLOAD) are set to "y". -You may also want to ensure that CONFIG_KALLSYMS and perhaps even -CONFIG_KALLSYMS_ALL are set to "y", since kallsyms_lookup_name() -is a handy, version-independent way to find a function's address. +Also make sure that CONFIG_KALLSYMS and perhaps even CONFIG_KALLSYMS_ALL +are set to "y", since kallsyms_lookup_name() is used by the in-kernel +kprobe address resolution code. If you need to insert a probe in the middle of a function, you may find it useful to "Compile the kernel with debug info" (CONFIG_DEBUG_INFO), @@ -160,9 +323,11 @@ code mapping. 4. API Reference The Kprobes API includes a "register" function and an "unregister" -function for each type of probe. Here are terse, mini-man-page -specifications for these functions and the associated probe handlers -that you'll write. See the latter half of this document for examples. +function for each type of probe. The API also includes "register_*probes" +and "unregister_*probes" functions for (un)registering arrays of probes. +Here are terse, mini-man-page specifications for these functions and +the associated probe handlers that you'll write. See the files in the +samples/kprobes/ sub-directory for examples. 4.1 register_kprobe @@ -174,7 +339,30 @@ hit, Kprobes calls kp->pre_handler. After the probed instruction is single-stepped, Kprobe calls kp->post_handler. If a fault occurs during execution of kp->pre_handler or kp->post_handler, or during single-stepping of the probed instruction, Kprobes calls -kp->fault_handler. Any or all handlers can be NULL. +kp->fault_handler. Any or all handlers can be NULL. If kp->flags +is set KPROBE_FLAG_DISABLED, that kp will be registered but disabled, +so, its handlers aren't hit until calling enable_kprobe(kp). + +NOTE: +1. With the introduction of the "symbol_name" field to struct kprobe, +the probepoint address resolution will now be taken care of by the kernel. +The following will now work: + + kp.symbol_name = "symbol_name"; + +(64-bit powerpc intricacies such as function descriptors are handled +transparently) + +2. Use the "offset" field of struct kprobe if the offset into the symbol +to install a probepoint is known. This field is used to calculate the +probepoint. + +3. Specify either the kprobe "symbol_name" OR the "addr". If both are +specified, kprobe registration will fail with -EINVAL. + +4. With CISC architectures (such as i386 and x86_64), the kprobes code +does not validate if the kprobe.addr is at an instruction boundary. +Use "offset" with caution. register_kprobe() returns 0 on success, or a negative errno otherwise. @@ -218,9 +406,9 @@ Kprobes runs the handler whose address is jp->entry. The handler should have the same arg list and return type as the probed function; and just before it returns, it must call jprobe_return(). (The handler never actually returns, since jprobe_return() returns -control to Kprobes.) If the probed function is declared asmlinkage, -fastcall, or anything else that affects how args are passed, the -handler's declaration must match. +control to Kprobes.) If the probed function is declared asmlinkage +or anything else that affects how args are passed, the handler's +declaration must match. register_jprobe() returns 0 on success, or a negative errno otherwise. @@ -248,6 +436,13 @@ of interest: - ret_addr: the return address - rp: points to the corresponding kretprobe object - task: points to the corresponding task struct +- data: points to per return-instance private data; see "Kretprobe + entry-handler" for details. + +The regs_return_value(regs) macro provides a simple abstraction to +extract the return value from the appropriate register as defined by +the architecture's ABI. + The handler's return value is currently ignored. 4.4 unregister_*probe @@ -260,20 +455,80 @@ void unregister_kretprobe(struct kretprobe *rp); Removes the specified probe. The unregister function can be called at any time after the probe has been registered. +NOTE: +If the functions find an incorrect probe (ex. an unregistered probe), +they clear the addr field of the probe. + +4.5 register_*probes + +#include <linux/kprobes.h> +int register_kprobes(struct kprobe **kps, int num); +int register_kretprobes(struct kretprobe **rps, int num); +int register_jprobes(struct jprobe **jps, int num); + +Registers each of the num probes in the specified array. If any +error occurs during registration, all probes in the array, up to +the bad probe, are safely unregistered before the register_*probes +function returns. +- kps/rps/jps: an array of pointers to *probe data structures +- num: the number of the array entries. + +NOTE: +You have to allocate(or define) an array of pointers and set all +of the array entries before using these functions. + +4.6 unregister_*probes + +#include <linux/kprobes.h> +void unregister_kprobes(struct kprobe **kps, int num); +void unregister_kretprobes(struct kretprobe **rps, int num); +void unregister_jprobes(struct jprobe **jps, int num); + +Removes each of the num probes in the specified array at once. + +NOTE: +If the functions find some incorrect probes (ex. unregistered +probes) in the specified array, they clear the addr field of those +incorrect probes. However, other probes in the array are +unregistered correctly. + +4.7 disable_*probe + +#include <linux/kprobes.h> +int disable_kprobe(struct kprobe *kp); +int disable_kretprobe(struct kretprobe *rp); +int disable_jprobe(struct jprobe *jp); + +Temporarily disables the specified *probe. You can enable it again by using +enable_*probe(). You must specify the probe which has been registered. + +4.8 enable_*probe + +#include <linux/kprobes.h> +int enable_kprobe(struct kprobe *kp); +int enable_kretprobe(struct kretprobe *rp); +int enable_jprobe(struct jprobe *jp); + +Enables *probe which has been disabled by disable_*probe(). You must specify +the probe which has been registered. + 5. Kprobes Features and Limitations -As of Linux v2.6.12, Kprobes allows multiple probes at the same -address. Currently, however, there cannot be multiple jprobes on -the same function at the same time. +Kprobes allows multiple probes at the same address. Currently, +however, there cannot be multiple jprobes on the same function at +the same time. Also, a probepoint for which there is a jprobe or +a post_handler cannot be optimized. So if you install a jprobe, +or a kprobe with a post_handler, at an optimized probepoint, the +probepoint will be unoptimized automatically. In general, you can install a probe anywhere in the kernel. In particular, you can probe interrupt handlers. Known exceptions are discussed in this section. -For obvious reasons, it's a bad idea to install a probe in -the code that implements Kprobes (mostly kernel/kprobes.c and -arch/*/kernel/kprobes.c). A patch in the v2.6.13 timeframe instructs -Kprobes to reject such requests. +The register_*probe functions will return -EINVAL if you attempt +to install a probe in the code that implements Kprobes (mostly +kernel/kprobes.c and arch/*/kernel/kprobes.c, but also functions such +as do_page_fault and notifier_call_chain). If you install a probe in an inline-able function, Kprobes makes no attempt to chase down all inline instances of the function and @@ -290,24 +545,22 @@ from the accidental ones. Don't drink and probe. Kprobes makes no attempt to prevent probe handlers from stepping on each other -- e.g., probing printk() and then calling printk() from a -probe handler. As of Linux v2.6.12, if a probe handler hits a probe, -that second probe's handlers won't be run in that instance. - -In Linux v2.6.12 and previous versions, Kprobes' data structures are -protected by a single lock that is held during probe registration and -unregistration and while handlers are run. Thus, no two handlers -can run simultaneously. To improve scalability on SMP systems, -this restriction will probably be removed soon, in which case -multiple handlers (or multiple instances of the same handler) may -run concurrently on different CPUs. Code your handlers accordingly. - -Kprobes does not use semaphores or allocate memory except during +probe handler. If a probe handler hits a probe, that second probe's +handlers won't be run in that instance, and the kprobe.nmissed member +of the second probe will be incremented. + +As of Linux v2.6.15-rc1, multiple handlers (or multiple instances of +the same handler) may run concurrently on different CPUs. + +Kprobes does not use mutexes or allocate memory except during registration and unregistration. Probe handlers are run with preemption disabled. Depending on the -architecture, handlers may also run with interrupts disabled. In any -case, your handler should not yield the CPU (e.g., by attempting to -acquire a semaphore). +architecture and optimization state, handlers may also run with +interrupts disabled (e.g., kretprobe handlers and optimized kprobe +handlers run without interrupt disabled on x86/x86-64). In any case, +your handler should not yield the CPU (e.g., by attempting to acquire +a semaphore). Since a return probe is implemented by replacing the return address with the trampoline's address, stack backtraces and calls @@ -316,11 +569,53 @@ address instead of the real return address for kretprobed functions. (As far as we can tell, __builtin_return_address() is used only for instrumentation and error reporting.) -If the number of times a function is called does not match the -number of times it returns, registering a return probe on that -function may produce undesirable results. We have the do_exit() -and do_execve() cases covered. do_fork() is not an issue. We're -unaware of other specific cases where this could be a problem. +If the number of times a function is called does not match the number +of times it returns, registering a return probe on that function may +produce undesirable results. In such a case, a line: +kretprobe BUG!: Processing kretprobe d000000000041aa8 @ c00000000004f48c +gets printed. With this information, one will be able to correlate the +exact instance of the kretprobe that caused the problem. We have the +do_exit() case covered. do_execve() and do_fork() are not an issue. +We're unaware of other specific cases where this could be a problem. + +If, upon entry to or exit from a function, the CPU is running on +a stack other than that of the current task, registering a return +probe on that function may produce undesirable results. For this +reason, Kprobes doesn't support return probes (or kprobes or jprobes) +on the x86_64 version of __switch_to(); the registration functions +return -EINVAL. + +On x86/x86-64, since the Jump Optimization of Kprobes modifies +instructions widely, there are some limitations to optimization. To +explain it, we introduce some terminology. Imagine a 3-instruction +sequence consisting of a two 2-byte instructions and one 3-byte +instruction. + + IA + | +[-2][-1][0][1][2][3][4][5][6][7] + [ins1][ins2][ ins3 ] + [<- DCR ->] + [<- JTPR ->] + +ins1: 1st Instruction +ins2: 2nd Instruction +ins3: 3rd Instruction +IA: Insertion Address +JTPR: Jump Target Prohibition Region +DCR: Detoured Code Region + +The instructions in DCR are copied to the out-of-line buffer +of the kprobe, because the bytes in DCR are replaced by +a 5-byte jump instruction. So there are several limitations. + +a) The instructions in DCR must be relocatable. +b) The instructions in DCR must not include a call instruction. +c) JTPR must not be targeted by any jump or call instruction. +d) DCR must not straddle the border between functions. + +Anyway, these limitations are checked by the in-kernel instruction +decoder, so you don't need to worry about that. 6. Probe Overhead @@ -345,244 +640,90 @@ k = 0.49 usec; j = 0.76; r = 0.80; kr = 0.82; jr = 1.07 ppc64: POWER5 (gr), 1656 MHz (SMT disabled, 1 virtual CPU per physical CPU) k = 0.77 usec; j = 1.31; r = 1.26; kr = 1.45; jr = 1.99 +6.1 Optimized Probe Overhead + +Typically, an optimized kprobe hit takes 0.07 to 0.1 microseconds to +process. Here are sample overhead figures (in usec) for x86 architectures. +k = unoptimized kprobe, b = boosted (single-step skipped), o = optimized kprobe, +r = unoptimized kretprobe, rb = boosted kretprobe, ro = optimized kretprobe. + +i386: Intel(R) Xeon(R) E5410, 2.33GHz, 4656.90 bogomips +k = 0.80 usec; b = 0.33; o = 0.05; r = 1.10; rb = 0.61; ro = 0.33 + +x86-64: Intel(R) Xeon(R) E5410, 2.33GHz, 4656.90 bogomips +k = 0.99 usec; b = 0.43; o = 0.06; r = 1.24; rb = 0.68; ro = 0.30 + 7. TODO -a. SystemTap (http://sourceware.org/systemtap): Work in progress -to provide a simplified programming interface for probe-based -instrumentation. -b. Improved SMP scalability: Currently, work is in progress to handle -multiple kprobes in parallel. -c. Kernel return probes for sparc64. -d. Support for other architectures. -e. User-space probes. +a. SystemTap (http://sourceware.org/systemtap): Provides a simplified +programming interface for probe-based instrumentation. Try it out. +b. Kernel return probes for sparc64. +c. Support for other architectures. +d. User-space probes. +e. Watchpoint probes (which fire on data references). 8. Kprobes Example -Here's a sample kernel module showing the use of kprobes to dump a -stack trace and selected i386 registers when do_fork() is called. ------ cut here ----- -/*kprobe_example.c*/ -#include <linux/kernel.h> -#include <linux/module.h> -#include <linux/kprobes.h> -#include <linux/kallsyms.h> -#include <linux/sched.h> - -/*For each probe you need to allocate a kprobe structure*/ -static struct kprobe kp; - -/*kprobe pre_handler: called just before the probed instruction is executed*/ -int handler_pre(struct kprobe *p, struct pt_regs *regs) -{ - printk("pre_handler: p->addr=0x%p, eip=%lx, eflags=0x%lx\n", - p->addr, regs->eip, regs->eflags); - dump_stack(); - return 0; -} - -/*kprobe post_handler: called after the probed instruction is executed*/ -void handler_post(struct kprobe *p, struct pt_regs *regs, unsigned long flags) -{ - printk("post_handler: p->addr=0x%p, eflags=0x%lx\n", - p->addr, regs->eflags); -} - -/* fault_handler: this is called if an exception is generated for any - * instruction within the pre- or post-handler, or when Kprobes - * single-steps the probed instruction. - */ -int handler_fault(struct kprobe *p, struct pt_regs *regs, int trapnr) -{ - printk("fault_handler: p->addr=0x%p, trap #%dn", - p->addr, trapnr); - /* Return 0 because we don't handle the fault. */ - return 0; -} - -int init_module(void) -{ - int ret; - kp.pre_handler = handler_pre; - kp.post_handler = handler_post; - kp.fault_handler = handler_fault; - kp.addr = (kprobe_opcode_t*) kallsyms_lookup_name("do_fork"); - /* register the kprobe now */ - if (!kp.addr) { - printk("Couldn't find %s to plant kprobe\n", "do_fork"); - return -1; - } - if ((ret = register_kprobe(&kp) < 0)) { - printk("register_kprobe failed, returned %d\n", ret); - return -1; - } - printk("kprobe registered\n"); - return 0; -} - -void cleanup_module(void) -{ - unregister_kprobe(&kp); - printk("kprobe unregistered\n"); -} - -MODULE_LICENSE("GPL"); ------ cut here ----- - -You can build the kernel module, kprobe-example.ko, using the following -Makefile: ------ cut here ----- -obj-m := kprobe-example.o -KDIR := /lib/modules/$(shell uname -r)/build -PWD := $(shell pwd) -default: - $(MAKE) -C $(KDIR) SUBDIRS=$(PWD) modules -clean: - rm -f *.mod.c *.ko *.o ------ cut here ----- - -$ make -$ su - -... -# insmod kprobe-example.ko - -You will see the trace data in /var/log/messages and on the console -whenever do_fork() is invoked to create a new process. +See samples/kprobes/kprobe_example.c 9. Jprobes Example -Here's a sample kernel module showing the use of jprobes to dump -the arguments of do_fork(). ------ cut here ----- -/*jprobe-example.c */ -#include <linux/kernel.h> -#include <linux/module.h> -#include <linux/fs.h> -#include <linux/uio.h> -#include <linux/kprobes.h> -#include <linux/kallsyms.h> - -/* - * Jumper probe for do_fork. - * Mirror principle enables access to arguments of the probed routine - * from the probe handler. - */ - -/* Proxy routine having the same arguments as actual do_fork() routine */ -long jdo_fork(unsigned long clone_flags, unsigned long stack_start, - struct pt_regs *regs, unsigned long stack_size, - int __user * parent_tidptr, int __user * child_tidptr) -{ - printk("jprobe: clone_flags=0x%lx, stack_size=0x%lx, regs=0x%p\n", - clone_flags, stack_size, regs); - /* Always end with a call to jprobe_return(). */ - jprobe_return(); - /*NOTREACHED*/ - return 0; -} - -static struct jprobe my_jprobe = { - .entry = (kprobe_opcode_t *) jdo_fork -}; - -int init_module(void) -{ - int ret; - my_jprobe.kp.addr = (kprobe_opcode_t *) kallsyms_lookup_name("do_fork"); - if (!my_jprobe.kp.addr) { - printk("Couldn't find %s to plant jprobe\n", "do_fork"); - return -1; - } - - if ((ret = register_jprobe(&my_jprobe)) <0) { - printk("register_jprobe failed, returned %d\n", ret); - return -1; - } - printk("Planted jprobe at %p, handler addr %p\n", - my_jprobe.kp.addr, my_jprobe.entry); - return 0; -} - -void cleanup_module(void) -{ - unregister_jprobe(&my_jprobe); - printk("jprobe unregistered\n"); -} - -MODULE_LICENSE("GPL"); ------ cut here ----- - -Build and insert the kernel module as shown in the above kprobe -example. You will see the trace data in /var/log/messages and on -the console whenever do_fork() is invoked to create a new process. -(Some messages may be suppressed if syslogd is configured to -eliminate duplicate messages.) +See samples/kprobes/jprobe_example.c 10. Kretprobes Example -Here's a sample kernel module showing the use of return probes to -report failed calls to sys_open(). ------ cut here ----- -/*kretprobe-example.c*/ -#include <linux/kernel.h> -#include <linux/module.h> -#include <linux/kprobes.h> -#include <linux/kallsyms.h> - -static const char *probed_func = "sys_open"; - -/* Return-probe handler: If the probed function fails, log the return value. */ -static int ret_handler(struct kretprobe_instance *ri, struct pt_regs *regs) -{ - // Substitute the appropriate register name for your architecture -- - // e.g., regs->rax for x86_64, regs->gpr[3] for ppc64. - int retval = (int) regs->eax; - if (retval < 0) { - printk("%s returns %d\n", probed_func, retval); - } - return 0; -} - -static struct kretprobe my_kretprobe = { - .handler = ret_handler, - /* Probe up to 20 instances concurrently. */ - .maxactive = 20 -}; - -int init_module(void) -{ - int ret; - my_kretprobe.kp.addr = - (kprobe_opcode_t *) kallsyms_lookup_name(probed_func); - if (!my_kretprobe.kp.addr) { - printk("Couldn't find %s to plant return probe\n", probed_func); - return -1; - } - if ((ret = register_kretprobe(&my_kretprobe)) < 0) { - printk("register_kretprobe failed, returned %d\n", ret); - return -1; - } - printk("Planted return probe at %p\n", my_kretprobe.kp.addr); - return 0; -} - -void cleanup_module(void) -{ - unregister_kretprobe(&my_kretprobe); - printk("kretprobe unregistered\n"); - /* nmissed > 0 suggests that maxactive was set too low. */ - printk("Missed probing %d instances of %s\n", - my_kretprobe.nmissed, probed_func); -} - -MODULE_LICENSE("GPL"); ------ cut here ----- - -Build and insert the kernel module as shown in the above kprobe -example. You will see the trace data in /var/log/messages and on the -console whenever sys_open() returns a negative value. (Some messages -may be suppressed if syslogd is configured to eliminate duplicate -messages.) +See samples/kprobes/kretprobe_example.c For additional information on Kprobes, refer to the following URLs: http://www-106.ibm.com/developerworks/library/l-kprobes.html?ca=dgr-lnxw42Kprobe http://www.redhat.com/magazine/005mar05/features/kprobes/ +http://www-users.cs.umn.edu/~boutcher/kprobes/ +http://www.linuxsymposium.org/2006/linuxsymposium_procv2.pdf (pages 101-115) + + +Appendix A: The kprobes debugfs interface + +With recent kernels (> 2.6.20) the list of registered kprobes is visible +under the /sys/kernel/debug/kprobes/ directory (assuming debugfs is mounted at //sys/kernel/debug). + +/sys/kernel/debug/kprobes/list: Lists all registered probes on the system + +c015d71a k vfs_read+0x0 +c011a316 j do_fork+0x0 +c03dedc5 r tcp_v4_rcv+0x0 + +The first column provides the kernel address where the probe is inserted. +The second column identifies the type of probe (k - kprobe, r - kretprobe +and j - jprobe), while the third column specifies the symbol+offset of +the probe. If the probed function belongs to a module, the module name +is also specified. Following columns show probe status. If the probe is on +a virtual address that is no longer valid (module init sections, module +virtual addresses that correspond to modules that've been unloaded), +such probes are marked with [GONE]. If the probe is temporarily disabled, +such probes are marked with [DISABLED]. If the probe is optimized, it is +marked with [OPTIMIZED]. + +/sys/kernel/debug/kprobes/enabled: Turn kprobes ON/OFF forcibly. + +Provides a knob to globally and forcibly turn registered kprobes ON or OFF. +By default, all kprobes are enabled. By echoing "0" to this file, all +registered probes will be disarmed, till such time a "1" is echoed to this +file. Note that this knob just disarms and arms all kprobes and doesn't +change each probe's disabling state. This means that disabled kprobes (marked +[DISABLED]) will be not enabled if you turn ON all kprobes by this knob. + + +Appendix B: The kprobes sysctl interface + +/proc/sys/debug/kprobes-optimization: Turn kprobes optimization ON/OFF. + +When CONFIG_OPTPROBES=y, this sysctl interface appears and it provides +a knob to globally and forcibly turn jump optimization (see section +1.4) ON or OFF. By default, jump optimization is allowed (ON). +If you echo "0" to this file or set "debug.kprobes_optimization" to +0 via sysctl, all optimized probes will be unoptimized, and any new +probes registered after that will not be optimized. Note that this +knob *changes* the optimized state. This means that optimized probes +(marked [OPTIMIZED]) will be unoptimized ([OPTIMIZED] tag will be +removed). If the knob is turned on, they will be optimized again. + |
