aboutsummaryrefslogtreecommitdiff
path: root/Documentation/trace
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation/trace')
-rw-r--r--Documentation/trace/events-power.txt2
-rw-r--r--Documentation/trace/events.txt207
-rw-r--r--Documentation/trace/ftrace-design.txt5
-rw-r--r--Documentation/trace/ftrace.txt32
-rw-r--r--Documentation/trace/postprocess/trace-vmscan-postprocess.pl32
-rw-r--r--Documentation/trace/ring-buffer-design.txt2
-rw-r--r--Documentation/trace/tracepoints.txt29
-rw-r--r--Documentation/trace/uprobetracer.txt36
8 files changed, 312 insertions, 33 deletions
diff --git a/Documentation/trace/events-power.txt b/Documentation/trace/events-power.txt
index 3bd33b8dc7c..21d514ced21 100644
--- a/Documentation/trace/events-power.txt
+++ b/Documentation/trace/events-power.txt
@@ -92,5 +92,5 @@ dev_pm_qos_remove_request "device=%s type=%s new_value=%d"
The first parameter gives the device name which tries to add/update/remove
QoS requests.
-The second parameter gives the request type (e.g. "DEV_PM_QOS_LATENCY").
+The second parameter gives the request type (e.g. "DEV_PM_QOS_RESUME_LATENCY").
The third parameter is value to be added/updated/removed.
diff --git a/Documentation/trace/events.txt b/Documentation/trace/events.txt
index 37732a220d3..75d25a1d6e4 100644
--- a/Documentation/trace/events.txt
+++ b/Documentation/trace/events.txt
@@ -287,3 +287,210 @@ their old filters):
prev_pid == 0
# cat sched_wakeup/filter
common_pid == 0
+
+6. Event triggers
+=================
+
+Trace events can be made to conditionally invoke trigger 'commands'
+which can take various forms and are described in detail below;
+examples would be enabling or disabling other trace events or invoking
+a stack trace whenever the trace event is hit. Whenever a trace event
+with attached triggers is invoked, the set of trigger commands
+associated with that event is invoked. Any given trigger can
+additionally have an event filter of the same form as described in
+section 5 (Event filtering) associated with it - the command will only
+be invoked if the event being invoked passes the associated filter.
+If no filter is associated with the trigger, it always passes.
+
+Triggers are added to and removed from a particular event by writing
+trigger expressions to the 'trigger' file for the given event.
+
+A given event can have any number of triggers associated with it,
+subject to any restrictions that individual commands may have in that
+regard.
+
+Event triggers are implemented on top of "soft" mode, which means that
+whenever a trace event has one or more triggers associated with it,
+the event is activated even if it isn't actually enabled, but is
+disabled in a "soft" mode. That is, the tracepoint will be called,
+but just will not be traced, unless of course it's actually enabled.
+This scheme allows triggers to be invoked even for events that aren't
+enabled, and also allows the current event filter implementation to be
+used for conditionally invoking triggers.
+
+The syntax for event triggers is roughly based on the syntax for
+set_ftrace_filter 'ftrace filter commands' (see the 'Filter commands'
+section of Documentation/trace/ftrace.txt), but there are major
+differences and the implementation isn't currently tied to it in any
+way, so beware about making generalizations between the two.
+
+6.1 Expression syntax
+---------------------
+
+Triggers are added by echoing the command to the 'trigger' file:
+
+ # echo 'command[:count] [if filter]' > trigger
+
+Triggers are removed by echoing the same command but starting with '!'
+to the 'trigger' file:
+
+ # echo '!command[:count] [if filter]' > trigger
+
+The [if filter] part isn't used in matching commands when removing, so
+leaving that off in a '!' command will accomplish the same thing as
+having it in.
+
+The filter syntax is the same as that described in the 'Event
+filtering' section above.
+
+For ease of use, writing to the trigger file using '>' currently just
+adds or removes a single trigger and there's no explicit '>>' support
+('>' actually behaves like '>>') or truncation support to remove all
+triggers (you have to use '!' for each one added.)
+
+6.2 Supported trigger commands
+------------------------------
+
+The following commands are supported:
+
+- enable_event/disable_event
+
+ These commands can enable or disable another trace event whenever
+ the triggering event is hit. When these commands are registered,
+ the other trace event is activated, but disabled in a "soft" mode.
+ That is, the tracepoint will be called, but just will not be traced.
+ The event tracepoint stays in this mode as long as there's a trigger
+ in effect that can trigger it.
+
+ For example, the following trigger causes kmalloc events to be
+ traced when a read system call is entered, and the :1 at the end
+ specifies that this enablement happens only once:
+
+ # echo 'enable_event:kmem:kmalloc:1' > \
+ /sys/kernel/debug/tracing/events/syscalls/sys_enter_read/trigger
+
+ The following trigger causes kmalloc events to stop being traced
+ when a read system call exits. This disablement happens on every
+ read system call exit:
+
+ # echo 'disable_event:kmem:kmalloc' > \
+ /sys/kernel/debug/tracing/events/syscalls/sys_exit_read/trigger
+
+ The format is:
+
+ enable_event:<system>:<event>[:count]
+ disable_event:<system>:<event>[:count]
+
+ To remove the above commands:
+
+ # echo '!enable_event:kmem:kmalloc:1' > \
+ /sys/kernel/debug/tracing/events/syscalls/sys_enter_read/trigger
+
+ # echo '!disable_event:kmem:kmalloc' > \
+ /sys/kernel/debug/tracing/events/syscalls/sys_exit_read/trigger
+
+ Note that there can be any number of enable/disable_event triggers
+ per triggering event, but there can only be one trigger per
+ triggered event. e.g. sys_enter_read can have triggers enabling both
+ kmem:kmalloc and sched:sched_switch, but can't have two kmem:kmalloc
+ versions such as kmem:kmalloc and kmem:kmalloc:1 or 'kmem:kmalloc if
+ bytes_req == 256' and 'kmem:kmalloc if bytes_alloc == 256' (they
+ could be combined into a single filter on kmem:kmalloc though).
+
+- stacktrace
+
+ This command dumps a stacktrace in the trace buffer whenever the
+ triggering event occurs.
+
+ For example, the following trigger dumps a stacktrace every time the
+ kmalloc tracepoint is hit:
+
+ # echo 'stacktrace' > \
+ /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger
+
+ The following trigger dumps a stacktrace the first 5 times a kmalloc
+ request happens with a size >= 64K
+
+ # echo 'stacktrace:5 if bytes_req >= 65536' > \
+ /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger
+
+ The format is:
+
+ stacktrace[:count]
+
+ To remove the above commands:
+
+ # echo '!stacktrace' > \
+ /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger
+
+ # echo '!stacktrace:5 if bytes_req >= 65536' > \
+ /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger
+
+ The latter can also be removed more simply by the following (without
+ the filter):
+
+ # echo '!stacktrace:5' > \
+ /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger
+
+ Note that there can be only one stacktrace trigger per triggering
+ event.
+
+- snapshot
+
+ This command causes a snapshot to be triggered whenever the
+ triggering event occurs.
+
+ The following command creates a snapshot every time a block request
+ queue is unplugged with a depth > 1. If you were tracing a set of
+ events or functions at the time, the snapshot trace buffer would
+ capture those events when the trigger event occurred:
+
+ # echo 'snapshot if nr_rq > 1' > \
+ /sys/kernel/debug/tracing/events/block/block_unplug/trigger
+
+ To only snapshot once:
+
+ # echo 'snapshot:1 if nr_rq > 1' > \
+ /sys/kernel/debug/tracing/events/block/block_unplug/trigger
+
+ To remove the above commands:
+
+ # echo '!snapshot if nr_rq > 1' > \
+ /sys/kernel/debug/tracing/events/block/block_unplug/trigger
+
+ # echo '!snapshot:1 if nr_rq > 1' > \
+ /sys/kernel/debug/tracing/events/block/block_unplug/trigger
+
+ Note that there can be only one snapshot trigger per triggering
+ event.
+
+- traceon/traceoff
+
+ These commands turn tracing on and off when the specified events are
+ hit. The parameter determines how many times the tracing system is
+ turned on and off. If unspecified, there is no limit.
+
+ The following command turns tracing off the first time a block
+ request queue is unplugged with a depth > 1. If you were tracing a
+ set of events or functions at the time, you could then examine the
+ trace buffer to see the sequence of events that led up to the
+ trigger event:
+
+ # echo 'traceoff:1 if nr_rq > 1' > \
+ /sys/kernel/debug/tracing/events/block/block_unplug/trigger
+
+ To always disable tracing when nr_rq > 1 :
+
+ # echo 'traceoff if nr_rq > 1' > \
+ /sys/kernel/debug/tracing/events/block/block_unplug/trigger
+
+ To remove the above commands:
+
+ # echo '!traceoff:1 if nr_rq > 1' > \
+ /sys/kernel/debug/tracing/events/block/block_unplug/trigger
+
+ # echo '!traceoff if nr_rq > 1' > \
+ /sys/kernel/debug/tracing/events/block/block_unplug/trigger
+
+ Note that there can be only one traceon or traceoff trigger per
+ triggering event.
diff --git a/Documentation/trace/ftrace-design.txt b/Documentation/trace/ftrace-design.txt
index 79fcafc7fd6..3f669b9e885 100644
--- a/Documentation/trace/ftrace-design.txt
+++ b/Documentation/trace/ftrace-design.txt
@@ -358,11 +358,8 @@ Every arch has an init callback function. If you need to do something early on
to initialize some state, this is the time to do that. Otherwise, this simple
function below should be sufficient for most people:
-int __init ftrace_dyn_arch_init(void *data)
+int __init ftrace_dyn_arch_init(void)
{
- /* return value is done indirectly via data */
- *(unsigned long *)data = 0;
-
return 0;
}
diff --git a/Documentation/trace/ftrace.txt b/Documentation/trace/ftrace.txt
index ea2d35d64d2..2479b2a0c77 100644
--- a/Documentation/trace/ftrace.txt
+++ b/Documentation/trace/ftrace.txt
@@ -655,7 +655,11 @@ explains which is which.
read the irq flags variable, an 'X' will always
be printed here.
- need-resched: 'N' task need_resched is set, '.' otherwise.
+ need-resched:
+ 'N' both TIF_NEED_RESCHED and PREEMPT_NEED_RESCHED is set,
+ 'n' only TIF_NEED_RESCHED is set,
+ 'p' only PREEMPT_NEED_RESCHED is set,
+ '.' otherwise.
hardirq/softirq:
'H' - hard irq occurred inside a softirq.
@@ -1999,6 +2003,32 @@ want, depending on your needs.
360.774530 | 1) 0.594 us | __phys_addr();
+The function name is always displayed after the closing bracket
+for a function if the start of that function is not in the
+trace buffer.
+
+Display of the function name after the closing bracket may be
+enabled for functions whose start is in the trace buffer,
+allowing easier searching with grep for function durations.
+It is default disabled.
+
+ hide: echo nofuncgraph-tail > trace_options
+ show: echo funcgraph-tail > trace_options
+
+ Example with nofuncgraph-tail (default):
+ 0) | putname() {
+ 0) | kmem_cache_free() {
+ 0) 0.518 us | __phys_addr();
+ 0) 1.757 us | }
+ 0) 2.861 us | }
+
+ Example with funcgraph-tail:
+ 0) | putname() {
+ 0) | kmem_cache_free() {
+ 0) 0.518 us | __phys_addr();
+ 0) 1.757 us | } /* kmem_cache_free() */
+ 0) 2.861 us | } /* putname() */
+
You can put some comments on specific functions by using
trace_printk() For example, if you want to put a comment inside
the __might_sleep() function, you just have to include
diff --git a/Documentation/trace/postprocess/trace-vmscan-postprocess.pl b/Documentation/trace/postprocess/trace-vmscan-postprocess.pl
index 4a37c4759cd..78c9a7b2b58 100644
--- a/Documentation/trace/postprocess/trace-vmscan-postprocess.pl
+++ b/Documentation/trace/postprocess/trace-vmscan-postprocess.pl
@@ -47,7 +47,6 @@ use constant HIGH_KSWAPD_REWAKEUP => 21;
use constant HIGH_NR_SCANNED => 22;
use constant HIGH_NR_TAKEN => 23;
use constant HIGH_NR_RECLAIMED => 24;
-use constant HIGH_NR_CONTIG_DIRTY => 25;
my %perprocesspid;
my %perprocess;
@@ -105,7 +104,7 @@ my $regex_direct_end_default = 'nr_reclaimed=([0-9]*)';
my $regex_kswapd_wake_default = 'nid=([0-9]*) order=([0-9]*)';
my $regex_kswapd_sleep_default = 'nid=([0-9]*)';
my $regex_wakeup_kswapd_default = 'nid=([0-9]*) zid=([0-9]*) order=([0-9]*)';
-my $regex_lru_isolate_default = 'isolate_mode=([0-9]*) order=([0-9]*) nr_requested=([0-9]*) nr_scanned=([0-9]*) nr_taken=([0-9]*) contig_taken=([0-9]*) contig_dirty=([0-9]*) contig_failed=([0-9]*)';
+my $regex_lru_isolate_default = 'isolate_mode=([0-9]*) order=([0-9]*) nr_requested=([0-9]*) nr_scanned=([0-9]*) nr_taken=([0-9]*) file=([0-9]*)';
my $regex_lru_shrink_inactive_default = 'nid=([0-9]*) zid=([0-9]*) nr_scanned=([0-9]*) nr_reclaimed=([0-9]*) priority=([0-9]*) flags=([A-Z_|]*)';
my $regex_lru_shrink_active_default = 'lru=([A-Z_]*) nr_scanned=([0-9]*) nr_rotated=([0-9]*) priority=([0-9]*)';
my $regex_writepage_default = 'page=([0-9a-f]*) pfn=([0-9]*) flags=([A-Z_|]*)';
@@ -123,7 +122,7 @@ my $regex_writepage;
# Static regex used. Specified like this for readability and for use with /o
# (process_pid) (cpus ) ( time ) (tpoint ) (details)
-my $regex_traceevent = '\s*([a-zA-Z0-9-]*)\s*(\[[0-9]*\])\s*([0-9.]*):\s*([a-zA-Z_]*):\s*(.*)';
+my $regex_traceevent = '\s*([a-zA-Z0-9-]*)\s*(\[[0-9]*\])(\s*[dX.][Nnp.][Hhs.][0-9a-fA-F.]*|)\s*([0-9.]*):\s*([a-zA-Z_]*):\s*(.*)';
my $regex_statname = '[-0-9]*\s\((.*)\).*';
my $regex_statppid = '[-0-9]*\s\(.*\)\s[A-Za-z]\s([0-9]*).*';
@@ -200,7 +199,7 @@ $regex_lru_isolate = generate_traceevent_regex(
$regex_lru_isolate_default,
"isolate_mode", "order",
"nr_requested", "nr_scanned", "nr_taken",
- "contig_taken", "contig_dirty", "contig_failed");
+ "file");
$regex_lru_shrink_inactive = generate_traceevent_regex(
"vmscan/mm_vmscan_lru_shrink_inactive",
$regex_lru_shrink_inactive_default,
@@ -270,8 +269,8 @@ EVENT_PROCESS:
while ($traceevent = <STDIN>) {
if ($traceevent =~ /$regex_traceevent/o) {
$process_pid = $1;
- $timestamp = $3;
- $tracepoint = $4;
+ $timestamp = $4;
+ $tracepoint = $5;
$process_pid =~ /(.*)-([0-9]*)$/;
my $process = $1;
@@ -299,7 +298,7 @@ EVENT_PROCESS:
$perprocesspid{$process_pid}->{MM_VMSCAN_DIRECT_RECLAIM_BEGIN}++;
$perprocesspid{$process_pid}->{STATE_DIRECT_BEGIN} = $timestamp;
- $details = $5;
+ $details = $6;
if ($details !~ /$regex_direct_begin/o) {
print "WARNING: Failed to parse mm_vmscan_direct_reclaim_begin as expected\n";
print " $details\n";
@@ -322,7 +321,7 @@ EVENT_PROCESS:
$perprocesspid{$process_pid}->{HIGH_DIRECT_RECLAIM_LATENCY}[$index] = "$order-$latency";
}
} elsif ($tracepoint eq "mm_vmscan_kswapd_wake") {
- $details = $5;
+ $details = $6;
if ($details !~ /$regex_kswapd_wake/o) {
print "WARNING: Failed to parse mm_vmscan_kswapd_wake as expected\n";
print " $details\n";
@@ -356,7 +355,7 @@ EVENT_PROCESS:
} elsif ($tracepoint eq "mm_vmscan_wakeup_kswapd") {
$perprocesspid{$process_pid}->{MM_VMSCAN_WAKEUP_KSWAPD}++;
- $details = $5;
+ $details = $6;
if ($details !~ /$regex_wakeup_kswapd/o) {
print "WARNING: Failed to parse mm_vmscan_wakeup_kswapd as expected\n";
print " $details\n";
@@ -366,7 +365,7 @@ EVENT_PROCESS:
my $order = $3;
$perprocesspid{$process_pid}->{MM_VMSCAN_WAKEUP_KSWAPD_PERORDER}[$order]++;
} elsif ($tracepoint eq "mm_vmscan_lru_isolate") {
- $details = $5;
+ $details = $6;
if ($details !~ /$regex_lru_isolate/o) {
print "WARNING: Failed to parse mm_vmscan_lru_isolate as expected\n";
print " $details\n";
@@ -375,7 +374,6 @@ EVENT_PROCESS:
}
my $isolate_mode = $1;
my $nr_scanned = $4;
- my $nr_contig_dirty = $7;
# To closer match vmstat scanning statistics, only count isolate_both
# and isolate_inactive as scanning. isolate_active is rotation
@@ -385,9 +383,8 @@ EVENT_PROCESS:
if ($isolate_mode != 2) {
$perprocesspid{$process_pid}->{HIGH_NR_SCANNED} += $nr_scanned;
}
- $perprocesspid{$process_pid}->{HIGH_NR_CONTIG_DIRTY} += $nr_contig_dirty;
} elsif ($tracepoint eq "mm_vmscan_lru_shrink_inactive") {
- $details = $5;
+ $details = $6;
if ($details !~ /$regex_lru_shrink_inactive/o) {
print "WARNING: Failed to parse mm_vmscan_lru_shrink_inactive as expected\n";
print " $details\n";
@@ -397,7 +394,7 @@ EVENT_PROCESS:
my $nr_reclaimed = $4;
$perprocesspid{$process_pid}->{HIGH_NR_RECLAIMED} += $nr_reclaimed;
} elsif ($tracepoint eq "mm_vmscan_writepage") {
- $details = $5;
+ $details = $6;
if ($details !~ /$regex_writepage/o) {
print "WARNING: Failed to parse mm_vmscan_writepage as expected\n";
print " $details\n";
@@ -539,13 +536,6 @@ sub dump_stats {
}
}
}
- if ($stats{$process_pid}->{HIGH_NR_CONTIG_DIRTY}) {
- print " ";
- my $count = $stats{$process_pid}->{HIGH_NR_CONTIG_DIRTY};
- if ($count != 0) {
- print "contig-dirty=$count ";
- }
- }
print "\n";
}
diff --git a/Documentation/trace/ring-buffer-design.txt b/Documentation/trace/ring-buffer-design.txt
index 7d350b49658..ff747b6fa39 100644
--- a/Documentation/trace/ring-buffer-design.txt
+++ b/Documentation/trace/ring-buffer-design.txt
@@ -683,7 +683,7 @@ against nested writers.
cmpxchg(tail_page, temp_page, next_page)
The above will update the tail page if it is still pointing to the expected
-page. If this fails, a nested write pushed it forward, the the current write
+page. If this fails, a nested write pushed it forward, the current write
does not need to push it.
diff --git a/Documentation/trace/tracepoints.txt b/Documentation/trace/tracepoints.txt
index ac4170dd0f2..a3efac621c5 100644
--- a/Documentation/trace/tracepoints.txt
+++ b/Documentation/trace/tracepoints.txt
@@ -114,3 +114,32 @@ core kernel image or in modules.
If the tracepoint has to be used in kernel modules, an
EXPORT_TRACEPOINT_SYMBOL_GPL() or EXPORT_TRACEPOINT_SYMBOL() can be
used to export the defined tracepoints.
+
+If you need to do a bit of work for a tracepoint parameter, and
+that work is only used for the tracepoint, that work can be encapsulated
+within an if statement with the following:
+
+ if (trace_foo_bar_enabled()) {
+ int i;
+ int tot = 0;
+
+ for (i = 0; i < count; i++)
+ tot += calculate_nuggets();
+
+ trace_foo_bar(tot);
+ }
+
+All trace_<tracepoint>() calls have a matching trace_<tracepoint>_enabled()
+function defined that returns true if the tracepoint is enabled and
+false otherwise. The trace_<tracepoint>() should always be within the
+block of the if (trace_<tracepoint>_enabled()) to prevent races between
+the tracepoint being enabled and the check being seen.
+
+The advantage of using the trace_<tracepoint>_enabled() is that it uses
+the static_key of the tracepoint to allow the if statement to be implemented
+with jump labels and avoid conditional branches.
+
+Note: The convenience macro TRACE_EVENT provides an alternative way to
+ define tracepoints. Check http://lwn.net/Articles/379903,
+ http://lwn.net/Articles/381064 and http://lwn.net/Articles/383362
+ for a series of articles with more details.
diff --git a/Documentation/trace/uprobetracer.txt b/Documentation/trace/uprobetracer.txt
index d9c3e682312..f1cf9a34ad9 100644
--- a/Documentation/trace/uprobetracer.txt
+++ b/Documentation/trace/uprobetracer.txt
@@ -19,18 +19,44 @@ user to calculate the offset of the probepoint in the object.
Synopsis of uprobe_tracer
-------------------------
- p[:[GRP/]EVENT] PATH:SYMBOL[+offs] [FETCHARGS] : Set a uprobe
- r[:[GRP/]EVENT] PATH:SYMBOL[+offs] [FETCHARGS] : Set a return uprobe (uretprobe)
- -:[GRP/]EVENT : Clear uprobe or uretprobe event
+ p[:[GRP/]EVENT] PATH:OFFSET [FETCHARGS] : Set a uprobe
+ r[:[GRP/]EVENT] PATH:OFFSET [FETCHARGS] : Set a return uprobe (uretprobe)
+ -:[GRP/]EVENT : Clear uprobe or uretprobe event
GRP : Group name. If omitted, "uprobes" is the default value.
EVENT : Event name. If omitted, the event name is generated based
- on SYMBOL+offs.
+ on PATH+OFFSET.
PATH : Path to an executable or a library.
- SYMBOL[+offs] : Symbol+offset where the probe is inserted.
+ OFFSET : Offset where the probe is inserted.
FETCHARGS : Arguments. Each probe can have up to 128 args.
%REG : Fetch register REG
+ @ADDR : Fetch memory at ADDR (ADDR should be in userspace)
+ @+OFFSET : Fetch memory at OFFSET (OFFSET from same file as PATH)
+ $stackN : Fetch Nth entry of stack (N >= 0)
+ $stack : Fetch stack address.
+ $retval : Fetch return value.(*)
+ +|-offs(FETCHARG) : Fetch memory at FETCHARG +|- offs address.(**)
+ NAME=FETCHARG : Set NAME as the argument name of FETCHARG.
+ FETCHARG:TYPE : Set TYPE as the type of FETCHARG. Currently, basic types
+ (u8/u16/u32/u64/s8/s16/s32/s64), "string" and bitfield
+ are supported.
+
+ (*) only for return probe.
+ (**) this is useful for fetching a field of data structures.
+
+Types
+-----
+Several types are supported for fetch-args. Uprobe tracer will access memory
+by given type. Prefix 's' and 'u' means those types are signed and unsigned
+respectively. Traced arguments are shown in decimal (signed) or hex (unsigned).
+String type is a special type, which fetches a "null-terminated" string from
+user space.
+Bitfield is another special type, which takes 3 parameters, bit-width, bit-
+offset, and container-size (usually 32). The syntax is;
+
+ b<bit-width>@<bit-offset>/<container-size>
+
Event Profiling
---------------