aboutsummaryrefslogtreecommitdiff
path: root/Documentation
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation')
-rw-r--r--Documentation/RCU/trace.txt144
-rw-r--r--Documentation/dontdiff26
-rw-r--r--Documentation/kernel-docs.txt27
-rw-r--r--Documentation/kernel-parameters.txt10
-rw-r--r--Documentation/trace/events-power.txt90
5 files changed, 247 insertions, 50 deletions
diff --git a/Documentation/RCU/trace.txt b/Documentation/RCU/trace.txt
index a851118775d..6a8c73f55b8 100644
--- a/Documentation/RCU/trace.txt
+++ b/Documentation/RCU/trace.txt
@@ -1,18 +1,22 @@
CONFIG_RCU_TRACE debugfs Files and Formats
-The rcutree implementation of RCU provides debugfs trace output that
-summarizes counters and state. This information is useful for debugging
-RCU itself, and can sometimes also help to debug abuses of RCU.
-The following sections describe the debugfs files and formats.
+The rcutree and rcutiny implementations of RCU provide debugfs trace
+output that summarizes counters and state. This information is useful for
+debugging RCU itself, and can sometimes also help to debug abuses of RCU.
+The following sections describe the debugfs files and formats, first
+for rcutree and next for rcutiny.
-Hierarchical RCU debugfs Files and Formats
+CONFIG_TREE_RCU and CONFIG_TREE_PREEMPT_RCU debugfs Files and Formats
-This implementation of RCU provides three debugfs files under the
+These implementations of RCU provides five debugfs files under the
top-level directory RCU: rcu/rcudata (which displays fields in struct
-rcu_data), rcu/rcugp (which displays grace-period counters), and
-rcu/rcuhier (which displays the struct rcu_node hierarchy).
+rcu_data), rcu/rcudata.csv (which is a .csv spreadsheet version of
+rcu/rcudata), rcu/rcugp (which displays grace-period counters),
+rcu/rcuhier (which displays the struct rcu_node hierarchy), and
+rcu/rcu_pending (which displays counts of the reasons that the
+rcu_pending() function decided that there was core RCU work to do).
The output of "cat rcu/rcudata" looks as follows:
@@ -130,7 +134,8 @@ o "ci" is the number of RCU callbacks that have been invoked for
been registered in absence of CPU-hotplug activity.
o "co" is the number of RCU callbacks that have been orphaned due to
- this CPU going offline.
+ this CPU going offline. These orphaned callbacks have been moved
+ to an arbitrarily chosen online CPU.
o "ca" is the number of RCU callbacks that have been adopted due to
other CPUs going offline. Note that ci+co-ca+ql is the number of
@@ -168,12 +173,12 @@ o "gpnum" is the number of grace periods that have started. It is
The output of "cat rcu/rcuhier" looks as follows, with very long lines:
-c=6902 g=6903 s=2 jfq=3 j=72c7 nfqs=13142/nfqsng=0(13142) fqlh=6 oqlen=0
+c=6902 g=6903 s=2 jfq=3 j=72c7 nfqs=13142/nfqsng=0(13142) fqlh=6
1/1 .>. 0:127 ^0
3/3 .>. 0:35 ^0 0/0 .>. 36:71 ^1 0/0 .>. 72:107 ^2 0/0 .>. 108:127 ^3
3/3f .>. 0:5 ^0 2/3 .>. 6:11 ^1 0/0 .>. 12:17 ^2 0/0 .>. 18:23 ^3 0/0 .>. 24:29 ^4 0/0 .>. 30:35 ^5 0/0 .>. 36:41 ^0 0/0 .>. 42:47 ^1 0/0 .>. 48:53 ^2 0/0 .>. 54:59 ^3 0/0 .>. 60:65 ^4 0/0 .>. 66:71 ^5 0/0 .>. 72:77 ^0 0/0 .>. 78:83 ^1 0/0 .>. 84:89 ^2 0/0 .>. 90:95 ^3 0/0 .>. 96:101 ^4 0/0 .>. 102:107 ^5 0/0 .>. 108:113 ^0 0/0 .>. 114:119 ^1 0/0 .>. 120:125 ^2 0/0 .>. 126:127 ^3
rcu_bh:
-c=-226 g=-226 s=1 jfq=-5701 j=72c7 nfqs=88/nfqsng=0(88) fqlh=0 oqlen=0
+c=-226 g=-226 s=1 jfq=-5701 j=72c7 nfqs=88/nfqsng=0(88) fqlh=0
0/1 .>. 0:127 ^0
0/3 .>. 0:35 ^0 0/0 .>. 36:71 ^1 0/0 .>. 72:107 ^2 0/0 .>. 108:127 ^3
0/3f .>. 0:5 ^0 0/3 .>. 6:11 ^1 0/0 .>. 12:17 ^2 0/0 .>. 18:23 ^3 0/0 .>. 24:29 ^4 0/0 .>. 30:35 ^5 0/0 .>. 36:41 ^0 0/0 .>. 42:47 ^1 0/0 .>. 48:53 ^2 0/0 .>. 54:59 ^3 0/0 .>. 60:65 ^4 0/0 .>. 66:71 ^5 0/0 .>. 72:77 ^0 0/0 .>. 78:83 ^1 0/0 .>. 84:89 ^2 0/0 .>. 90:95 ^3 0/0 .>. 96:101 ^4 0/0 .>. 102:107 ^5 0/0 .>. 108:113 ^0 0/0 .>. 114:119 ^1 0/0 .>. 120:125 ^2 0/0 .>. 126:127 ^3
@@ -212,11 +217,6 @@ o "fqlh" is the number of calls to force_quiescent_state() that
exited immediately (without even being counted in nfqs above)
due to contention on ->fqslock.
-o "oqlen" is the number of callbacks on the "orphan" callback
- list. RCU callbacks are placed on this list by CPUs going
- offline, and are "adopted" either by the CPU helping the outgoing
- CPU or by the next rcu_barrier*() call, whichever comes first.
-
o Each element of the form "1/1 0:127 ^0" represents one struct
rcu_node. Each line represents one level of the hierarchy, from
root to leaves. It is best to think of the rcu_data structures
@@ -326,3 +326,115 @@ o "nn" is the number of times that this CPU needed nothing. Alert
readers will note that the rcu "nn" number for a given CPU very
closely matches the rcu_bh "np" number for that same CPU. This
is due to short-circuit evaluation in rcu_pending().
+
+
+CONFIG_TINY_RCU and CONFIG_TINY_PREEMPT_RCU debugfs Files and Formats
+
+These implementations of RCU provides a single debugfs file under the
+top-level directory RCU, namely rcu/rcudata, which displays fields in
+rcu_bh_ctrlblk, rcu_sched_ctrlblk and, for CONFIG_TINY_PREEMPT_RCU,
+rcu_preempt_ctrlblk.
+
+The output of "cat rcu/rcudata" is as follows:
+
+rcu_preempt: qlen=24 gp=1097669 g197/p197/c197 tasks=...
+ ttb=. btg=no ntb=184 neb=0 nnb=183 j=01f7 bt=0274
+ normal balk: nt=1097669 gt=0 bt=371 b=0 ny=25073378 nos=0
+ exp balk: bt=0 nos=0
+rcu_sched: qlen: 0
+rcu_bh: qlen: 0
+
+This is split into rcu_preempt, rcu_sched, and rcu_bh sections, with the
+rcu_preempt section appearing only in CONFIG_TINY_PREEMPT_RCU builds.
+The last three lines of the rcu_preempt section appear only in
+CONFIG_RCU_BOOST kernel builds. The fields are as follows:
+
+o "qlen" is the number of RCU callbacks currently waiting either
+ for an RCU grace period or waiting to be invoked. This is the
+ only field present for rcu_sched and rcu_bh, due to the
+ short-circuiting of grace period in those two cases.
+
+o "gp" is the number of grace periods that have completed.
+
+o "g197/p197/c197" displays the grace-period state, with the
+ "g" number being the number of grace periods that have started
+ (mod 256), the "p" number being the number of grace periods
+ that the CPU has responded to (also mod 256), and the "c"
+ number being the number of grace periods that have completed
+ (once again mode 256).
+
+ Why have both "gp" and "g"? Because the data flowing into
+ "gp" is only present in a CONFIG_RCU_TRACE kernel.
+
+o "tasks" is a set of bits. The first bit is "T" if there are
+ currently tasks that have recently blocked within an RCU
+ read-side critical section, the second bit is "N" if any of the
+ aforementioned tasks are blocking the current RCU grace period,
+ and the third bit is "E" if any of the aforementioned tasks are
+ blocking the current expedited grace period. Each bit is "."
+ if the corresponding condition does not hold.
+
+o "ttb" is a single bit. It is "B" if any of the blocked tasks
+ need to be priority boosted and "." otherwise.
+
+o "btg" indicates whether boosting has been carried out during
+ the current grace period, with "exp" indicating that boosting
+ is in progress for an expedited grace period, "no" indicating
+ that boosting has not yet started for a normal grace period,
+ "begun" indicating that boosting has bebug for a normal grace
+ period, and "done" indicating that boosting has completed for
+ a normal grace period.
+
+o "ntb" is the total number of tasks subjected to RCU priority boosting
+ periods since boot.
+
+o "neb" is the number of expedited grace periods that have had
+ to resort to RCU priority boosting since boot.
+
+o "nnb" is the number of normal grace periods that have had
+ to resort to RCU priority boosting since boot.
+
+o "j" is the low-order 12 bits of the jiffies counter in hexadecimal.
+
+o "bt" is the low-order 12 bits of the value that the jiffies counter
+ will have at the next time that boosting is scheduled to begin.
+
+o In the line beginning with "normal balk", the fields are as follows:
+
+ o "nt" is the number of times that the system balked from
+ boosting because there were no blocked tasks to boost.
+ Note that the system will balk from boosting even if the
+ grace period is overdue when the currently running task
+ is looping within an RCU read-side critical section.
+ There is no point in boosting in this case, because
+ boosting a running task won't make it run any faster.
+
+ o "gt" is the number of times that the system balked
+ from boosting because, although there were blocked tasks,
+ none of them were preventing the current grace period
+ from completing.
+
+ o "bt" is the number of times that the system balked
+ from boosting because boosting was already in progress.
+
+ o "b" is the number of times that the system balked from
+ boosting because boosting had already completed for
+ the grace period in question.
+
+ o "ny" is the number of times that the system balked from
+ boosting because it was not yet time to start boosting
+ the grace period in question.
+
+ o "nos" is the number of times that the system balked from
+ boosting for inexplicable ("not otherwise specified")
+ reasons. This can actually happen due to races involving
+ increments of the jiffies counter.
+
+o In the line beginning with "exp balk", the fields are as follows:
+
+ o "bt" is the number of times that the system balked from
+ boosting because there were no blocked tasks to boost.
+
+ o "nos" is the number of times that the system balked from
+ boosting for inexplicable ("not otherwise specified")
+ reasons.
diff --git a/Documentation/dontdiff b/Documentation/dontdiff
index d9bcffd5943..470d3dba1a6 100644
--- a/Documentation/dontdiff
+++ b/Documentation/dontdiff
@@ -62,6 +62,10 @@ aic7*reg_print.c*
aic7*seq.h*
aicasm
aicdb.h*
+altivec1.c
+altivec2.c
+altivec4.c
+altivec8.c
asm-offsets.h
asm_offsets.h
autoconf.h*
@@ -76,6 +80,7 @@ btfixupprep
build
bvmlinux
bzImage*
+capflags.c
classlist.h*
comp*.log
compile.h*
@@ -94,6 +99,7 @@ devlist.h*
docproc
elf2ecoff
elfconfig.h*
+evergreen_reg_safe.h
fixdep
flask.h
fore200e_mkfirm
@@ -108,9 +114,16 @@ genksyms
*_gray256.c
ihex2fw
ikconfig.h*
+inat-tables.c
initramfs_data.cpio
initramfs_data.cpio.gz
initramfs_list
+int16.c
+int1.c
+int2.c
+int32.c
+int4.c
+int8.c
kallsyms
kconfig
keywords.c
@@ -140,6 +153,7 @@ mkprep
mktables
mktree
modpost
+modules.builtin
modules.order
modversions.h*
ncscope.*
@@ -153,14 +167,23 @@ pca200e.bin
pca200e_ecd.bin2
piggy.gz
piggyback
+piggy.S
pnmtologo
ppc_defs.h*
pss_boot.h
qconf
+r100_reg_safe.h
+r200_reg_safe.h
+r300_reg_safe.h
+r420_reg_safe.h
+r600_reg_safe.h
raid6altivec*.c
raid6int*.c
raid6tables.c
relocs
+rn50_reg_safe.h
+rs600_reg_safe.h
+rv515_reg_safe.h
series
setup
setup.bin
@@ -169,6 +192,7 @@ sImage
sm_tbl*
split-include
syscalltab.h
+tables.c
tags
tftpboot.img
timeconst.h
@@ -190,6 +214,7 @@ vmlinux
vmlinux-*
vmlinux.aout
vmlinux.lds
+voffset.h
vsyscall.lds
vsyscall_32.lds
wanxlfw.inc
@@ -200,3 +225,4 @@ wakeup.elf
wakeup.lds
zImage*
zconf.hash.c
+zoffset.h
diff --git a/Documentation/kernel-docs.txt b/Documentation/kernel-docs.txt
index 715eaaf1519..9a8674629a0 100644
--- a/Documentation/kernel-docs.txt
+++ b/Documentation/kernel-docs.txt
@@ -537,7 +537,7 @@
Notes: Further information in
http://www.oreilly.com/catalog/linuxdrive2/
- * Title: "Linux Device Drivers, 3nd Edition"
+ * Title: "Linux Device Drivers, 3rd Edition"
Authors: Jonathan Corbet, Alessandro Rubini, and Greg Kroah-Hartman
Publisher: O'Reilly & Associates.
Date: 2005.
@@ -592,14 +592,6 @@
Pages: 600.
ISBN: 0-13-101908-2
- * Title: "The Design and Implementation of the 4.4 BSD UNIX
- Operating System"
- Author: Marshall Kirk McKusick, Keith Bostic, Michael J. Karels,
- John S. Quarterman.
- Publisher: Addison-Wesley.
- Date: 1996.
- ISBN: 0-201-54979-4
-
* Title: "Programming for the real world - POSIX.4"
Author: Bill O. Gallmeister.
Publisher: O'Reilly & Associates, Inc..
@@ -610,28 +602,13 @@
POSIX. Good reference.
* Title: "UNIX Systems for Modern Architectures: Symmetric
- Multiprocesssing and Caching for Kernel Programmers"
+ Multiprocessing and Caching for Kernel Programmers"
Author: Curt Schimmel.
Publisher: Addison Wesley.
Date: June, 1994.
Pages: 432.
ISBN: 0-201-63338-8
- * Title: "The Design and Implementation of the 4.3 BSD UNIX
- Operating System"
- Author: Samuel J. Leffler, Marshall Kirk McKusick, Michael J.
- Karels, John S. Quarterman.
- Publisher: Addison-Wesley.
- Date: 1989 (reprinted with corrections on October, 1990).
- ISBN: 0-201-06196-1
-
- * Title: "The Design of the UNIX Operating System"
- Author: Maurice J. Bach.
- Publisher: Prentice Hall.
- Date: 1986.
- Pages: 471.
- ISBN: 0-13-201757-1
-
MISCELLANEOUS:
* Name: linux/Documentation
diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 1031923f525..d6496fde618 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -1579,20 +1579,12 @@ and is between 256 and 4096 characters. It is defined in the file
nmi_watchdog= [KNL,BUGS=X86] Debugging features for SMP kernels
Format: [panic,][num]
- Valid num: 0,1,2
+ Valid num: 0
0 - turn nmi_watchdog off
- 1 - use the IO-APIC timer for the NMI watchdog
- 2 - use the local APIC for the NMI watchdog using
- a performance counter. Note: This will use one
- performance counter and the local APIC's performance
- vector.
When panic is specified, panic when an NMI watchdog
timeout occurs.
This is useful when you use a panic=... timeout and
need the box quickly up again.
- Instead of 1 and 2 it is possible to use the following
- symbolic names: lapic and ioapic
- Example: nmi_watchdog=2 or nmi_watchdog=panic,lapic
netpoll.carrier_timeout=
[NET] Specifies amount of time (in seconds) that
diff --git a/Documentation/trace/events-power.txt b/Documentation/trace/events-power.txt
new file mode 100644
index 00000000000..96d87b67fe3
--- /dev/null
+++ b/Documentation/trace/events-power.txt
@@ -0,0 +1,90 @@
+
+ Subsystem Trace Points: power
+
+The power tracing system captures events related to power transitions
+within the kernel. Broadly speaking there are three major subheadings:
+
+ o Power state switch which reports events related to suspend (S-states),
+ cpuidle (C-states) and cpufreq (P-states)
+ o System clock related changes
+ o Power domains related changes and transitions
+
+This document describes what each of the tracepoints is and why they
+might be useful.
+
+Cf. include/trace/events/power.h for the events definitions.
+
+1. Power state switch events
+============================
+
+1.1 New trace API
+-----------------
+
+A 'cpu' event class gathers the CPU-related events: cpuidle and
+cpufreq.
+
+cpu_idle "state=%lu cpu_id=%lu"
+cpu_frequency "state=%lu cpu_id=%lu"
+
+A suspend event is used to indicate the system going in and out of the
+suspend mode:
+
+machine_suspend "state=%lu"
+
+
+Note: the value of '-1' or '4294967295' for state means an exit from the current state,
+i.e. trace_cpu_idle(4, smp_processor_id()) means that the system
+enters the idle state 4, while trace_cpu_idle(PWR_EVENT_EXIT, smp_processor_id())
+means that the system exits the previous idle state.
+
+The event which has 'state=4294967295' in the trace is very important to the user
+space tools which are using it to detect the end of the current state, and so to
+correctly draw the states diagrams and to calculate accurate statistics etc.
+
+1.2 DEPRECATED trace API
+------------------------
+
+A new Kconfig option CONFIG_EVENT_POWER_TRACING_DEPRECATED with the default value of
+'y' has been created. This allows the legacy trace power API to be used conjointly
+with the new trace API.
+The Kconfig option, the old trace API (in include/trace/events/power.h) and the
+old trace points will disappear in a future release (namely 2.6.41).
+
+power_start "type=%lu state=%lu cpu_id=%lu"
+power_frequency "type=%lu state=%lu cpu_id=%lu"
+power_end "cpu_id=%lu"
+
+The 'type' parameter takes one of those macros:
+ . POWER_NONE = 0,
+ . POWER_CSTATE = 1, /* C-State */
+ . POWER_PSTATE = 2, /* Fequency change or DVFS */
+
+The 'state' parameter is set depending on the type:
+ . Target C-state for type=POWER_CSTATE,
+ . Target frequency for type=POWER_PSTATE,
+
+power_end is used to indicate the exit of a state, corresponding to the latest
+power_start event.
+
+2. Clocks events
+================
+The clock events are used for clock enable/disable and for
+clock rate change.
+
+clock_enable "%s state=%lu cpu_id=%lu"
+clock_disable "%s state=%lu cpu_id=%lu"
+clock_set_rate "%s state=%lu cpu_id=%lu"
+
+The first parameter gives the clock name (e.g. "gpio1_iclk").
+The second parameter is '1' for enable, '0' for disable, the target
+clock rate for set_rate.
+
+3. Power domains events
+=======================
+The power domain events are used for power domains transitions
+
+power_domain_target "%s state=%lu cpu_id=%lu"
+
+The first parameter gives the power domain name (e.g. "mpu_pwrdm").
+The second parameter is the power domain target state.
+