author Benjamin Herrenschmidt <benh@kernel.crashing.org> 2008-07-15 15:44:51 +1000
committer Benjamin Herrenschmidt <benh@kernel.crashing.org> 2008-07-15 15:44:51 +1000
commit 43d2548bb2ef7e6d753f91468a746784041e522d (patch)
tree 77d13fcd48fd998393abb825ec36e2b732684a73 /Documentation
parent 585583d95c5660973bc0cf64add517b040acd8a4 (diff)
parent 85082fd7cbe3173198aac0eb5e85ab1edcc6352c (diff)
Merge commit '85082fd7cbe3173198aac0eb5e85ab1edcc6352c' into test-build
Manual fixup of: arch/powerpc/Kconfig
Diffstat (limited to 'Documentation')
-rw-r--r-- Documentation/ABI/testing/sysfs-block 34
-rw-r--r-- Documentation/ABI/testing/sysfs-bus-css 35
-rw-r--r-- Documentation/ABI/testing/sysfs-firmware-memmap 71
-rw-r--r-- Documentation/block/data-integrity.txt 327
-rw-r--r-- Documentation/ftrace.txt 134
-rw-r--r-- Documentation/ioctl-number.txt 1
-rw-r--r-- Documentation/kdump/kdump.txt 2
-rw-r--r-- Documentation/kernel-parameters.txt 42
-rw-r--r-- Documentation/nmi_watchdog.txt 16
-rw-r--r-- Documentation/scheduler/sched-domains.txt 7
-rw-r--r-- Documentation/scheduler/sched-rt-group.txt 4
-rw-r--r-- Documentation/sound/alsa/ALSA-Configuration.txt 17
-rw-r--r-- Documentation/sound/alsa/DocBook/writing-an-alsa-driver.tmpl 4
-rw-r--r-- Documentation/tracers/mmiotrace.txt 164
-rw-r--r-- Documentation/x86/i386/IO-APIC.txt (renamed from Documentation/i386/IO-APIC.txt) 0
-rw-r--r-- Documentation/x86/i386/boot.txt (renamed from Documentation/i386/boot.txt) 79
-rw-r--r-- Documentation/x86/i386/usb-legacy-support.txt (renamed from Documentation/i386/usb-legacy-support.txt) 0
-rw-r--r-- Documentation/x86/i386/zero-page.txt (renamed from Documentation/i386/zero-page.txt) 0
-rw-r--r-- Documentation/x86/x86_64/00-INDEX (renamed from Documentation/x86_64/00-INDEX) 0
-rw-r--r-- Documentation/x86/x86_64/boot-options.txt (renamed from Documentation/x86_64/boot-options.txt) 0
-rw-r--r-- Documentation/x86/x86_64/cpu-hotplug-spec (renamed from Documentation/x86_64/cpu-hotplug-spec) 0
-rw-r--r-- Documentation/x86/x86_64/fake-numa-for-cpusets (renamed from Documentation/x86_64/fake-numa-for-cpusets) 0
-rw-r--r-- Documentation/x86/x86_64/kernel-stacks (renamed from Documentation/x86_64/kernel-stacks) 0
-rw-r--r-- Documentation/x86/x86_64/machinecheck (renamed from Documentation/x86_64/machinecheck) 0
-rw-r--r-- Documentation/x86/x86_64/mm.txt (renamed from Documentation/x86_64/mm.txt) 5
-rw-r--r-- Documentation/x86/x86_64/uefi.txt (renamed from Documentation/x86_64/uefi.txt) 4
26 files changed, 825 insertions, 121 deletions
diff --git a/Documentation/ABI/testing/sysfs-block b/Documentation/ABI/testing/sysfs-block
index 4bd9ea53912..44f52a4f590 100644
--- a/Documentation/ABI/testing/sysfs-block
+++ b/Documentation/ABI/testing/sysfs-block
@@ -26,3 +26,37 @@ Description:
I/O statistics of partition <part>. The format is the
same as the above-written /sys/block/<disk>/stat
format.
+
+
+What: /sys/block/<disk>/integrity/format
+Date: June 2008
+Contact: Martin K. Petersen <martin.petersen@oracle.com>
+Description:
+ Metadata format for integrity capable block device.
+ E.g. T10-DIF-TYPE1-CRC.
+
+
+What: /sys/block/<disk>/integrity/read_verify
+Date: June 2008
+Contact: Martin K. Petersen <martin.petersen@oracle.com>
+Description:
+ Indicates whether the block layer should verify the
+ integrity of read requests serviced by devices that
+ support sending integrity metadata.
+
+
+What: /sys/block/<disk>/integrity/tag_size
+Date: June 2008
+Contact: Martin K. Petersen <martin.petersen@oracle.com>
+Description:
+ Number of bytes of integrity tag space available per
+ 512 bytes of data.
+
+
+What: /sys/block/<disk>/integrity/write_generate
+Date: June 2008
+Contact: Martin K. Petersen <martin.petersen@oracle.com>
+Description:
+ Indicates whether the block layer should automatically
+ generate checksums for write requests bound for
+ devices that support receiving integrity metadata.
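The four integrity attributes added to sysfs-block above can be read from user space like any other sysfs file. The following is a minimal sketch in C, not part of the patch: the helper names integrity_attr_path() and read_attr() are hypothetical, and the disk name passed in (for example "sda") is an assumption about the local system.

```c
#include <stdio.h>
#include <string.h>

/*
 * Build "/sys/block/<disk>/integrity/<attr>" into 'out'.
 * Hypothetical helper; returns 0 on success, -1 if 'out' is too small.
 */
static int integrity_attr_path(char *out, size_t out_len,
                               const char *disk, const char *attr)
{
    int n = snprintf(out, out_len, "/sys/block/%s/integrity/%s", disk, attr);

    return (n >= 0 && (size_t)n < out_len) ? 0 : -1;
}

/*
 * Read the first line of a sysfs attribute into 'buf' and strip the
 * trailing newline. Returns 0 on success, -1 if the file cannot be read
 * (for instance when the device has no integrity profile).
 */
static int read_attr(const char *path, char *buf, size_t len)
{
    FILE *f = fopen(path, "r");

    if (!f)
        return -1;
    if (!fgets(buf, (int)len, f)) {
        fclose(f);
        return -1;
    }
    fclose(f);
    buf[strcspn(buf, "\n")] = '\0';
    return 0;
}
```

A caller would build the path for "format", "read_verify", "tag_size" or "write_generate" and pass it to read_attr(); a return value of -1 is the expected outcome on devices that do not expose the integrity directory.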
diff --git a/Documentation/ABI/testing/sysfs-bus-css b/Documentation/ABI/testing/sysfs-bus-css
new file mode 100644
index 00000000000..b585ec258a0
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-bus-css
@@ -0,0 +1,35 @@
+What: /sys/bus/css/devices/.../type
+Date: March 2008
+Contact: Cornelia Huck <cornelia.huck@de.ibm.com>
+ linux-s390@vger.kernel.org
+Description: Contains the subchannel type, as reported by the hardware.
+ This attribute is present for all subchannel types.
+
+What: /sys/bus/css/devices/.../modalias
+Date: March 2008
+Contact: Cornelia Huck <cornelia.huck@de.ibm.com>
+ linux-s390@vger.kernel.org
+Description: Contains the module alias as reported with uevents.
+ It is of the format css:t<type> and present for all
+ subchannel types.
+
+What: /sys/bus/css/drivers/io_subchannel/.../chpids
+Date: December 2002
+Contact: Cornelia Huck <cornelia.huck@de.ibm.com>
+ linux-s390@vger.kernel.org
+Description: Contains the ids of the channel paths used by this
+ subchannel, as reported by the channel subsystem
+ during subchannel recognition.
+ Note: This is an I/O-subchannel specific attribute.
+Users: s390-tools, HAL
+
+What: /sys/bus/css/drivers/io_subchannel/.../pimpampom
+Date: December 2002
+Contact: Cornelia Huck <cornelia.huck@de.ibm.com>
+ linux-s390@vger.kernel.org
+Description: Contains the PIM/PAM/POM values, as reported by the
+ channel subsystem when last queried by the common I/O
+ layer (this implies that this attribute is not necessarily
+ in sync with the values current in the channel subsystem).
+ Note: This is an I/O-subchannel specific attribute.
+Users: s390-tools, HAL
diff --git a/Documentation/ABI/testing/sysfs-firmware-memmap b/Documentation/ABI/testing/sysfs-firmware-memmap
new file mode 100644
index 00000000000..0d99ee6ae02
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-firmware-memmap
@@ -0,0 +1,71 @@
+What: /sys/firmware/memmap/
+Date: June 2008
+Contact: Bernhard Walle <bwalle@suse.de>
+Description:
+ On all platforms, the firmware provides a memory map which the
+ kernel reads. The resources from that memory map are registered
+ in the kernel resource tree and exposed to userspace via
+ /proc/iomem (together with other resources).
+
+ However, on most architectures that firmware-provided memory
+ map is modified afterwards by the kernel itself, either because
+ the kernel merges that memory map with other information or
+ just because the user overwrites that memory map via command
+ line.
+
+ kexec needs the raw firmware-provided memory map to set up the
+ parameter segment of the kernel that should be booted with
+ kexec. Also, the raw memory map is useful for debugging. For
+ that reason, /sys/firmware/memmap is an interface that provides
+ the raw memory map to userspace.
+
+ The structure is as follows: Under /sys/firmware/memmap there
+ are subdirectories with the number of the entry as their name:
+
+ /sys/firmware/memmap/0
+ /sys/firmware/memmap/1
+ /sys/firmware/memmap/2
+ /sys/firmware/memmap/3
+ ...
+
+ The maximum depends on the number of memory map entries provided
+ by the firmware. The order is just the order that the firmware
+ provides.
+
+ Each directory contains three files:
+
+ start : The start address (as hexadecimal number with the
+ '0x' prefix).
+ end : The end address, inclusive (regardless whether the
+ firmware provides inclusive or exclusive ranges).
+ type : Type of the entry as string. See below for a list of
+ valid types.
+
+ So, for example:
+
+ /sys/firmware/memmap/0/start
+ /sys/firmware/memmap/0/end
+ /sys/firmware/memmap/0/type
+ /sys/firmware/memmap/1/start
+ ...
+
+ Currently the following types exist:
+
+ - System RAM
+ - ACPI Tables
+ - ACPI Non-volatile Storage
+ - reserved
+
+ The following shell snippet can be used to display that memory
+ map in a human-readable format:
+
+ -------------------- 8< ----------------------------------------
+ #!/bin/bash
+ cd /sys/firmware/memmap
+ for dir in * ; do
+ start=$(cat "$dir/start")
+ end=$(cat "$dir/end")
+ type=$(cat "$dir/type")
+ printf "%016x-%016x (%s)\n" "$start" "$(( end + 1 ))" "$type"
+ done
+ -------------------- >8 ----------------------------------------
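The shell snippet in the sysfs-firmware-memmap hunk above prints each entry with an exclusive end address, that is, the inclusive 'end' value plus one. As a rough C equivalent of that one formatting step, the sketch below takes the strings read from the 'start', 'end' and 'type' files. The function name memmap_format_entry() is hypothetical and does not come from the patch.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Format one firmware memory map entry as "start-end (type)".
 * start_str and end_str are the hexadecimal strings from the 'start' and
 * 'end' files (with the '0x' prefix); 'end' is inclusive in sysfs, so one
 * is added to print an exclusive end, as the shell snippet does.
 * A trailing newline in any of the inputs is ignored.
 * Returns the value returned by snprintf().
 */
static int memmap_format_entry(char *out, size_t out_len,
                               const char *start_str, const char *end_str,
                               const char *type)
{
    uint64_t start, end;
    int type_len;

    assert(out != NULL && out_len > 0);

    /* Base 16 accepts the optional "0x" prefix and stops at the newline. */
    start = strtoull(start_str, NULL, 16);
    end = strtoull(end_str, NULL, 16);
    type_len = (int)strcspn(type, "\n");

    return snprintf(out, out_len, "%016" PRIx64 "-%016" PRIx64 " (%.*s)",
                    start, end + 1, type_len, type);
}
```

Parsing with strtoull() keeps the full 64-bit address range on 32-bit builds, which is the main reason to prefer it over plain integer conversions here.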
diff --git a/Documentation/block/data-integrity.txt b/Documentation/block/data-integrity.txt
new file mode 100644
index 00000000000..e9dc8d86adc
--- /dev/null
+++ b/Documentation/block/data-integrity.txt
@@ -0,0 +1,327 @@
+----------------------------------------------------------------------
+1. INTRODUCTION
+
+Modern filesystems feature checksumming of data and metadata to
+protect against data corruption. However, the detection of the
+corruption is done at read time which could potentially be months
+after the data was written. At that point the original data that the
+application tried to write is most likely lost.
+
+The solution is to ensure that the disk is actually storing what the
+application meant it to. Recent additions to both the SCSI family
+protocols (SBC Data Integrity Field, SCC protection proposal) as well
+as SATA/T13 (External Path Protection) try to remedy this by adding
+support for appending integrity metadata to an I/O. The integrity
+metadata (or protection information in SCSI terminology) includes a
+checksum for each sector as well as an incrementing counter that
+ensures the individual sectors are written in the right order. And
+for some protection schemes also that the I/O is written to the right
+place on disk.
+
+Current storage controllers and devices implement various protective
+measures, for instance checksumming and scrubbing. But these
+technologies are working in their own isolated domains or at best
+between adjacent nodes in the I/O path. The interesting thing about
+DIF and the other integrity extensions is that the protection format
+is well defined and every node in the I/O path can verify the
+integrity of the I/O and reject it if corruption is detected. This
+allows not only corruption prevention but also isolation of the point
+of failure.
+
+----------------------------------------------------------------------
+2. THE DATA INTEGRITY EXTENSIONS
+
+As written, the protocol extensions only protect the path between
+controller and storage device. However, many controllers actually
+allow the operating system to interact with the integrity metadata
+(IMD). We have been working with several FC/SAS HBA vendors to enable
+the protection information to be transferred to and from their
+controllers.
+
+The SCSI Data Integrity Field works by appending 8 bytes of protection
+information to each sector. The data + integrity metadata is stored
+in 520 byte sectors on disk. Data + IMD are interleaved when
+transferred between the controller and target. The T13 proposal is
+similar.
+
+Because it is highly inconvenient for operating systems to deal with
+520 (and 4104) byte sectors, we approached several HBA vendors and
+encouraged them to allow separation of the data and integrity metadata
+scatter-gather lists.
+
+The controller will interleave the buffers on write and split them on
+read. This means that Linux can DMA the data buffers to and from
+host memory without changes to the page cache.
+
+Also, the 16-bit CRC checksum mandated by both the SCSI and SATA specs
+is somewhat heavy to compute in software. Benchmarks found that
+calculating this checksum had a significant impact on system
+performance for a number of workloads. Some controllers allow a
+lighter-weight checksum to be used when interfacing with the operating
+system. Emulex, for instance, supports the TCP/IP checksum instead.
+The IP checksum received from the OS is converted to the 16-bit CRC
+when writing and vice versa. This allows the integrity metadata to be
+generated by Linux or the application at very low cost (comparable to
+software RAID5).
+
+The IP checksum is weaker than the CRC in terms of detecting bit
+errors. However, the strength is really in the separation of the data
+buffers and the integrity metadata. These two distinct buffers must
+match up for an I/O to complete.
+
+The separation of the data and integrity metadata buffers as well as
+the choice in checksums is referred to as the Data Integrity
+Extensions. As these extensions are outside the scope of the protocol
+bodies (T10, T13), Oracle and its partners are trying to standardize
+them within the Storage Networking Industry Association.
+
+----------------------------------------------------------------------
+3. KERNEL CHANGES
+
+The data integrity framework in Linux enables protection information
+to be pinned to I/Os and sent to/received from controllers that
+support it.
+
+The advantage to the integrity extensions in SCSI and SATA is that
+they enable us to protect the entire path from application to storage
+device. However, at the same time this is also the biggest
+disadvantage. It means that the protection information must be in a
+format that can be understood by the disk.
+
+Generally Linux/POSIX applications are agnostic to the intricacies of
+the storage devices they are accessing. The virtual filesystem switch
+and the block layer make things like hardware sector size and
+transport protocols completely transparent to the application.
+
+However, this level of detail is required when preparing the
+protection information to send to a disk. Consequently, the very
+concept of an end-to-end protection scheme is a layering violation.
+It is completely unreasonable for an application to be aware whether
+it is accessing a SCSI or SATA disk.
+
+The data integrity support implemented in Linux attempts to hide this
+from the application. As far as the application (and to some extent
+the kernel) is concerned, the integrity metadata is opaque information
+that's attached to the I/O.
+
+The current implementation allows the block layer to automatically
+generate the protection information for any I/O. Eventually the
+intent is to move the integrity metadata calculation to userspace for
+user data. Metadata and other I/O that originates within the kernel
+will still use the automatic generation interface.
+
+Some storage devices allow each hardware sector to be tagged with a
+16-bit value. The owner of this tag space is the owner of the block
+device. I.e. the filesystem in most cases. The filesystem can use
+this extra space to tag sectors as it sees fit. Because the tag
+space is limited, the block interface allows tagging bigger chunks by
+way of interleaving. This way, 8*16 bits of information can be
+attached to a typical 4KB filesystem block.
+
+This also means that applications such as fsck and mkfs will need
+access to manipulate the tags from user space. A passthrough
+interface for this is being worked on.
+
+
+----------------------------------------------------------------------
+4. BLOCK LAYER IMPLEMENTATION DETAILS
+
+4.1 BIO
+
+The data integrity patches add a new field to struct bio when
+CONFIG_BLK_DEV_INTEGRITY is enabled. bio->bi_integrity is a pointer
+to a struct bip which contains the bio integrity payload. Essentially
+a bip is a trimmed down struct bio which holds a bio_vec containing
+the integrity metadata and the required housekeeping information (bvec
+pool, vector count, etc.)
+
+A kernel subsystem can enable data integrity protection on a bio by
+calling bio_integrity_alloc(bio). This will allocate and attach the
+bip to the bio.
+
+Individual pages containing integrity metadata can subsequently be
+attached using bio_integrity_add_page().
+
+bio_free() will automatically free the bip.
+
+
+4.2 BLOCK DEVICE
+
+Because the format of the protection data is tied to the physical
+disk, each block device has been extended with a block integrity
+profile (struct blk_integrity). This optional profile is registered
+with the block layer using blk_integrity_register().
+
+The profile contains callback functions for generating and verifying
+the protection data, as well as getting and setting application tags.
+The profile also contains a few constants to aid in completing,
+merging and splitting the integrity metadata.
+
+Layered block devices will need to pick a profile that's appropriate
+for all subdevices. blk_integrity_compare() can help with that. DM
+and MD linear, RAID0 and RAID1 are currently supported. RAID4/5/6
+will require extra work due to the application tag.
+
+
+----------------------------------------------------------------------
+5.0 BLOCK LAYER INTEGRITY API
+
+5.1 NORMAL FILESYSTEM
+
+ The normal filesystem is unaware that the underlying block device
+ is capable of sending/receiving integrity metadata. The IMD will
+ be automatically generated by the block layer at submit_bio() time
+ in case of a WRITE. A READ request will cause the I/O integrity
+ to be verified upon completion.
+
+ IMD generation and verification can be toggled using the
+
+ /sys/block/<bdev>/integrity/write_generate
+
+ and
+
+ /sys/block/<bdev>/integrity/read_verify
+
+ flags.
+
+
+5.2 INTEGRITY-AWARE FILESYSTEM
+
+ A filesystem that is integrity-aware can prepare I/Os with IMD
+ attached. It can also use the application tag space if this is
+ supported by the block device.
+
+
+ int bdev_integrity_enabled(block_device, int rw);
+
+ bdev_integrity_enabled() will return 1 if the block device
+ supports integrity metadata transfer for the data direction
+ specified in 'rw'.
+
+ bdev_integrity_enabled() honors the write_generate and
+ read_verify flags in sysfs and will respond accordingly.
+
+
+ int bio_integrity_prep(bio);
+
+ To generate IMD for WRITE and to set up buffers for READ, the
+ filesystem must call bio_integrity_prep(bio).
+
+ Prior to calling this function, the bio data direction and start
+ sector must be set, and the bio should have all data pages
+ added. It is up to the caller to ensure that the bio does not
+ change while I/O is in progress.
+
+ bio_integrity_prep() should only be called if
+ bio_integrity_enabled() returned 1.
+
+
+ int bio_integrity_tag_size(bio);
+
+ If the filesystem wants to use the application tag space it will
+ first have to find out how much storage space is available.
+ Because tag space is generally limited (usually 2 bytes per
+ sector regardless of sector size), the integrity framework
+ supports interleaving the information between the sectors in an
+ I/O.
+
+ Filesystems can call bio_integrity_tag_size(bio) to find out how
+ many bytes of storage are available for that particular bio.
+
+ Another option is bdev_get_tag_size(block_device) which will
+ return the number of available bytes per hardware sector.
+
+
+ int bio_integrity_set_tag(bio, void *tag_buf, len);
+
+ After a successful return from bio_integrity_prep(),
+ bio_integrity_set_tag() can be used to attach an opaque tag
+ buffer to a bio. Obviously this only makes sense if the I/O is
+ a WRITE.
+
+
+ int bio_integrity_get_tag(bio, void *tag_buf, len);
+
+ Similarly, at READ I/O completion time the filesystem can
+ retrieve the tag buffer using bio_integrity_get_tag().
+
+
+5.3 PASSING EXISTING INTEGRITY METADATA
+
+ Filesystems that either generate their own integrity metadata or
+ are capable of transferring IMD from user space can use the
+ following calls:
+
+
+ struct bip * bio_integrity_alloc(bio, gfp_mask, nr_pages);
+
+ Allocates the bio integrity payload and hangs it off of the bio.
+ nr_pages indicates how many pages of protection data need to be
+ stored in the integrity bio_vec list (similar to bio_alloc()).
+
+ The integrity payload will be freed at bio_free() time.
+
+
+ int bio_integrity_add_page(bio, page, len, offset);
+
+ Attaches a page containing integrity metadata to an existing
+ bio. The bio must have an existing bip,
+ i.e. bio_integrity_alloc() must have been called. For a WRITE,
+ the integrity metadata in the pages must be in a format
+ understood by the target device with the notable exception that
+ the sector numbers will be remapped as the request traverses the
+ I/O stack. This implies that the pages added using this call
+ will be modified during I/O! The first reference tag in the
+ integrity metadata must have a value of bip->bip_sector.
+
+ Pages can be added using bio_integrity_add_page() as long as
+ there is room in the bip bio_vec array (nr_pages).
+
+ Upon completion of a READ operation, the attached pages will
+ contain the integrity metadata received from the storage device.
+ It is up to the receiver to process them and verify data
+ integrity upon completion.
+
+
+5.4 REGISTERING A BLOCK DEVICE AS CAPABLE OF EXCHANGING INTEGRITY
+ METADATA
+
+ To enable integrity exchange on a block device the gendisk must be
+ registered as capable:
+
+ int blk_integrity_register(gendisk, blk_integrity);
+
+ The blk_integrity struct is a template and should contain the
+ following:
+
+ static struct blk_integrity my_profile = {
+ .name = "STANDARDSBODY-TYPE-VARIANT-CSUM",
+ .generate_fn = my_generate_fn,
+ .verify_fn = my_verify_fn,
+ .get_tag_fn = my_get_tag_fn,
+ .set_tag_fn = my_set_tag_fn,
+ .tuple_size = sizeof(struct my_tuple_size),
+ .tag_size = <tag bytes per hw sector>,
+ };
+
+ 'name' is a text string which will be visible in sysfs. This is
+ part of the userland API so choose it carefully and never change
+ it. The format is standards body-type-variant.
+ E.g. T10-DIF-TYPE1-IP or T13-EPP-0-CRC.
+
+ 'generate_fn' generates appropriate integrity metadata (for WRITE).
+
+ 'verify_fn' verifies that the data buffer matches the integrity
+ metadata.
+
+ 'tuple_size' must be set to match the size of the integrity
+ metadata per sector. I.e. 8 for DIF and EPP.
+
+ 'tag_size' must be set to identify how many bytes of tag space
+ are available per hardware sector. For DIF this is either 2 or
+ 0 depending on the value of the Control Mode Page ATO bit.
+
+ See 5.2 for a description of get_tag_fn and set_tag_fn.
+
+----------------------------------------------------------------------
+2007-12-24 Martin K. Petersen <martin.petersen@oracle.com>
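The sizes quoted in data-integrity.txt above follow from two small calculations: interleaving 8 bytes of protection information with a sector gives the 520 and 4104 byte on-disk sectors, and interleaving a 2-byte tag across the eight 512-byte sectors of a 4KB block gives 8*16 bits of tag space. The sketch below spells out that arithmetic in C. Both function names are hypothetical and are not part of the block layer API.

```c
#include <assert.h>
#include <stddef.h>

/*
 * On-disk sector size when 'tuple_size' bytes of integrity metadata are
 * stored together with each sector of 'data_bytes' bytes.
 */
static size_t on_disk_sector_size(size_t data_bytes, size_t tuple_size)
{
    return data_bytes + tuple_size;
}

/*
 * Tag bytes available to one filesystem block when the per-sector tag
 * space is interleaved across all sectors of the block.
 * 'tag_size' corresponds to the profile's tag bytes per hardware sector.
 */
static size_t tag_bytes_per_block(size_t block_bytes, size_t sector_bytes,
                                  size_t tag_size)
{
    /* The block is assumed to be a whole number of sectors. */
    assert(sector_bytes != 0 && block_bytes % sector_bytes == 0);

    return (block_bytes / sector_bytes) * tag_size;
}
```

With 512-byte sectors and a tag_size of 2, a 4096-byte block gets 16 bytes of tag space. With a tag_size of 0, the value the text gives for DIF when the ATO bit leaves no application tag available, the result is 0.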
diff --git a/Documentation/ftrace.txt b/Documentation/ftrace.txt
index 13e4bf054c3..77d3faa1a61 100644
--- a/Documentation/ftrace.txt
+++ b/Documentation/ftrace.txt
@@ -2,8 +2,11 @@
========================
Copyright 2008 Red Hat Inc.
-Author: Steven Rostedt <srostedt@redhat.com>
+ Author: Steven Rostedt <srostedt@redhat.com>
+ License: The GNU Free Documentation License, Version 1.2
+Reviewers: Elias Oltmanns and Randy Dunlap
+Written for: 2.6.26-rc8 linux-2.6-tip.git tip/tracing/ftrace branch
Introduction
------------
@@ -46,7 +49,7 @@ of ftrace. Here is a list of some of the key files:
that is configured.
available_tracers : This holds the different types of tracers that
- has been compiled into the kernel. The tracers
+ have been compiled into the kernel. The tracers
listed here can be configured by echoing in their
name into current_tracer.
@@ -90,11 +93,13 @@ of ftrace. Here is a list of some of the key files:
trace_entries : This sets or displays the number of trace
entries each CPU buffer can hold. The tracer buffers
are the same size for each CPU, so care must be
- taken when modifying the trace_entries. The number
- of actually entries will be the number given
- times the number of possible CPUS. The buffers
- are saved as individual pages, and the actual entries
- will always be rounded up to entries per page.
+ taken when modifying the trace_entries. The trace
+ buffers are allocated in pages (blocks of memory that
+ the kernel uses for allocation, usually 4 KB in size).
+ Since each entry is smaller than a page, if the last
+ allocated page has room for more entries than were
+ requested, the rest of the page is used to allocate
+ entries.
This can only be updated when the current_tracer
is set to "none".
@@ -114,13 +119,13 @@ of ftrace. Here is a list of some of the key files:
in performance. This also has a side effect of
enabling or disabling specific functions to be
traced. Echoing in names of functions into this
- file will limit the trace to only those files.
+ file will limit the trace to only these functions.
set_ftrace_notrace: This has the opposite effect that
set_ftrace_filter has. Any function that is added
here will not be traced. If a function exists
- in both set_ftrace_filter and set_ftrace_notrace
- the function will _not_ bet traced.
+ in both set_ftrace_filter and set_ftrace_notrace,
+ the function will _not_ be traced.
available_filter_functions : When a function is encountered the first
time by the dynamic tracer, it is recorded and
@@ -138,7 +143,7 @@ Here are the list of current tracers that can be configured.
ftrace - function tracer that uses mcount to trace all functions.
It is possible to filter out which functions that are
- traced when dynamic ftrace is configured in.
+ to be traced when dynamic ftrace is configured in.
sched_switch - traces the context switches between tasks.
@@ -297,13 +302,13 @@ explains which is which.
The above is mostly meaningful for kernel developers.
- time: This differs from the trace output where as the trace output
- contained a absolute timestamp. This timestamp is relative
- to the start of the first entry in the the trace.
+ time: This differs from the trace file output. The trace file output
+ included an absolute timestamp. The timestamp used by the
+ latency_trace file is relative to the start of the trace.
delay: This is just to help catch your eye a bit better. And
needs to be fixed to be only relative to the same CPU.
- The marks is determined by the difference between this
+ The marks are determined by the difference between this
current trace and the next trace.
'!' - greater than preempt_mark_thresh (default 100)
'+' - greater than 1 microsecond
@@ -322,13 +327,13 @@ output. To see what is available, simply cat the file:
print-parent nosym-offset nosym-addr noverbose noraw nohex nobin \
noblock nostacktrace nosched-tree
-To disable one of the options, echo in the option appended with "no".
+To disable one of the options, echo in the option prepended with "no".
echo noprint-parent > /debug/tracing/iter_ctrl
To enable an option, leave off the "no".
- echo sym-offest > /debug/tracing/iter_ctrl
+ echo sym-offset > /debug/tracing/iter_ctrl
Here are the available options:
@@ -344,7 +349,7 @@ Here are the available options:
sym-offset - Display not only the function name, but also the offset
in the function. For example, instead of seeing just
- "ktime_get" you will see "ktime_get+0xb/0x20"
+ "ktime_get", you will see "ktime_get+0xb/0x20".
sym-offset:
bash-4000 [01] 1477.606694: simple_strtoul+0x6/0xa0
@@ -364,7 +369,7 @@ Here are the available options:
user applications that can translate the raw numbers better than
having it done in the kernel.
- hex - similar to raw, but the numbers will be in a hexadecimal format.
+ hex - Similar to raw, but the numbers will be in a hexadecimal format.
bin - This will print out the formats in raw binary.
@@ -381,7 +386,7 @@ sched_switch
------------
This tracer simply records schedule switches. Here's an example
-on how to implement it.
+of how to use it.
# echo sched_switch > /debug/tracing/current_tracer
# echo 1 > /debug/tracing/tracing_enabled
@@ -470,7 +475,7 @@ interrupt from triggering or the mouse interrupt from letting the
kernel know of a new mouse event. The result is a latency with the
reaction time.
-The irqsoff tracer tracks the time interrupts are disabled and when
+The irqsoff tracer tracks the time interrupts are disabled to the time
they are re-enabled. When a new maximum latency is hit, it saves off
the trace so that it may be retrieved at a later time. Every time a
new maximum in reached, the old saved trace is discarded and the new
@@ -519,7 +524,7 @@ The difference between the 6 and the displayed timestamp 7us is
because the clock must have incremented between the time of recording
the max latency and recording the function that had that latency.
-Note the above had ftrace_enabled not set. If we set the ftrace_enabled
+Note the above had ftrace_enabled not set. If we set the ftrace_enabled,
we get a much larger output:
# tracer: irqsoff
@@ -570,21 +575,21 @@ vim:ft=help
Here we traced a 50 microsecond latency. But we also see all the
-functions that were called during that time. Note that enabling
-function tracing we endure an added overhead. This overhead may
-extend the latency times. But never the less, this trace has provided
-some very helpful debugging.
+functions that were called during that time. Note that by enabling
+function tracing, we endure an added overhead. This overhead may
+extend the latency times. But nevertheless, this trace has provided
+some very helpful debugging information.
preemptoff
----------
-When preemption is disabled we may be able to receive interrupts but
-the task can not be preempted and a higher priority task must wait
+When preemption is disabled, we may be able to receive interrupts but
+the task cannot be preempted and a higher priority task must wait
for preemption to be enabled again before it can preempt a lower
priority task.
-The preemptoff tracer traces the places that disables preemption.
+The preemptoff tracer traces the places that disable preemption.
Like the irqsoff, it records the maximum latency that preemption
was disabled. The control of preemptoff is much like the irqsoff.
@@ -696,7 +701,7 @@ Notice that the __do_softirq when called doesn't have a preempt_count.
It may seem that we missed a preempt enabled. What really happened
is that the preempt count is held on the threads stack and we
switched to the softirq stack (4K stacks in effect). The code
-does not copy the preempt count, but because interrupts are disabled
+does not copy the preempt count, but because interrupts are disabled,
we don't need to worry about it. Having a tracer like this is good
to let people know what really happens inside the kernel.
@@ -732,7 +737,7 @@ To record this time, use the preemptirqsoff tracer.
Again, using this trace is much like the irqsoff and preemptoff tracers.
- # echo preemptoff > /debug/tracing/current_tracer
+ # echo preemptirqsoff > /debug/tracing/current_tracer
# echo 0 > /debug/tracing/tracing_max_latency
# echo 1 > /debug/tracing/tracing_enabled
# ls -ltr
@@ -862,9 +867,9 @@ This is a very interesting trace. It started with the preemption of
the ls task. We see that the task had the "need_resched" bit set
with the 'N' in the trace. Interrupts are disabled in the spin_lock
and the trace started. We see that a schedule took place to run
-sshd. When the interrupts were enabled we took an interrupt.
-On return of the interrupt the softirq ran. We took another interrupt
-while running the softirq as we see with the capital 'H'.
+sshd. When the interrupts were enabled, we took an interrupt.
+On return from the interrupt handler, the softirq ran. We took another
+interrupt while running the softirq as we see with the capital 'H'.
wakeup
@@ -876,9 +881,9 @@ time it executes. This is also known as "schedule latency".
I stress the point that this is about RT tasks. It is also important
to know the scheduling latency of non-RT tasks, but the average
schedule latency is better for non-RT tasks. Tools like
-LatencyTop is more appropriate for such measurements.
+LatencyTop are more appropriate for such measurements.
-Real-Time environments is interested in the worst case latency.
+Real-Time environments are interested in the worst case latency.
That is the longest latency it takes for something to happen, and
not the average. We can have a very fast scheduler that may only
have a large latency once in a while, but that would not work well
@@ -889,8 +894,8 @@ tasks that are unpredictable will overwrite the worst case latency
of RT tasks.
Since this tracer only deals with RT tasks, we will run this slightly
-different than we did with the previous tracers. Instead of performing
-an 'ls' we will run 'sleep 1' under 'chrt' which changes the