author | James Bottomley <jejb@mulgrave.il.steeleye.com> | 2006-11-22 12:06:44 -0600 |
---|---|---|
committer | James Bottomley <jejb@mulgrave.il.steeleye.com> | 2006-11-22 12:06:44 -0600 |
commit | 0bd2af46839ad6262d25714a6ec0365db9d6b98f | |
tree | dcced72d230d69fd0c5816ac6dd03ab84799a93e /Documentation | |
parent | e138a5d2356729b8752e88520cc1525fae9794ac | |
parent | f26b90440cd74c78fe10c9bd5160809704a9627c |
Merge ../scsi-rc-fixes-2.6
Diffstat (limited to 'Documentation')
34 files changed, 796 insertions, 412 deletions
diff --git a/Documentation/ABI/testing/sysfs-power b/Documentation/ABI/testing/sysfs-power index d882f809387..dcff4d0623a 100644 --- a/Documentation/ABI/testing/sysfs-power +++ b/Documentation/ABI/testing/sysfs-power @@ -21,7 +21,7 @@ Description: these states. What: /sys/power/disk -Date: August 2006 +Date: September 2006 Contact: Rafael J. Wysocki <rjw@sisk.pl> Description: The /sys/power/disk file controls the operating mode of the @@ -39,6 +39,19 @@ Description: 'reboot' - the memory image will be saved by the kernel and the system will be rebooted. + Additionally, /sys/power/disk can be used to turn on one of the + two testing modes of the suspend-to-disk mechanism: 'testproc' + or 'test'. If the suspend-to-disk mechanism is in the + 'testproc' mode, writing 'disk' to /sys/power/state will cause + the kernel to disable nonboot CPUs and freeze tasks, wait for 5 + seconds, unfreeze tasks and enable nonboot CPUs. If it is in + the 'test' mode, writing 'disk' to /sys/power/state will cause + the kernel to disable nonboot CPUs and freeze tasks, shrink + memory, suspend devices, wait for 5 seconds, resume devices, + unfreeze tasks and enable nonboot CPUs. Then, we are able to + look in the log messages and work out, for example, which code + is being slow and which device drivers are misbehaving. + The suspend-to-disk method may be chosen by writing to this file one of the accepted strings: @@ -46,6 +59,8 @@ Description: 'platform' 'shutdown' 'reboot' + 'testproc' + 'test' It will only change to 'firmware' or 'platform' if the system supports that. 
diff --git a/Documentation/DocBook/Makefile b/Documentation/DocBook/Makefile index 66e1cf73357..db9499adbed 100644 --- a/Documentation/DocBook/Makefile +++ b/Documentation/DocBook/Makefile @@ -9,7 +9,7 @@ DOCBOOKS := wanbook.xml z8530book.xml mcabook.xml videobook.xml \ kernel-hacking.xml kernel-locking.xml deviceiobook.xml \ procfs-guide.xml writing_usb_driver.xml \ - kernel-api.xml journal-api.xml lsm.xml usb.xml \ + kernel-api.xml filesystems.xml lsm.xml usb.xml \ gadget.xml libata.xml mtdnand.xml librs.xml rapidio.xml \ genericirq.xml diff --git a/Documentation/DocBook/journal-api.tmpl b/Documentation/DocBook/filesystems.tmpl index 2077f9a28c1..39fa2aba7f9 100644 --- a/Documentation/DocBook/journal-api.tmpl +++ b/Documentation/DocBook/filesystems.tmpl @@ -2,39 +2,11 @@ <!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN" "http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" []> -<book id="LinuxJBDAPI"> +<book id="Linux-filesystems-API"> <bookinfo> - <title>The Linux Journalling API</title> - <authorgroup> - <author> - <firstname>Roger</firstname> - <surname>Gammans</surname> - <affiliation> - <address> - <email>rgammans@computer-surgery.co.uk</email> - </address> - </affiliation> - </author> - </authorgroup> - - <authorgroup> - <author> - <firstname>Stephen</firstname> - <surname>Tweedie</surname> - <affiliation> - <address> - <email>sct@redhat.com</email> - </address> - </affiliation> - </author> - </authorgroup> + <title>Linux Filesystems API</title> - <copyright> - <year>2002</year> - <holder>Roger Gammans</holder> - </copyright> - -<legalnotice> + <legalnotice> <para> This documentation is free software; you can redistribute it and/or modify it under the terms of the GNU General Public @@ -42,21 +14,21 @@ version 2 of the License, or (at your option) any later version. 
</para> - + <para> This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. </para> - + <para> You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA </para> - + <para> For more details see the file COPYING in the source distribution of Linux. @@ -66,17 +38,113 @@ <toc></toc> - <chapter id="Overview"> + <chapter id="vfs"> + <title>The Linux VFS</title> + <sect1><title>The Filesystem types</title> +!Iinclude/linux/fs.h + </sect1> + <sect1><title>The Directory Cache</title> +!Efs/dcache.c +!Iinclude/linux/dcache.h + </sect1> + <sect1><title>Inode Handling</title> +!Efs/inode.c +!Efs/bad_inode.c + </sect1> + <sect1><title>Registration and Superblocks</title> +!Efs/super.c + </sect1> + <sect1><title>File Locks</title> +!Efs/locks.c +!Ifs/locks.c + </sect1> + <sect1><title>Other Functions</title> +!Efs/mpage.c +!Efs/namei.c +!Efs/buffer.c +!Efs/bio.c +!Efs/seq_file.c +!Efs/filesystems.c +!Efs/fs-writeback.c +!Efs/block_dev.c + </sect1> + </chapter> + + <chapter id="proc"> + <title>The proc filesystem</title> + + <sect1><title>sysctl interface</title> +!Ekernel/sysctl.c + </sect1> + + <sect1><title>proc filesystem interface</title> +!Ifs/proc/base.c + </sect1> + </chapter> + + <chapter id="sysfs"> + <title>The Filesystem for Exporting Kernel Objects</title> +!Efs/sysfs/file.c +!Efs/sysfs/symlink.c +!Efs/sysfs/bin.c + </chapter> + + <chapter id="debugfs"> + <title>The debugfs filesystem</title> + + <sect1><title>debugfs interface</title> +!Efs/debugfs/inode.c +!Efs/debugfs/file.c + </sect1> + </chapter> + + <chapter id="LinuxJDBAPI"> + <chapterinfo> + <title>The Linux Journalling API</title> + + <authorgroup> + <author> + <firstname>Roger</firstname> + 
<surname>Gammans</surname> + <affiliation> + <address> + <email>rgammans@computer-surgery.co.uk</email> + </address> + </affiliation> + </author> + </authorgroup> + + <authorgroup> + <author> + <firstname>Stephen</firstname> + <surname>Tweedie</surname> + <affiliation> + <address> + <email>sct@redhat.com</email> + </address> + </affiliation> + </author> + </authorgroup> + + <copyright> + <year>2002</year> + <holder>Roger Gammans</holder> + </copyright> + </chapterinfo> + + <title>The Linux Journalling API</title> + + <sect1> <title>Overview</title> - <sect1> + <sect2> <title>Details</title> <para> -The journalling layer is easy to use. You need to +The journalling layer is easy to use. You need to first of all create a journal_t data structure. There are two calls to do this dependent on how you decide to allocate the physical -media on which the journal resides. The journal_init_inode() call +media on which the journal resides. The journal_init_inode() call is for journals stored in filesystem inodes, or the journal_init_dev() -call can be use for journal stored on a raw device (in a continuous range +call can be use for journal stored on a raw device (in a continuous range of blocks). A journal_t is a typedef for a struct pointer, so when you are finally finished make sure you call journal_destroy() on it to free up any used kernel memory. @@ -91,27 +159,26 @@ need to call journal_create(). <para> Most of the time however your journal file will already have been created, but before you load it you must call journal_wipe() to empty the journal file. -Hang on, you say , what if the filesystem wasn't cleanly umount()'d . Well, it is the +Hang on, you say , what if the filesystem wasn't cleanly umount()'d . Well, it is the job of the client file system to detect this and skip the call to journal_wipe(). </para> <para> In either case the next call should be to journal_load() which prepares the -journal file for use. 
Note that journal_wipe(..,0) calls journal_skip_recovery() +journal file for use. Note that journal_wipe(..,0) calls journal_skip_recovery() for you if it detects any outstanding transactions in the journal and similarly journal_load() will call journal_recover() if necessary. I would advise reading fs/ext3/super.c for examples on this stage. -[RGG: Why is the journal_wipe() call necessary - doesn't this needlessly -complicate the API. Or isn't a good idea for the journal layer to hide +[RGG: Why is the journal_wipe() call necessary - doesn't this needlessly +complicate the API. Or isn't a good idea for the journal layer to hide dirty mounts from the client fs] </para> <para> -Now you can go ahead and start modifying the underlying +Now you can go ahead and start modifying the underlying filesystem. Almost. </para> - <para> You still need to actually journal your filesystem changes, this @@ -138,10 +205,10 @@ individual buffers (blocks). Before you start to modify a buffer you need to call journal_get_{create,write,undo}_access() as appropriate, this allows the journalling layer to copy the unmodified data if it needs to. After all the buffer may be part of a previously uncommitted -transaction. +transaction. At this point you are at last ready to modify a buffer, and once you are have done so you need to call journal_dirty_{meta,}data(). -Or if you've asked for access to a buffer you now know is now longer +Or if you've asked for access to a buffer you now know is now longer required to be pushed back on the device you can call journal_forget() in much the same way as you might have used bforget() in the past. </para> @@ -156,7 +223,6 @@ Then at umount time , in your put_super() (2.4) or write_super() (2.5) you can then call journal_destroy() to clean up your in-core journal object. </para> - <para> Unfortunately there a couple of ways the journal layer can cause a deadlock. 
The first thing to note is that each task can only have @@ -164,19 +230,19 @@ a single outstanding transaction at any one time, remember nothing commits until the outermost journal_stop(). This means you must complete the transaction at the end of each file/inode/address etc. operation you perform, so that the journalling system isn't re-entered -on another journal. Since transactions can't be nested/batched +on another journal. Since transactions can't be nested/batched across differing journals, and another filesystem other than yours (say ext3) may be modified in a later syscall. </para> <para> -The second case to bear in mind is that journal_start() can -block if there isn't enough space in the journal for your transaction +The second case to bear in mind is that journal_start() can +block if there isn't enough space in the journal for your transaction (based on the passed nblocks param) - when it blocks it merely(!) needs to -wait for transactions to complete and be committed from other tasks, -so essentially we are waiting for journal_stop(). So to avoid +wait for transactions to complete and be committed from other tasks, +so essentially we are waiting for journal_stop(). So to avoid deadlocks you must treat journal_start/stop() as if they -were semaphores and include them in your semaphore ordering rules to prevent +were semaphores and include them in your semaphore ordering rules to prevent deadlocks. Note that journal_extend() has similar blocking behaviour to journal_start() so you can deadlock here just as easily as on journal_start(). </para> @@ -184,7 +250,7 @@ journal_start() so you can deadlock here just as easily as on journal_start(). <para> Try to reserve the right number of blocks the first time. ;-). This will be the maximum number of blocks you are going to touch in this transaction. 
-I advise having a look at at least ext3_jbd.h to see the basis on which +I advise having a look at at least ext3_jbd.h to see the basis on which ext3 uses to make these decisions. </para> @@ -193,13 +259,13 @@ Another wriggle to watch out for is your on-disk block allocation strategy. why? Because, if you undo a delete, you need to ensure you haven't reused any of the freed blocks in a later transaction. One simple way of doing this is make sure any blocks you allocate only have checkpointed transactions -listed against them. Ext3 does this in ext3_test_allocatable(). +listed against them. Ext3 does this in ext3_test_allocatable(). </para> <para> Lock is also providing through journal_{un,}lock_updates(), ext3 uses this when it wants a window with a clean and stable fs for a moment. -eg. +eg. </para> <programlisting> @@ -230,19 +296,19 @@ extend it like this:- struct journal_callback for_jbd; // Stuff for myfs allocated together. myfs_inode* i_commited; - + } </programlisting> <para> -this would be useful if you needed to know when data was committed to a +this would be useful if you needed to know when data was committed to a particular inode. </para> -</sect1> + </sect2> -<sect1> -<title>Summary</title> + <sect2> + <title>Summary</title> <para> Using the journal is a matter of wrapping the different context changes, being each mount, each modification (transaction) and each changed buffer @@ -260,15 +326,15 @@ an example. if (clean) journal_wipe(); journal_load(); - foreach(transaction) { /*transactions must be + foreach(transaction) { /*transactions must be completed before - a syscall returns to + a syscall returns to userspace*/ handle_t * xct=journal_start(my_jnrl); foreach(bh) { journal_get_{create,write,undo}_access(xact,bh); - if ( myfs_modify(bh) ) { /* returns true + if ( myfs_modify(bh) ) { /* returns true if makes changes */ journal_dirty_{meta,}data(xact,bh); } else { @@ -279,55 +345,57 @@ an example. 
} journal_destroy(my_jrnl); </programlisting> -</sect1> + </sect2> -</chapter> + </sect1> - <chapter id="adt"> + <sect1> <title>Data Types</title> - <para> + <para> The journalling layer uses typedefs to 'hide' the concrete definitions of the structures used. As a client of the JBD layer you can just rely on the using the pointer as a magic cookie of some sort. - + Obviously the hiding is not enforced as this is 'C'. - </para> - <sect1><title>Structures</title> + </para> + <sect2><title>Structures</title> !Iinclude/linux/jbd.h - </sect1> -</chapter> + </sect2> + </sect1> - <chapter id="calls"> + <sect1> <title>Functions</title> - <para> + <para> The functions here are split into two groups those that affect a journal as a whole, and those which are used to manage transactions -</para> - <sect1><title>Journal Level</title> + </para> + <sect2><title>Journal Level</title> !Efs/jbd/journal.c !Ifs/jbd/recovery.c - </sect1> - <sect1><title>Transasction Level</title> -!Efs/jbd/transaction.c - </sect1> -</chapter> -<chapter> + </sect2> + <sect2><title>Transasction Level</title> +!Efs/jbd/transaction.c + </sect2> + </sect1> + <sect1> <title>See also</title> <para> - <citation> + <citation> <ulink url="ftp://ftp.uk.linux.org/pub/linux/sct/fs/jfs/journal-design.ps.gz"> - Journaling the Linux ext2fs Filesystem,LinuxExpo 98, Stephen Tweedie + Journaling the Linux ext2fs Filesystem, LinuxExpo 98, Stephen Tweedie </ulink> - </citation> - </para> - <para> + </citation> + </para> + <para> <citation> <ulink url="http://olstrans.sourceforge.net/release/OLS2000-ext3/OLS2000-ext3.html"> - Ext3 Journalling FileSystem , OLS 2000, Dr. Stephen Tweedie + Ext3 Journalling FileSystem, OLS 2000, Dr. 
Stephen Tweedie </ulink> </citation> - </para> -</chapter> + </para> + </sect1> + + </chapter> </book> diff --git a/Documentation/DocBook/kernel-api.tmpl b/Documentation/DocBook/kernel-api.tmpl index 2b5ac604948..a166675c430 100644 --- a/Documentation/DocBook/kernel-api.tmpl +++ b/Documentation/DocBook/kernel-api.tmpl @@ -182,66 +182,6 @@ X!Ilib/string.c </sect1> </chapter> - <chapter id="vfs"> - <title>The Linux VFS</title> - <sect1><title>The Filesystem types</title> -!Iinclude/linux/fs.h - </sect1> - <sect1><title>The Directory Cache</title> -!Efs/dcache.c -!Iinclude/linux/dcache.h - </sect1> - <sect1><title>Inode Handling</title> -!Efs/inode.c -!Efs/bad_inode.c - </sect1> - <sect1><title>Registration and Superblocks</title> -!Efs/super.c - </sect1> - <sect1><title>File Locks</title> -!Efs/locks.c -!Ifs/locks.c - </sect1> - <sect1><title>Other Functions</title> -!Efs/mpage.c -!Efs/namei.c -!Efs/buffer.c -!Efs/bio.c -!Efs/seq_file.c -!Efs/filesystems.c -!Efs/fs-writeback.c -!Efs/block_dev.c - </sect1> - </chapter> - - <chapter id="proc"> - <title>The proc filesystem</title> - - <sect1><title>sysctl interface</title> -!Ekernel/sysctl.c - </sect1> - - <sect1><title>proc filesystem interface</title> -!Ifs/proc/base.c - </sect1> - </chapter> - - <chapter id="sysfs"> - <title>The Filesystem for Exporting Kernel Objects</title> -!Efs/sysfs/file.c -!Efs/sysfs/symlink.c -!Efs/sysfs/bin.c - </chapter> - - <chapter id="debugfs"> - <title>The debugfs filesystem</title> - - <sect1><title>debugfs interface</title> -!Efs/debugfs/inode.c -!Efs/debugfs/file.c - </sect1> - </chapter> - <chapter id="relayfs"> <title>relay interface support</title> diff --git a/Documentation/HOWTO b/Documentation/HOWTO index d6f3dd1a346..8d51c148f72 100644 --- a/Documentation/HOWTO +++ b/Documentation/HOWTO @@ -395,6 +395,26 @@ bugme-janitor mailing list (every change in the bugzilla is mailed here) +Managing bug reports +-------------------- + +One of the best ways to put into practice your 
hacking skills is by fixing +bugs reported by other people. Not only you will help to make the kernel +more stable, you'll learn to fix real world problems and you will improve +your skills, and other developers will be aware of your presence. Fixing +bugs is one of the best ways to get merits among other developers, because +not many people like wasting time fixing other people's bugs. + +To work in the already reported bug reports, go to http://bugzilla.kernel.org. +If you want to be advised of the future bug reports, you can subscribe to the +bugme-new mailing list (only new bug reports are mailed here) or to the +bugme-janitor mailing list (every change in the bugzilla is mailed here) + + http://lists.osdl.org/mailman/listinfo/bugme-new + http://lists.osdl.org/mailman/listinfo/bugme-janitors + + + Mailing lists ------------- diff --git a/Documentation/MSI-HOWTO.txt b/Documentation/MSI-HOWTO.txt index c70306abb7b..5c34910665d 100644 --- a/Documentation/MSI-HOWTO.txt +++ b/Documentation/MSI-HOWTO.txt @@ -470,7 +470,68 @@ LOC: 324553 325068 ERR: 0 MIS: 0 -6. FAQ +6. MSI quirks + +Several PCI chipsets or devices are known to not support MSI. +The PCI stack provides 3 possible levels of MSI disabling: +* on a single device +* on all devices behind a specific bridge +* globally + +6.1. Disabling MSI on a single device + +Under some circumstances, it might be required to disable MSI on a +single device, It may be achived by either not calling pci_enable_msi() +or all, or setting the pci_dev->no_msi flag before (most of the time +in a quirk). + +6.2. Disabling MSI below a bridge + +The vast majority of MSI quirks are required by PCI bridges not +being able to route MSI between busses. In this case, MSI have to be +disabled on all devices behind this bridge. It is achieves by setting +the PCI_BUS_FLAGS_NO_MSI flag in the pci_bus->bus_flags of the bridge +subordinate bus. There is no need to set the same flag on bridges that +are below the broken brigde. 
When pci_enable_msi() is called to enable +MSI on a device, pci_msi_supported() takes care of checking the NO_MSI +flag in all parent busses of the device. + +Some bridges actually support dynamic MSI support enabling/disabling +by changing some bits in their PCI configuration space (especially +the Hypertransport chipsets such as the nVidia nForce and Serverworks +HT2000). It may then be required to update the NO_MSI flag on the +corresponding devices in the sysfs hierarchy. To enable MSI support +on device "0000:00:0e", do: + + echo 1 > /sys/bus/pci/devices/0000:00:0e/msi_bus + +To disable MSI support, echo 0 instead of 1. Note that it should be +used with caution since changing this value might break interrupts. + +6.3. Disabling MSI globally + +Some extreme cases may require to disable MSI globally on the system. +For now, the only known case is a Serverworks PCI-X chipsets (MSI are +not supported on several busses that are not all connected to the +chipset in the Linux PCI hierarchy). In the vast majority of other +cases, disabling only behind a specific bridge is enough. + +For debugging purpose, the user may also pass pci=nomsi on the kernel +command-line to explicitly disable MSI globally. But, once the appro- +priate quirks are added to the kernel, this option should not be +required anymore. + +6.4. Finding why MSI cannot be enabled on a device + +Assuming that MSI are not enabled on a device, you should look at +dmesg to find messages that quirks may output when disabling MSI +on some devices, some bridges or even globally. +Then, lspci -t gives the list of bridges above a device. Reading +/sys/bus/pci/devices/0000:00:0e/msi_bus will tell you whether MSI +are enabled (1) or disabled (0). In 0 is found in a single bridge +msi_bus file above the device, MSI cannot be enabled. + +7. FAQ Q1. Are there any limitations on using the MSI? 
diff --git a/Documentation/accounting/getdelays.c b/Documentation/accounting/getdelays.c index b11792abd6b..bf2b0e2f87e 100644 --- a/Documentation/accounting/getdelays.c +++ b/Documentation/accounting/getdelays.c @@ -49,7 +49,7 @@ __u64 stime, utime; } /* Maximum size of response requested or message sent */ -#define MAX_MSG_SIZE 256 +#define MAX_MSG_SIZE 1024 /* Maximum number of cpus expected to be specified in a cpumask */ #define MAX_CPUS 32 /* Maximum length of pathname to log file */ diff --git a/Documentation/cpu-hotplug.txt b/Documentation/cpu-hotplug.txt index bc107cb157a..4868c34f750 100644 --- a/Documentation/cpu-hotplug.txt +++ b/Documentation/cpu-hotplug.txt @@ -46,7 +46,7 @@ maxcpus=n Restrict boot time cpus to n. Say if you have 4 cpus, using maxcpus=2 will only boot 2. You can choose to bring the other cpus later online, read FAQ's for more info. -additional_cpus*=n Use this to limit hotpluggable cpus. This option sets +additional_cpus=n (*) Use this to limit hotpluggable cpus. This option sets cpu_possible_map = cpu_present_map + additional_cpus (*) Option valid only for following architectures @@ -101,15 +101,15 @@ cpu_possible_map/for_each_possible_cpu() to iterate. Never use anything other than cpumask_t to represent bitmap of CPUs. -#include <linux/cpumask.h> + #include <linux/cpumask.h> -for_each_possible_cpu - Iterate over cpu_possible_map -for_each_online_cpu - Iterate over cpu_online_map -for_each_present_cpu - Iterate over cpu_present_map -for_each_cpu_mask(x,mask) - Iterate over some random collection of cpu mask. + for_each_possible_cpu - Iterate over cpu_possible_map + for_each_online_cpu - Iterate over cpu_online_map + for_each_present_cpu - Iterate over cpu_present_map + for_each_cpu_mask(x,mask) - Iterate over some random collection of cpu mask. 
-#include <linux/cpu.h> -lock_cpu_hotplug() and unlock_cpu_hotplug(): + #include <linux/cpu.h> + lock_cpu_hotplug() and unlock_cpu_hotplug(): The above calls are used to inhibit cpu hotplug operations. While holding the cpucontrol mutex, cpu_online_map will not change. If you merely need to avoid @@ -120,7 +120,7 @@ will work as long as stop_machine_run() is used to take a cpu down. CPU Hotplug - Frequently Asked Questions. -Q: How to i enable my kernel to support CPU hotplug? +Q: How to enable my kernel to support CPU hotplug? A: When doing make defconfig, Enable CPU hotplug support "Processor type and Features" -> Support for Hotpluggable CPUs @@ -141,39 +141,39 @@ A: You should now notice an entry in sysfs. Check if sysfs is mounted, using the "mount" command. You should notice an entry as shown below in the output. -.... -none on /sys type sysfs (rw) -.... + .... + none on /sys type sysfs (rw) + .... -if this is not mounted, do the following. +If this is not mounted, do the following. -#mkdir /sysfs -#mount -t sysfs sys /sys + #mkdir /sysfs + #mount -t sysfs sys /sys -now |