diff options
Diffstat (limited to 'Documentation')
42 files changed, 957 insertions, 275 deletions
diff --git a/Documentation/DMA-API.txt b/Documentation/DMA-API.txt index 1af0f2d5022..2ffb0d62f0f 100644 --- a/Documentation/DMA-API.txt +++ b/Documentation/DMA-API.txt @@ -33,7 +33,9 @@ pci_alloc_consistent(struct pci_dev *dev, size_t size, Consistent memory is memory for which a write by either the device or the processor can immediately be read by the processor or device -without having to worry about caching effects. +without having to worry about caching effects. (You may however need +to make sure to flush the processor's write buffers before telling +devices to read that memory.) This routine allocates a region of <size> bytes of consistent memory. it also returns a <dma_handle> which may be cast to an unsigned @@ -304,12 +306,12 @@ dma address with dma_mapping_error(). A non zero return value means the mapping could not be created and the driver should take appropriate action (eg reduce current DMA mapping usage or delay and try again later). -int -dma_map_sg(struct device *dev, struct scatterlist *sg, int nents, - enum dma_data_direction direction) -int -pci_map_sg(struct pci_dev *hwdev, struct scatterlist *sg, - int nents, int direction) + int + dma_map_sg(struct device *dev, struct scatterlist *sg, + int nents, enum dma_data_direction direction) + int + pci_map_sg(struct pci_dev *hwdev, struct scatterlist *sg, + int nents, int direction) Maps a scatter gather list from the block layer. @@ -327,12 +329,33 @@ critical that the driver do something, in the case of a block driver aborting the request or even oopsing is better than doing nothing and corrupting the filesystem. -void -dma_unmap_sg(struct device *dev, struct scatterlist *sg, int nhwentries, - enum dma_data_direction direction) -void -pci_unmap_sg(struct pci_dev *hwdev, struct scatterlist *sg, - int nents, int direction) +With scatterlists, you use the resulting mapping like this: + + int i, count = dma_map_sg(dev, sglist, nents, direction); + struct scatterlist *sg; + + for (i = 0, sg = sglist; i < count; i++, sg++) { + hw_address[i] = sg_dma_address(sg); + hw_len[i] = sg_dma_len(sg); + } + +where nents is the number of entries in the sglist. + +The implementation is free to merge several consecutive sglist entries +into one (e.g. with an IOMMU, or if several pages just happen to be +physically contiguous) and returns the actual number of sg entries it +mapped them to. On failure 0, is returned. + +Then you should loop count times (note: this can be less than nents times) +and use sg_dma_address() and sg_dma_len() macros where you previously +accessed sg->address and sg->length as shown above. + + void + dma_unmap_sg(struct device *dev, struct scatterlist *sg, + int nhwentries, enum dma_data_direction direction) + void + pci_unmap_sg(struct pci_dev *hwdev, struct scatterlist *sg, + int nents, int direction) unmap the previously mapped scatter/gather list. All the parameters must be the same as those and passed in to the scatter/gather mapping diff --git a/Documentation/DMA-mapping.txt b/Documentation/DMA-mapping.txt index ee4bb73683c..7c717699032 100644 --- a/Documentation/DMA-mapping.txt +++ b/Documentation/DMA-mapping.txt @@ -58,11 +58,15 @@ translating each of those pages back to a kernel address using something like __va(). [ EDIT: Update this when we integrate Gerd Knorr's generic code which does this. ] -This rule also means that you may not use kernel image addresses -(ie. items in the kernel's data/text/bss segment, or your driver's) -nor may you use kernel stack addresses for DMA. Both of these items -might be mapped somewhere entirely different than the rest of physical -memory. +This rule also means that you may use neither kernel image addresses +(items in data/text/bss segments), nor module image addresses, nor +stack addresses for DMA. These could all be mapped somewhere entirely +different than the rest of physical memory. Even if those classes of +memory could physically work with DMA, you'd need to ensure the I/O +buffers were cacheline-aligned. Without that, you'd see cacheline +sharing problems (data corruption) on CPUs with DMA-incoherent caches. +(The CPU could write to one word, DMA would write to a different one +in the same cache line, and one of them could be overwritten.) Also, this means that you cannot take the return of a kmap() call and DMA to/from that. This is similar to vmalloc(). @@ -194,7 +198,7 @@ document for how to handle this case. Finally, if your device can only drive the low 24-bits of address during PCI bus mastering you might do something like: - if (pci_set_dma_mask(pdev, 0x00ffffff)) { + if (pci_set_dma_mask(pdev, DMA_24BIT_MASK)) { printk(KERN_WARNING "mydev: 24-bit DMA addressing not available.\n"); goto ignore_this_device; @@ -212,7 +216,7 @@ functions (for example a sound card provides playback and record functions) and the various different functions have _different_ DMA addressing limitations, you may wish to probe each mask and only provide the functionality which the machine can handle. It -is important that the last call to pci_set_dma_mask() be for the +is important that the last call to pci_set_dma_mask() be for the most specific mask. Here is pseudo-code showing how this might be done: @@ -284,6 +288,11 @@ There are two types of DMA mappings: in order to get correct behavior on all platforms. + Also, on some platforms your driver may need to flush CPU write + buffers in much the same way as it needs to flush write buffers + found in PCI bridges (such as by reading a register's value + after writing it). + - Streaming DMA mappings which are usually mapped for one DMA transfer, unmapped right after it (unless you use pci_dma_sync_* below) and for which hardware can optimize for sequential accesses. @@ -303,6 +312,9 @@ There are two types of DMA mappings: Neither type of DMA mapping has alignment restrictions that come from PCI, although some devices may have such restrictions. +Also, systems with caches that aren't DMA-coherent will work better +when the underlying buffers don't share cache lines with other data. + Using Consistent DMA mappings. diff --git a/Documentation/DocBook/Makefile b/Documentation/DocBook/Makefile index 7d87dd73cbe..5a2882d275b 100644 --- a/Documentation/DocBook/Makefile +++ b/Documentation/DocBook/Makefile @@ -2,7 +2,7 @@ # This makefile is used to generate the kernel documentation, # primarily based on in-line comments in various source files. # See Documentation/kernel-doc-nano-HOWTO.txt for instruction in how -# to ducument the SRC - and how to read it. +# to document the SRC - and how to read it. # To add a new book the only step required is to add the book to the # list of DOCBOOKS. diff --git a/Documentation/DocBook/kernel-api.tmpl b/Documentation/DocBook/kernel-api.tmpl index 8c9c6704e85..ca02e04a906 100644 --- a/Documentation/DocBook/kernel-api.tmpl +++ b/Documentation/DocBook/kernel-api.tmpl @@ -322,7 +322,6 @@ X!Earch/i386/kernel/mca.c <chapter id="sysfs"> <title>The Filesystem for Exporting Kernel Objects</title> !Efs/sysfs/file.c -!Efs/sysfs/dir.c !Efs/sysfs/symlink.c !Efs/sysfs/bin.c </chapter> diff --git a/Documentation/DocBook/libata.tmpl b/Documentation/DocBook/libata.tmpl index 5bcbb6ee3bc..f869b03929d 100644 --- a/Documentation/DocBook/libata.tmpl +++ b/Documentation/DocBook/libata.tmpl @@ -705,7 +705,7 @@ and other resources, etc. <sect1><title>ata_scsi_error()</title> <para> - ata_scsi_error() is the current hostt->eh_strategy_handler() + ata_scsi_error() is the current transportt->eh_strategy_handler() for libata. As discussed above, this will be entered in two cases - timeout and ATAPI error completion. This function calls low level libata driver's eng_timeout() callback, the diff --git a/Documentation/HOWTO b/Documentation/HOWTO index 6c9e746267d..915ae8c986c 100644 --- a/Documentation/HOWTO +++ b/Documentation/HOWTO @@ -603,7 +603,8 @@ start exactly where you are now. ---------- -Thanks to Paolo Ciarrocchi who allowed the "Development Process" section +Thanks to Paolo Ciarrocchi who allowed the "Development Process" +(http://linux.tar.bz/articles/2.6-development_process) section to be based on text he had written, and to Randy Dunlap and Gerrit Huizenga for some of the list of things you should and should not say. Also thanks to Pat Mochel, Hanna Linder, Randy Dunlap, Kay Sievers, diff --git a/Documentation/acpi-hotkey.txt b/Documentation/acpi-hotkey.txt index 744f1aec655..38040fa3764 100644 --- a/Documentation/acpi-hotkey.txt +++ b/Documentation/acpi-hotkey.txt @@ -30,7 +30,7 @@ specific hotkey(event)) echo "event_num:event_type:event_argument" > /proc/acpi/hotkey/action. The result of the execution of this aml method is -attached to /proc/acpi/hotkey/poll_method, which is dnyamically +attached to /proc/acpi/hotkey/poll_method, which is dynamically created. Please use command "cat /proc/acpi/hotkey/polling_method" to retrieve it. diff --git a/Documentation/block/switching-sched.txt b/Documentation/block/switching-sched.txt new file mode 100644 index 00000000000..5fa130a6753 --- /dev/null +++ b/Documentation/block/switching-sched.txt @@ -0,0 +1,22 @@ +As of the Linux 2.6.10 kernel, it is now possible to change the +IO scheduler for a given block device on the fly (thus making it possible, +for instance, to set the CFQ scheduler for the system default, but +set a specific device to use the anticipatory or noop schedulers - which +can improve that device's throughput). + +To set a specific scheduler, simply do this: + +echo SCHEDNAME > /sys/block/DEV/queue/scheduler + +where SCHEDNAME is the name of a defined IO scheduler, and DEV is the +device name (hda, hdb, sga, or whatever you happen to have). + +The list of defined schedulers can be found by simply doing +a "cat /sys/block/DEV/queue/scheduler" - the list of valid names +will be displayed, with the currently selected scheduler in brackets: + +# cat /sys/block/hda/queue/scheduler +noop anticipatory deadline [cfq] +# echo anticipatory > /sys/block/hda/queue/scheduler +# cat /sys/block/hda/queue/scheduler +noop [anticipatory] deadline cfq diff --git a/Documentation/cpu-freq/index.txt b/Documentation/cpu-freq/index.txt index 5009805f937..ffdb5323df3 100644 --- a/Documentation/cpu-freq/index.txt +++ b/Documentation/cpu-freq/index.txt @@ -53,4 +53,4 @@ the CPUFreq Mailing list: * http://lists.linux.org.uk/mailman/listinfo/cpufreq Clock and voltage scaling for the SA-1100: -* http://www.lart.tudelft.nl/projects/scaling +* http://www.lartmaker.nl/projects/scaling diff --git a/Documentation/feature-removal-schedule.txt b/Documentation/feature-removal-schedule.txt index 59d0c74c79c..421bcfff6ad 100644 --- a/Documentation/feature-removal-schedule.txt +++ b/Documentation/feature-removal-schedule.txt @@ -25,8 +25,9 @@ Who: Adrian Bunk <bunk@stusta.de> --------------------------- -What: drivers depending on OBSOLETE_OSS_DRIVER -When: January 2006 +What: drivers that were depending on OBSOLETE_OSS_DRIVER + (config options already removed) +When: before 2.6.19 Why: OSS drivers with ALSA replacements Who: Adrian Bunk <bunk@stusta.de> @@ -71,14 +72,6 @@ Who: Mauro Carvalho Chehab <mchehab@brturbo.com.br> --------------------------- -What: remove EXPORT_SYMBOL(panic_timeout) -When: April 2006 -Files: kernel/panic.c -Why: No modular usage in the kernel. -Who: Adrian Bunk <bunk@stusta.de> - ---------------------------- - What: remove EXPORT_SYMBOL(insert_resource) When: April 2006 Files: kernel/resource.c diff --git a/Documentation/filesystems/sysfs.txt b/Documentation/filesystems/sysfs.txt index c8bce82ddca..89b1d196ca8 100644 --- a/Documentation/filesystems/sysfs.txt +++ b/Documentation/filesystems/sysfs.txt @@ -246,6 +246,7 @@ class/ devices/ firmware/ net/ +fs/ devices/ contains a filesystem representation of the device tree. It maps directly to the internal kernel device tree, which is a hierarchy of @@ -264,6 +265,10 @@ drivers/ contains a directory for each device driver that is loaded for devices on that particular bus (this assumes that drivers do not span multiple bus types). +fs/ contains a directory for some filesystems. Currently each +filesystem wanting to export attributes must create its own hierarchy +below fs/ (see ./fuse.txt for an example). + More information can driver-model specific features can be found in Documentation/driver-model/. diff --git a/Documentation/filesystems/vfs.txt b/Documentation/filesystems/vfs.txt index adaa899e5c9..3a2e5520c1e 100644 --- a/Documentation/filesystems/vfs.txt +++ b/Documentation/filesystems/vfs.txt @@ -694,7 +694,7 @@ struct file_operations ---------------------- This describes how the VFS can manipulate an open file. As of kernel -2.6.13, the following members are defined: +2.6.17, the following members are defined: struct file_operations { loff_t (*llseek) (struct file *, loff_t, int); @@ -723,6 +723,10 @@ struct file_operations { int (*check_flags)(int); int (*dir_notify)(struct file *filp, unsigned long arg); int (*flock) (struct file *, int, struct file_lock *); + ssize_t (*splice_write)(struct pipe_inode_info *, struct file *, size_t, unsigned +int); + ssize_t (*splice_read)(struct file *, struct pipe_inode_info *, size_t, unsigned +int); }; Again, all methods are called without any locks being held, unless @@ -790,6 +794,12 @@ otherwise noted. flock: called by the flock(2) system call + splice_write: called by the VFS to splice data from a pipe to a file. This + method is used by the splice(2) system call + + splice_read: called by the VFS to splice data from file to a pipe. This + method is used by the splice(2) system call + Note that the file operations are implemented by the specific filesystem in which the inode resides. When opening a device node (character or block special) most filesystems will call special diff --git a/Documentation/fujitsu/frv/kernel-ABI.txt b/Documentation/fujitsu/frv/kernel-ABI.txt index 0ed9b0a779b..8b0a5fc8bfd 100644 --- a/Documentation/fujitsu/frv/kernel-ABI.txt +++ b/Documentation/fujitsu/frv/kernel-ABI.txt @@ -1,17 +1,19 @@ - ================================= - INTERNAL KERNEL ABI FOR FR-V ARCH - ================================= - -The internal FRV kernel ABI is not quite the same as the userspace ABI. A number of the registers -are used for special purposed, and the ABI is not consistent between modules vs core, and MMU vs -no-MMU. - -This partly stems from the fact that FRV CPUs do not have a separate supervisor stack pointer, and -most of them do not have any scratch registers, thus requiring at least one general purpose -register to be clobbered in such an event. Also, within the kernel core, it is possible to simply -jump or call directly between functions using a relative offset. This cannot be extended to modules -for the displacement is likely to be too far. Thus in modules the address of a function to call -must be calculated in a register and then used, requiring two extra instructions. + ================================= + INTERNAL KERNEL ABI FOR FR-V ARCH + ================================= + +The internal FRV kernel ABI is not quite the same as the userspace ABI. A +number of the registers are used for special purposed, and the ABI is not +consistent between modules vs core, and MMU vs no-MMU. + +This partly stems from the fact that FRV CPUs do not have a separate +supervisor stack pointer, and most of them do not have any scratch +registers, thus requiring at least one general purpose register to be +clobbered in such an event. Also, within the kernel core, it is possible to +simply jump or call directly between functions using a relative offset. +This cannot be extended to modules for the displacement is likely to be too +far. Thus in modules the address of a function to call must be calculated +in a register and then used, requiring two extra instructions. This document has the following sections: @@ -39,7 +41,8 @@ When a system call is made, the following registers are effective: CPU OPERATING MODES =================== -The FR-V CPU has three basic operating modes. In order of increasing capability: +The FR-V CPU has three basic operating modes. In order of increasing +capability: (1) User mode. @@ -47,42 +50,46 @@ The FR-V CPU has three basic operating modes. In order of increasing capability: (2) Kernel mode. - Normal kernel mode. There are many additional control registers available that may be - accessed in this mode, in addition to all the stuff available to user mode. This has two - submodes: + Normal kernel mode. There are many additional control registers + available that may be accessed in this mode, in addition to all the + stuff available to user mode. This has two submodes: (a) Exceptions enabled (PSR.T == 1). - Exceptions will invoke the appropriate normal kernel mode handler. On entry to the - handler, the PSR.T bit will be cleared. + Exceptions will invoke the appropriate normal kernel mode + handler. On entry to the handler, the PSR.T bit will be cleared. (b) Exceptions disabled (PSR.T == 0). - No exceptions or interrupts may happen. Any mandatory exceptions will cause the CPU to - halt unless the CPU is told to jump into debug mode instead. + No exceptions or interrupts may happen. Any mandatory exceptions + will cause the CPU to halt unless the CPU is told to jump into + debug mode instead. (3) Debug mode. - No exceptions may happen in this mode. Memory protection and management exceptions will be - flagged for later consideration, but the exception handler won't be invoked. Debugging traps - such as hardware breakpoints and watchpoints will be ignored. This mode is entered only by - debugging events obtained from the other two modes. + No exceptions may happen in this mode. Memory protection and + management exceptions will be flagged for later consideration, but + the exception handler won't be invoked. Debugging traps such as + hardware breakpoints and watchpoints will be ignored. This mode is + entered only by debugging events obtained from the other two modes. - All kernel mode registers may be accessed, plus a few extra debugging specific registers. + All kernel mode registers may be accessed, plus a few extra debugging + specific registers. ================================= INTERNAL KERNEL-MODE REGISTER ABI ================================= -There are a number of permanent register assignments that are set up by entry.S in the exception -prologue. Note that there is a complete set of exception prologues for each of user->kernel -transition and kernel->kernel transition. There are also user->debug and kernel->debug mode -transition prologues. +There are a number of permanent register assignments that are set up by +entry.S in the exception prologue. Note that there is a complete set of +exception prologues for each of user->kernel transition and kernel->kernel +transition. There are also user->debug and kernel->debug mode transition +prologues. REGISTER FLAVOUR USE - =============== ======= ==================================================== + =============== ======= ============================================== GR1 Supervisor stack pointer GR15 Current thread info pointer GR16 GP-Rel base register for small data @@ -92,10 +99,12 @@ transition prologues. GR31 NOMMU Destroyed by debug mode entry GR31 MMU Destroyed by TLB miss kernel mode entry CCR.ICC2 Virtual interrupt disablement tracking - CCCR.CC3 Cleared by exception prologue (atomic op emulation) + CCCR.CC3 Cleared by exception prologue + (atomic op emulation) SCR0 MMU See mmu-layout.txt. SCR1 MMU See mmu-layout.txt. - SCR2 MMU Save for EAR0 (destroyed by icache insns in debug mode) + SCR2 MMU Save for EAR0 (destroyed by icache insns + in debug mode) SCR3 MMU Save for GR31 during debug exceptions DAMR/IAMR NOMMU Fixed memory protection layout. DAMR/IAMR MMU See mmu-layout.txt. @@ -104,18 +113,21 @@ transition prologues. Certain registers are also used or modified across function calls: REGISTER CALL RETURN - =============== =============================== =============================== + =============== =============================== ====================== GR0 Fixed Zero - GR2 Function call frame pointer GR3 Special Preserved GR3-GR7 - Clobbered - GR8 Function call arg #1 Return value (or clobbered) - GR9 Function call arg #2 Return value MSW (or clobbered) + GR8 Function call arg #1 Return value + (or clobbered) + GR9 Function call arg #2 Return value MSW + (or clobbered) GR10-GR13 Function call arg #3-#6 Clobbered GR14 - Clobbered GR15-GR16 Special Preserved GR17-GR27 - Preserved - GR28-GR31 Special Only accessed explicitly + GR28-GR31 Special Only accessed + explicitly LR Return address after CALL Clobbered CCR/CCCR - Mostly Clobbered @@ -124,46 +136,53 @@ Certain registers are also used or modified across function calls: INTERNAL DEBUG-MODE REGISTER ABI ================================ -This is the same as the kernel-mode register ABI for functions calls. The difference is that in -debug-mode there's a different stack and a different exception frame. Almost all the global -registers from kernel-mode (including the stack pointer) may be changed. +This is the same as the kernel-mode register ABI for functions calls. The +difference is that in debug-mode there's a different stack and a different +exception frame. Almost all the global registers from kernel-mode +(including the stack pointer) may be changed. REGISTER FLAVOUR USE - =============== ======= ==================================================== + =============== ======= ============================================== GR1 Debug stack pointer GR16 GP-Rel base register for small data - GR31 Current debug exception frame pointer (__debug_frame) + GR31 Current debug exception frame pointer + (__debug_frame) SCR3 MMU Saved value of GR31 -Note that debug mode is able to interfere with the kernel's emulated atomic ops, so it must be -exceedingly careful not to do any that would interact with the main kernel in this regard. Hence -the debug mode code (gdbstub) is almost completely self-contained. The only external code used is -the sprintf family of functions. +Note that debug mode is able to interfere with the kernel's emulated atomic +ops, so it must be exceedingly careful not to do any that would interact +with the main kernel in this regard. Hence the debug mode code (gdbstub) is +almost completely self-contained. The only external code used is the +sprintf family of functions. -Futhermore, break.S is so complicated because single-step mode does not switch off on entry to an -exception. That means unless manually disabled, single-stepping will blithely go on stepping into -things like interrupts. See gdbstub.txt for more information. +Futhermore, break.S is so complicated because single-step mode does not +switch off on entry to an exception. That means unless manually disabled, +single-stepping will blithely go on stepping into things like interrupts. +See gdbstub.txt for more information. ========================== VIRTUAL INTERRUPT HANDLING ========================== -Because accesses to the PSR is so slow, and to disable interrupts we have to access it twice (once -to read and once to write), we don't actually disable interrupts at all if we don't have to. What -we do instead is use the ICC2 condition code flags to note virtual disablement, such that if we -then do take an interrupt, we note the flag, really disable interrupts, set another flag and resume -execution at the point the interrupt happened. Setting condition flags as a side effect of an -arithmetic or logical instruction is really fast. This use of the ICC2 only occurs within the +Because accesses to the PSR is so slow, and to disable interrupts we have +to access it twice (once to read and once to write), we don't actually +disable interrupts at all if we don't have to. What we do instead is use +the ICC2 condition code flags to note virtual disablement, such that if we +then do take an interrupt, we note the flag, really disable interrupts, set +another flag and resume execution at the point the interrupt happened. +Setting condition flags as a side effect of an arithmetic or logical +instruction is really fast. This use of the ICC2 only occurs within the kernel - it does not affect userspace. The flags we use are: (*) CCR.ICC2.Z [Zero flag] - Set to virtually disable interrupts, clear when interrupts are virtually enabled. Can be - modified by logical instructions without affecting the Carry flag. + Set to virtually disable interrupts, clear when interrupts are + virtually enabled. Can be modified by logical instructions without + affecting the Carry flag. (*) CCR.ICC2.C [Carry flag] @@ -176,8 +195,9 @@ What happens is this: ICC2.Z is 0, ICC2.C is 1. - (2) An interrupt occurs. The exception prologue examines ICC2.Z and determines that nothing needs - doing. This is done simply with an unlikely BEQ instruction. + (2) An interrupt occurs. The exception prologue examines ICC2.Z and + determines that nothing needs doing. This is done simply with an + unlikely BEQ instruction. (3) The interrupts are disabled (local_irq_disable) @@ -187,48 +207,56 @@ What happens is this: ICC2.Z would be set to 0. - A TIHI #2 instruction (trap #2 if condition HI - Z==0 && C==0) would be used to trap if - interrupts were now virtually enabled, but physically disabled - which they're not, so the - trap isn't taken. The kernel would then be back to state (1). + A TIHI #2 instruction (trap #2 if condition HI - Z==0 && C==0) would + be used to trap if interrupts were now virtually enabled, but + physically disabled - which they're not, so the trap isn't taken. The + kernel would then be back to state (1). - (5) An interrupt occurs. The exception prologue examines ICC2.Z and determines that the interrupt - shouldn't actually have happened. It jumps aside, and there disabled interrupts by setting - PSR.PIL to 14 and then it clears ICC2.C. + (5) An interrupt occurs. The exception prologue examines ICC2.Z and + determines that the interrupt shouldn't actually have happened. It |