diff options
Diffstat (limited to 'Documentation')
37 files changed, 1537 insertions, 402 deletions
diff --git a/Documentation/ABI/testing/sysfs-ata b/Documentation/ABI/testing/sysfs-ata new file mode 100644 index 00000000000..0a932155cbb --- /dev/null +++ b/Documentation/ABI/testing/sysfs-ata @@ -0,0 +1,99 @@ +What: /sys/class/ata_... +Date: August 2008 +Contact: Gwendal Grignou<gwendal@google.com> +Description: + +Provide a place in sysfs for storing the ATA topology of the system. This allows +retrieving various information about ATA objects. + +Files under /sys/class/ata_port +------------------------------- + + For each port, a directory ataX is created where X is the ata_port_id of + the port. The device parent is the ata host device. + +idle_irq (read) + + Number of IRQ received by the port while idle [some ata HBA only]. + +nr_pmp_links (read) + + If a SATA Port Multiplier (PM) is connected, number of link behind it. + +Files under /sys/class/ata_link +------------------------------- + + Behind each port, there is a ata_link. If there is a SATA PM in the + topology, 15 ata_link objects are created. + + If a link is behind a port, the directory name is linkX, where X is + ata_port_id of the port. + If a link is behind a PM, its name is linkX.Y where X is ata_port_id + of the parent port and Y the PM port. + +hw_sata_spd_limit + + Maximum speed supported by the connected SATA device. + +sata_spd_limit + + Maximum speed imposed by libata. + +sata_spd + + Current speed of the link [1.5, 3Gps,...]. + +Files under /sys/class/ata_device +--------------------------------- + + Behind each link, up to two ata device are created. + The name of the directory is devX[.Y].Z where: + - X is ata_port_id of the port where the device is connected, + - Y the port of the PM if any, and + - Z the device id: for PATA, there is usually 2 devices [0,1], + only 1 for SATA. + +class + Device class. Can be "ata" for disk, "atapi" for packet device, + "pmp" for PM, or "none" if no device was found behind the link. + +dma_mode + + Transfer modes supported by the device when in DMA mode. + Mostly used by PATA device. + +pio_mode + + Transfer modes supported by the device when in PIO mode. + Mostly used by PATA device. + +xfer_mode + + Current transfer mode. + +id + + Cached result of IDENTIFY command, as described in ATA8 7.16 and 7.17. + Only valid if the device is not a PM. + +gscr + + Cached result of the dump of PM GSCR register. + Valid registers are: + 0: SATA_PMP_GSCR_PROD_ID, + 1: SATA_PMP_GSCR_REV, + 2: SATA_PMP_GSCR_PORT_INFO, + 32: SATA_PMP_GSCR_ERROR, + 33: SATA_PMP_GSCR_ERROR_EN, + 64: SATA_PMP_GSCR_FEAT, + 96: SATA_PMP_GSCR_FEAT_EN, + 130: SATA_PMP_GSCR_SII_GPIO + Only valid if the device is a PM. + +spdn_cnt + + Number of time libata decided to lower the speed of link due to errors. + +ering + + Formatted output of the error ring of the device. diff --git a/Documentation/ABI/testing/sysfs-devices-power b/Documentation/ABI/testing/sysfs-devices-power index 6123c523bfd..7628cd1bc36 100644 --- a/Documentation/ABI/testing/sysfs-devices-power +++ b/Documentation/ABI/testing/sysfs-devices-power @@ -77,3 +77,91 @@ Description: devices this attribute is set to "enabled" by bus type code or device drivers and in that cases it should be safe to leave the default value. + +What: /sys/devices/.../power/wakeup_count +Date: September 2010 +Contact: Rafael J. Wysocki <rjw@sisk.pl> +Description: + The /sys/devices/.../wakeup_count attribute contains the number + of signaled wakeup events associated with the device. This + attribute is read-only. If the device is not enabled to wake up + the system from sleep states, this attribute is empty. + +What: /sys/devices/.../power/wakeup_active_count +Date: September 2010 +Contact: Rafael J. Wysocki <rjw@sisk.pl> +Description: + The /sys/devices/.../wakeup_active_count attribute contains the + number of times the processing of wakeup events associated with + the device was completed (at the kernel level). This attribute + is read-only. If the device is not enabled to wake up the + system from sleep states, this attribute is empty. + +What: /sys/devices/.../power/wakeup_hit_count +Date: September 2010 +Contact: Rafael J. Wysocki <rjw@sisk.pl> +Description: + The /sys/devices/.../wakeup_hit_count attribute contains the + number of times the processing of a wakeup event associated with + the device might prevent the system from entering a sleep state. + This attribute is read-only. If the device is not enabled to + wake up the system from sleep states, this attribute is empty. + +What: /sys/devices/.../power/wakeup_active +Date: September 2010 +Contact: Rafael J. Wysocki <rjw@sisk.pl> +Description: + The /sys/devices/.../wakeup_active attribute contains either 1, + or 0, depending on whether or not a wakeup event associated with + the device is being processed (1). This attribute is read-only. + If the device is not enabled to wake up the system from sleep + states, this attribute is empty. + +What: /sys/devices/.../power/wakeup_total_time_ms +Date: September 2010 +Contact: Rafael J. Wysocki <rjw@sisk.pl> +Description: + The /sys/devices/.../wakeup_total_time_ms attribute contains + the total time of processing wakeup events associated with the + device, in milliseconds. This attribute is read-only. If the + device is not enabled to wake up the system from sleep states, + this attribute is empty. + +What: /sys/devices/.../power/wakeup_max_time_ms +Date: September 2010 +Contact: Rafael J. Wysocki <rjw@sisk.pl> +Description: + The /sys/devices/.../wakeup_max_time_ms attribute contains + the maximum time of processing a single wakeup event associated + with the device, in milliseconds. This attribute is read-only. + If the device is not enabled to wake up the system from sleep + states, this attribute is empty. + +What: /sys/devices/.../power/wakeup_last_time_ms +Date: September 2010 +Contact: Rafael J. Wysocki <rjw@sisk.pl> +Description: + The /sys/devices/.../wakeup_last_time_ms attribute contains + the value of the monotonic clock corresponding to the time of + signaling the last wakeup event associated with the device, in + milliseconds. This attribute is read-only. If the device is + not enabled to wake up the system from sleep states, this + attribute is empty. + +What: /sys/devices/.../power/autosuspend_delay_ms +Date: September 2010 +Contact: Alan Stern <stern@rowland.harvard.edu> +Description: + The /sys/devices/.../power/autosuspend_delay_ms attribute + contains the autosuspend delay value (in milliseconds). Some + drivers do not want their device to suspend as soon as it + becomes idle at run time; they want the device to remain + inactive for a certain minimum period of time first. That + period is called the autosuspend delay. Negative values will + prevent the device from being suspended at run time (similar + to writing "on" to the power/control attribute). Values >= + 1000 will cause the autosuspend timer expiration to be rounded + up to the nearest second. + + Not all drivers support this attribute. If it isn't supported, + attempts to read or write it will yield I/O errors. diff --git a/Documentation/ABI/testing/sysfs-module b/Documentation/ABI/testing/sysfs-module new file mode 100644 index 00000000000..cfcec3bffc0 --- /dev/null +++ b/Documentation/ABI/testing/sysfs-module @@ -0,0 +1,12 @@ +What: /sys/module/pch_phub/drivers/.../pch_mac +Date: August 2010 +KernelVersion: 2.6.35 +Contact: masa-korg@dsn.okisemi.com +Description: Write/read GbE MAC address. + +What: /sys/module/pch_phub/drivers/.../pch_firmware +Date: August 2010 +KernelVersion: 2.6.35 +Contact: masa-korg@dsn.okisemi.com +Description: Write/read Option ROM data. + diff --git a/Documentation/ABI/testing/sysfs-power b/Documentation/ABI/testing/sysfs-power index 2875f1f74a0..194ca446ac2 100644 --- a/Documentation/ABI/testing/sysfs-power +++ b/Documentation/ABI/testing/sysfs-power @@ -99,9 +99,38 @@ Description: dmesg -s 1000000 | grep 'hash matches' + If you do not get any matches (or they appear to be false + positives), it is possible that the last PM event point + referred to a device created by a loadable kernel module. In + this case cat /sys/power/pm_trace_dev_match (see below) after + your system is started up and the kernel modules are loaded. + CAUTION: Using it will cause your machine's real-time (CMOS) clock to be set to a random invalid time after a resume. +What; /sys/power/pm_trace_dev_match +Date: October 2010 +Contact: James Hogan <james@albanarts.com> +Description: + The /sys/power/pm_trace_dev_match file contains the name of the + device associated with the last PM event point saved in the RTC + across reboots when pm_trace has been used. More precisely it + contains the list of current devices (including those + registered by loadable kernel modules since boot) which match + the device hash in the RTC at boot, with a newline after each + one. + + The advantage of this file over the hash matches printed to the + kernel log (see /sys/power/pm_trace), is that it includes + devices created after boot by loadable kernel modules. + + Due to the small hash size necessary to fit in the RTC, it is + possible that more than one device matches the hash, in which + case further investigation is required to determine which + device is causing the problem. Note that genuine RTC clock + values (such as when pm_trace has not been used), can still + match a device and output it's name here. + What: /sys/power/pm_async Date: January 2009 Contact: Rafael J. Wysocki <rjw@sisk.pl> diff --git a/Documentation/DocBook/drm.tmpl b/Documentation/DocBook/drm.tmpl index 910c923a9b8..2861055afd7 100644 --- a/Documentation/DocBook/drm.tmpl +++ b/Documentation/DocBook/drm.tmpl @@ -136,6 +136,7 @@ #ifdef CONFIG_COMPAT .compat_ioctl = i915_compat_ioctl, #endif + .llseek = noop_llseek, }, .pci_driver = { .name = DRIVER_NAME, diff --git a/Documentation/DocBook/genericirq.tmpl b/Documentation/DocBook/genericirq.tmpl index 1448b33fd22..fb10fd08c05 100644 --- a/Documentation/DocBook/genericirq.tmpl +++ b/Documentation/DocBook/genericirq.tmpl @@ -28,7 +28,7 @@ </authorgroup> <copyright> - <year>2005-2006</year> + <year>2005-2010</year> <holder>Thomas Gleixner</holder> </copyright> <copyright> @@ -100,6 +100,10 @@ <listitem><para>Edge type</para></listitem> <listitem><para>Simple type</para></listitem> </itemizedlist> + During the implementation we identified another type: + <itemizedlist> + <listitem><para>Fast EOI type</para></listitem> + </itemizedlist> In the SMP world of the __do_IRQ() super-handler another type was identified: <itemizedlist> @@ -153,6 +157,7 @@ is still available. This leads to a kind of duality for the time being. Over time the new model should be used in more and more architectures, as it enables smaller and cleaner IRQ subsystems. + It's deprecated for three years now and about to be removed. </para> </chapter> <chapter id="bugs"> @@ -217,6 +222,7 @@ <itemizedlist> <listitem><para>handle_level_irq</para></listitem> <listitem><para>handle_edge_irq</para></listitem> + <listitem><para>handle_fasteoi_irq</para></listitem> <listitem><para>handle_simple_irq</para></listitem> <listitem><para>handle_percpu_irq</para></listitem> </itemizedlist> @@ -233,33 +239,33 @@ are used by the default flow implementations. The following helper functions are implemented (simplified excerpt): <programlisting> -default_enable(irq) +default_enable(struct irq_data *data) { - desc->chip->unmask(irq); + desc->chip->irq_unmask(data); } -default_disable(irq) +default_disable(struct irq_data *data) { - if (!delay_disable(irq)) - desc->chip->mask(irq); + if (!delay_disable(data)) + desc->chip->irq_mask(data); } -default_ack(irq) +default_ack(struct irq_data *data) { - chip->ack(irq); + chip->irq_ack(data); } -default_mask_ack(irq) +default_mask_ack(struct irq_data *data) { - if (chip->mask_ack) { - chip->mask_ack(irq); + if (chip->irq_mask_ack) { + chip->irq_mask_ack(data); } else { - chip->mask(irq); - chip->ack(irq); + chip->irq_mask(data); + chip->irq_ack(data); } } -noop(irq) +noop(struct irq_data *data)) { } @@ -278,12 +284,27 @@ noop(irq) <para> The following control flow is implemented (simplified excerpt): <programlisting> -desc->chip->start(); +desc->chip->irq_mask(); handle_IRQ_event(desc->action); -desc->chip->end(); +desc->chip->irq_unmask(); </programlisting> </para> - </sect3> + </sect3> + <sect3 id="Default_FASTEOI_IRQ_flow_handler"> + <title>Default Fast EOI IRQ flow handler</title> + <para> + handle_fasteoi_irq provides a generic implementation + for interrupts, which only need an EOI at the end of + the handler + </para> + <para> + The following control flow is implemented (simplified excerpt): + <programlisting> +handle_IRQ_event(desc->action); +desc->chip->irq_eoi(); + </programlisting> + </para> + </sect3> <sect3 id="Default_Edge_IRQ_flow_handler"> <title>Default Edge IRQ flow handler</title> <para> @@ -294,20 +315,19 @@ desc->chip->end(); The following control flow is implemented (simplified excerpt): <programlisting> if (desc->status & running) { - desc->chip->hold(); + desc->chip->irq_mask(); desc->status |= pending | masked; return; } -desc->chip->start(); +desc->chip->irq_ack(); desc->status |= running; do { if (desc->status & masked) - desc->chip->enable(); + desc->chip->irq_unmask(); desc->status &= ~pending; handle_IRQ_event(desc->action); } while (status & pending); desc->status &= ~running; -desc->chip->end(); </programlisting> </para> </sect3> @@ -342,9 +362,9 @@ handle_IRQ_event(desc->action); <para> The following control flow is implemented (simplified excerpt): <programlisting> -desc->chip->start(); handle_IRQ_event(desc->action); -desc->chip->end(); +if (desc->chip->irq_eoi) + desc->chip->irq_eoi(); </programlisting> </para> </sect3> @@ -375,8 +395,7 @@ desc->chip->end(); mechanism. (It's necessary to enable CONFIG_HARDIRQS_SW_RESEND when you want to use the delayed interrupt disable feature and your hardware is not capable of retriggering an interrupt.) - The delayed interrupt disable can be runtime enabled, per interrupt, - by setting the IRQ_DELAYED_DISABLE flag in the irq_desc status field. + The delayed interrupt disable is not configurable. </para> </sect2> </sect1> @@ -387,13 +406,13 @@ desc->chip->end(); contains all the direct chip relevant functions, which can be utilized by the irq flow implementations. <itemizedlist> - <listitem><para>ack()</para></listitem> - <listitem><para>mask_ack() - Optional, recommended for performance</para></listitem> - <listitem><para>mask()</para></listitem> - <listitem><para>unmask()</para></listitem> - <listitem><para>retrigger() - Optional</para></listitem> - <listitem><para>set_type() - Optional</para></listitem> - <listitem><para>set_wake() - Optional</para></listitem> + <listitem><para>irq_ack()</para></listitem> + <listitem><para>irq_mask_ack() - Optional, recommended for performance</para></listitem> + <listitem><para>irq_mask()</para></listitem> + <listitem><para>irq_unmask()</para></listitem> + <listitem><para>irq_retrigger() - Optional</para></listitem> + <listitem><para>irq_set_type() - Optional</para></listitem> + <listitem><para>irq_set_wake() - Optional</para></listitem> </itemizedlist> These primitives are strictly intended to mean what they say: ack means ACK, masking means masking of an IRQ line, etc. It is up to the flow @@ -458,6 +477,7 @@ desc->chip->end(); <para> This chapter contains the autogenerated documentation of the internal functions. </para> +!Ikernel/irq/irqdesc.c !Ikernel/irq/handle.c !Ikernel/irq/chip.c </chapter> diff --git a/Documentation/DocBook/kernel-api.tmpl b/Documentation/DocBook/kernel-api.tmpl index 6899f471fb1..6b4e07f28b6 100644 --- a/Documentation/DocBook/kernel-api.tmpl +++ b/Documentation/DocBook/kernel-api.tmpl @@ -257,7 +257,8 @@ X!Earch/x86/kernel/mca_32.c !Iblock/blk-sysfs.c !Eblock/blk-settings.c !Eblock/blk-exec.c -!Eblock/blk-barrier.c +!Eblock/blk-flush.c +!Eblock/blk-lib.c !Eblock/blk-tag.c !Iblock/blk-tag.c !Eblock/blk-integrity.c diff --git a/Documentation/DocBook/kernel-locking.tmpl b/Documentation/DocBook/kernel-locking.tmpl index a0d479d1e1d..f66f4df1869 100644 --- a/Documentation/DocBook/kernel-locking.tmpl +++ b/Documentation/DocBook/kernel-locking.tmpl @@ -1645,7 +1645,9 @@ the amount of locking which needs to be done. all the readers who were traversing the list when we deleted the element are finished. We use <function>call_rcu()</function> to register a callback which will actually destroy the object once - the readers are finished. + all pre-existing readers are finished. Alternatively, + <function>synchronize_rcu()</function> may be used to block until + all pre-existing are finished. </para> <para> But how does Read Copy Update know when the readers are @@ -1714,7 +1716,7 @@ the amount of locking which needs to be done. - object_put(obj); + list_del_rcu(&obj->list); cache_num--; -+ call_rcu(&obj->rcu, cache_delete_rcu, obj); ++ call_rcu(&obj->rcu, cache_delete_rcu); } /* Must be holding cache_lock */ @@ -1725,14 +1727,6 @@ the amount of locking which needs to be done. if (++cache_num > MAX_CACHE_SIZE) { struct object *i, *outcast = NULL; list_for_each_entry(i, &cache, list) { -@@ -85,6 +94,7 @@ - obj->popularity = 0; - atomic_set(&obj->refcnt, 1); /* The cache holds a reference */ - spin_lock_init(&obj->lock); -+ INIT_RCU_HEAD(&obj->rcu); - - spin_lock_irqsave(&cache_lock, flags); - __cache_add(obj); @@ -104,12 +114,11 @@ struct object *cache_find(int id) { diff --git a/Documentation/RCU/checklist.txt b/Documentation/RCU/checklist.txt index 790d1a81237..0c134f8afc6 100644 --- a/Documentation/RCU/checklist.txt +++ b/Documentation/RCU/checklist.txt @@ -218,13 +218,22 @@ over a rather long period of time, but improvements are always welcome! include: a. Keeping a count of the number of data-structure elements - used by the RCU-protected data structure, including those - waiting for a grace period to elapse. Enforce a limit - on this number, stalling updates as needed to allow - previously deferred frees to complete. - - Alternatively, limit only the number awaiting deferred - free rather than the total number of elements. + used by the RCU-protected data structure, including + those waiting for a grace period to elapse. Enforce a + limit on this number, stalling updates as needed to allow + previously deferred frees to complete. Alternatively, + limit only the number awaiting deferred free rather than + the total number of elements. + + One way to stall the updates is to acquire the update-side + mutex. (Don't try this with a spinlock -- other CPUs + spinning on the lock could prevent the grace period + from ever ending.) Another way to stall the updates + is for the updates to use a wrapper function around + the memory allocator, so that this wrapper function + simulates OOM when there is too much memory awaiting an + RCU grace period. There are of course many other + variations on this theme. b. Limiting update rate. For example, if updates occur only once per hour, then no explicit rate limiting is required, @@ -365,3 +374,26 @@ over a rather long period of time, but improvements are always welcome! and the compiler to freely reorder code into and out of RCU read-side critical sections. It is the responsibility of the RCU update-side primitives to deal with this. + +17. Use CONFIG_PROVE_RCU, CONFIG_DEBUG_OBJECTS_RCU_HEAD, and + the __rcu sparse checks to validate your RCU code. These + can help find problems as follows: + + CONFIG_PROVE_RCU: check that accesses to RCU-protected data + structures are carried out under the proper RCU + read-side critical section, while holding the right + combination of locks, or whatever other conditions + are appropriate. + + CONFIG_DEBUG_OBJECTS_RCU_HEAD: check that you don't pass the + same object to call_rcu() (or friends) before an RCU + grace period has elapsed since the last time that you + passed that same object to call_rcu() (or friends). + + __rcu sparse checks: tag the pointer to the RCU-protected data + structure with __rcu, and sparse will warn you if you + access that pointer without the services of one of the + variants of rcu_dereference(). + + These debugging aids can help you find problems that are + otherwise extremely difficult to spot. diff --git a/Documentation/RCU/stallwarn.txt b/Documentation/RCU/stallwarn.txt index 44c6dcc93d6..862c08ef1fd 100644 --- a/Documentation/RCU/stallwarn.txt +++ b/Documentation/RCU/stallwarn.txt @@ -80,6 +80,24 @@ o A CPU looping with bottom halves disabled. This condition can o For !CONFIG_PREEMPT kernels, a CPU looping anywhere in the kernel without invoking schedule(). +o A CPU-bound real-time task in a CONFIG_PREEMPT kernel, which might + happen to preempt a low-priority task in the middle of an RCU + read-side critical section. This is especially damaging if + that low-priority task is not permitted to run on any other CPU, + in which case the next RCU grace period can never complete, which + will eventually cause the system to run out of memory and hang. + While the system is in the process of running itself out of + memory, you might see stall-warning messages. + +o A CPU-bound real-time task in a CONFIG_PREEMPT_RT kernel that + is running at a higher priority than the RCU softirq threads. + This will prevent RCU callbacks from ever being invoked, + and in a CONFIG_TREE_PREEMPT_RCU kernel will further prevent + RCU grace periods from ever completing. Either way, the + system will eventually run out of memory and hang. In the + CONFIG_TREE_PREEMPT_RCU case, you might see stall-warning + messages. + o A bug in the RCU implementation. o A hardware failure. This is quite unlikely, but has occurred diff --git a/Documentation/RCU/trace.txt b/Documentation/RCU/trace.txt index efd8cc95c06..a851118775d 100644 --- a/Documentation/RCU/trace.txt +++ b/Documentation/RCU/trace.txt @@ -125,6 +125,17 @@ o "b" is the batch limit for this CPU. If more than this number of RCU callbacks is ready to invoke, then the remainder will be deferred. +o "ci" is the number of RCU callbacks that have been invoked for + this CPU. Note that ci+ql is the number of callbacks that have + been registered in absence of CPU-hotplug activity. + +o "co" is the number of RCU callbacks that have been orphaned due to + this CPU going offline. + +o "ca" is the number of RCU callbacks that have been adopted due to + other CPUs going offline. Note that ci+co-ca+ql is the number of + RCU callbacks registered on this CPU. + There is also an rcu/rcudata.csv file with the same information in comma-separated-variable spreadsheet format. @@ -180,7 +191,7 @@ o "s" is the "signaled" state that drives force_quiescent_state()'s o "jfq" is the number of jiffies remaining for this grace period before force_quiescent_state() is invoked to help push things - along. Note that CPUs in dyntick-idle mode thoughout the grace + along. Note that CPUs in dyntick-idle mode throughout the grace period will not report on their own, but rather must be check by some other CPU via force_quiescent_state(). diff --git a/Documentation/arm/00-INDEX b/Documentation/arm/00-INDEX index 7f5fc3ba9c9..ecf7d04bca2 100644 --- a/Documentation/arm/00-INDEX +++ b/Documentation/arm/00-INDEX @@ -6,6 +6,8 @@ Interrupts - ARM Interrupt subsystem documentation IXP2000 - Release Notes for Linux on Intel's IXP2000 Network Processor +msm + - MSM specific documentation Netwinder - Netwinder specific documentation Porting diff --git a/Documentation/arm/msm/gpiomux.txt b/Documentation/arm/msm/gpiomux.txt new file mode 100644 index 00000000000..67a81620adf --- /dev/null +++ b/Documentation/arm/msm/gpiomux.txt @@ -0,0 +1,176 @@ +This document provides an overview of the msm_gpiomux interface, which +is |