diff options
author | Ingo Molnar <mingo@elte.hu> | 2009-08-15 18:55:58 +0200 |
---|---|---|
committer | Ingo Molnar <mingo@elte.hu> | 2009-08-15 18:56:13 +0200 |
commit | fa08661af834875c9bd6f7f0b1b9388dc72a6585 (patch) | |
tree | c381fcfcfeb38515bfa93445c80ad9231343414d /Documentation | |
parent | 240ebbf81f149b11a31e060ebe5ee51a3c775360 (diff) | |
parent | 64f1607ffbbc772685733ea63e6f7f4183df1b16 (diff) |
Merge commit 'v2.6.31-rc6' into core/rcu
Merge reason: the branch was on pre-rc1 .30, update to latest.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Diffstat (limited to 'Documentation')
45 files changed, 2265 insertions, 1735 deletions
diff --git a/Documentation/ABI/testing/sysfs-block b/Documentation/ABI/testing/sysfs-block index cbbd3e06994..5f3bedaf8e3 100644 --- a/Documentation/ABI/testing/sysfs-block +++ b/Documentation/ABI/testing/sysfs-block @@ -94,28 +94,37 @@ What: /sys/block/<disk>/queue/physical_block_size Date: May 2009 Contact: Martin K. Petersen <martin.petersen@oracle.com> Description: - This is the smallest unit the storage device can write - without resorting to read-modify-write operation. It is - usually the same as the logical block size but may be - bigger. One example is SATA drives with 4KB sectors - that expose a 512-byte logical block size to the - operating system. + This is the smallest unit a physical storage device can + write atomically. It is usually the same as the logical + block size but may be bigger. One example is SATA + drives with 4KB sectors that expose a 512-byte logical + block size to the operating system. For stacked block + devices the physical_block_size variable contains the + maximum physical_block_size of the component devices. What: /sys/block/<disk>/queue/minimum_io_size Date: April 2009 Contact: Martin K. Petersen <martin.petersen@oracle.com> Description: - Storage devices may report a preferred minimum I/O size, - which is the smallest request the device can perform - without incurring a read-modify-write penalty. For disk - drives this is often the physical block size. For RAID - arrays it is often the stripe chunk size. + Storage devices may report a granularity or preferred + minimum I/O size which is the smallest request the + device can perform without incurring a performance + penalty. For disk drives this is often the physical + block size. For RAID arrays it is often the stripe + chunk size. A properly aligned multiple of + minimum_io_size is the preferred request size for + workloads where a high number of I/O operations is + desired. What: /sys/block/<disk>/queue/optimal_io_size Date: April 2009 Contact: Martin K. Petersen <martin.petersen@oracle.com> Description: Storage devices may report an optimal I/O size, which is - the device's preferred unit of receiving I/O. This is - rarely reported for disk drives. For RAID devices it is - usually the stripe width or the internal block size. + the device's preferred unit for sustained I/O. This is + rarely reported for disk drives. For RAID arrays it is + usually the stripe width or the internal track size. A + properly aligned multiple of optimal_io_size is the + preferred request size for workloads where sustained + throughput is desired. If no optimal I/O size is + reported this file contains 0. diff --git a/Documentation/DocBook/kernel-hacking.tmpl b/Documentation/DocBook/kernel-hacking.tmpl index a50d6cd5857..992e67e6be7 100644 --- a/Documentation/DocBook/kernel-hacking.tmpl +++ b/Documentation/DocBook/kernel-hacking.tmpl @@ -449,8 +449,8 @@ printk(KERN_INFO "i = %u\n", i); </para> <programlisting> -__u32 ipaddress; -printk(KERN_INFO "my ip: %d.%d.%d.%d\n", NIPQUAD(ipaddress)); +__be32 ipaddress; +printk(KERN_INFO "my ip: %pI4\n", &ipaddress); </programlisting> <para> diff --git a/Documentation/DocBook/mac80211.tmpl b/Documentation/DocBook/mac80211.tmpl index e3698666357..f3f37f141db 100644 --- a/Documentation/DocBook/mac80211.tmpl +++ b/Documentation/DocBook/mac80211.tmpl @@ -184,8 +184,6 @@ usage should require reading the full document. !Finclude/net/mac80211.h ieee80211_ctstoself_get !Finclude/net/mac80211.h ieee80211_ctstoself_duration !Finclude/net/mac80211.h ieee80211_generic_frame_duration -!Finclude/net/mac80211.h ieee80211_get_hdrlen_from_skb -!Finclude/net/mac80211.h ieee80211_hdrlen !Finclude/net/mac80211.h ieee80211_wake_queue !Finclude/net/mac80211.h ieee80211_stop_queue !Finclude/net/mac80211.h ieee80211_wake_queues diff --git a/Documentation/RCU/rculist_nulls.txt b/Documentation/RCU/rculist_nulls.txt index 93cb28d05dc..18f9651ff23 100644 --- a/Documentation/RCU/rculist_nulls.txt +++ b/Documentation/RCU/rculist_nulls.txt @@ -83,11 +83,12 @@ not detect it missed following items in original chain. obj = kmem_cache_alloc(...); lock_chain(); // typically a spin_lock() obj->key = key; -atomic_inc(&obj->refcnt); /* * we need to make sure obj->key is updated before obj->next + * or obj->refcnt */ smp_wmb(); +atomic_set(&obj->refcnt, 1); hlist_add_head_rcu(&obj->obj_node, list); unlock_chain(); // typically a spin_unlock() @@ -159,6 +160,10 @@ out: obj = kmem_cache_alloc(cachep); lock_chain(); // typically a spin_lock() obj->key = key; +/* + * changes to obj->key must be visible before refcnt one + */ +smp_wmb(); atomic_set(&obj->refcnt, 1); /* * insert obj in RCU way (readers might be traversing chain) diff --git a/Documentation/arm/memory.txt b/Documentation/arm/memory.txt index 43cb1004d35..9d58c7c5edd 100644 --- a/Documentation/arm/memory.txt +++ b/Documentation/arm/memory.txt @@ -21,6 +21,8 @@ ffff8000 ffffffff copy_user_page / clear_user_page use. For SA11xx and Xscale, this is used to setup a minicache mapping. +ffff4000 ffffffff cache aliasing on ARMv6 and later CPUs. + ffff1000 ffff7fff Reserved. Platforms must not use this address range. diff --git a/Documentation/block/data-integrity.txt b/Documentation/block/data-integrity.txt index e8ca040ba2c..2d735b0ae38 100644 --- a/Documentation/block/data-integrity.txt +++ b/Documentation/block/data-integrity.txt @@ -50,7 +50,7 @@ encouraged them to allow separation of the data and integrity metadata scatter-gather lists. The controller will interleave the buffers on write and split them on -read. This means that the Linux can DMA the data buffers to and from +read. This means that Linux can DMA the data buffers to and from host memory without changes to the page cache. Also, the 16-bit CRC checksum mandated by both the SCSI and SATA specs @@ -66,7 +66,7 @@ software RAID5). The IP checksum is weaker than the CRC in terms of detecting bit errors. However, the strength is really in the separation of the data -buffers and the integrity metadata. These two distinct buffers much +buffers and the integrity metadata. These two distinct buffers must match up for an I/O to complete. The separation of the data and integrity metadata buffers as well as diff --git a/Documentation/cgroups/cpusets.txt b/Documentation/cgroups/cpusets.txt index f9ca389dddf..1d7e9784439 100644 --- a/Documentation/cgroups/cpusets.txt +++ b/Documentation/cgroups/cpusets.txt @@ -777,6 +777,18 @@ in cpuset directories: # /bin/echo 1-4 > cpus -> set cpus list to cpus 1,2,3,4 # /bin/echo 1,2,3,4 > cpus -> set cpus list to cpus 1,2,3,4 +To add a CPU to a cpuset, write the new list of CPUs including the +CPU to be added. To add 6 to the above cpuset: + +# /bin/echo 1-4,6 > cpus -> set cpus list to cpus 1,2,3,4,6 + +Similarly to remove a CPU from a cpuset, write the new list of CPUs +without the CPU to be removed. + +To remove all the CPUs: + +# /bin/echo "" > cpus -> clear cpus list + 2.3 Setting flags ----------------- diff --git a/Documentation/connector/cn_test.c b/Documentation/connector/cn_test.c index f688eba8770..6a5be5d5c8e 100644 --- a/Documentation/connector/cn_test.c +++ b/Documentation/connector/cn_test.c @@ -1,7 +1,7 @@ /* * cn_test.c * - * 2004-2005 Copyright (c) Evgeniy Polyakov <johnpol@2ka.mipt.ru> + * 2004+ Copyright (c) Evgeniy Polyakov <zbr@ioremap.net> * All rights reserved. * * This program is free software; you can redistribute it and/or modify @@ -194,5 +194,5 @@ module_init(cn_test_init); module_exit(cn_test_fini); MODULE_LICENSE("GPL"); -MODULE_AUTHOR("Evgeniy Polyakov <johnpol@2ka.mipt.ru>"); +MODULE_AUTHOR("Evgeniy Polyakov <zbr@ioremap.net>"); MODULE_DESCRIPTION("Connector's test module"); diff --git a/Documentation/connector/ucon.c b/Documentation/connector/ucon.c index d738cde2a8d..c5092ad0ce4 100644 --- a/Documentation/connector/ucon.c +++ b/Documentation/connector/ucon.c @@ -1,7 +1,7 @@ /* * ucon.c * - * Copyright (c) 2004+ Evgeniy Polyakov <johnpol@2ka.mipt.ru> + * Copyright (c) 2004+ Evgeniy Polyakov <zbr@ioremap.net> * * * This program is free software; you can redistribute it and/or modify diff --git a/Documentation/device-mapper/dm-log.txt b/Documentation/device-mapper/dm-log.txt new file mode 100644 index 00000000000..994dd75475a --- /dev/null +++ b/Documentation/device-mapper/dm-log.txt @@ -0,0 +1,54 @@ +Device-Mapper Logging +===================== +The device-mapper logging code is used by some of the device-mapper +RAID targets to track regions of the disk that are not consistent. +A region (or portion of the address space) of the disk may be +inconsistent because a RAID stripe is currently being operated on or +a machine died while the region was being altered. In the case of +mirrors, a region would be considered dirty/inconsistent while you +are writing to it because the writes need to be replicated for all +the legs of the mirror and may not reach the legs at the same time. +Once all writes are complete, the region is considered clean again. + +There is a generic logging interface that the device-mapper RAID +implementations use to perform logging operations (see +dm_dirty_log_type in include/linux/dm-dirty-log.h). Various different +logging implementations are available and provide different +capabilities. The list includes: + +Type Files +==== ===== +disk drivers/md/dm-log.c +core drivers/md/dm-log.c +userspace drivers/md/dm-log-userspace* include/linux/dm-log-userspace.h + +The "disk" log type +------------------- +This log implementation commits the log state to disk. This way, the +logging state survives reboots/crashes. + +The "core" log type +------------------- +This log implementation keeps the log state in memory. The log state +will not survive a reboot or crash, but there may be a small boost in +performance. This method can also be used if no storage device is +available for storing log state. + +The "userspace" log type +------------------------ +This log type simply provides a way to export the log API to userspace, +so log implementations can be done there. This is done by forwarding most +logging requests to userspace, where a daemon receives and processes the +request. + +The structure used for communication between kernel and userspace are +located in include/linux/dm-log-userspace.h. Due to the frequency, +diversity, and 2-way communication nature of the exchanges between +kernel and userspace, 'connector' is used as the interface for +communication. + +There are currently two userspace log implementations that leverage this +framework - "clustered_disk" and "clustered_core". These implementations +provide a cluster-coherent log for shared-storage. Device-mapper mirroring +can be used in a shared-storage environment when the cluster log implementations +are employed. diff --git a/Documentation/device-mapper/dm-queue-length.txt b/Documentation/device-mapper/dm-queue-length.txt new file mode 100644 index 00000000000..f4db2562175 --- /dev/null +++ b/Documentation/device-mapper/dm-queue-length.txt @@ -0,0 +1,39 @@ +dm-queue-length +=============== + +dm-queue-length is a path selector module for device-mapper targets, +which selects a path with the least number of in-flight I/Os. +The path selector name is 'queue-length'. + +Table parameters for each path: [<repeat_count>] + <repeat_count>: The number of I/Os to dispatch using the selected + path before switching to the next path. + If not given, internal default is used. To check + the default value, see the activated table. + +Status for each path: <status> <fail-count> <in-flight> + <status>: 'A' if the path is active, 'F' if the path is failed. + <fail-count>: The number of path failures. + <in-flight>: The number of in-flight I/Os on the path. + + +Algorithm +========= + +dm-queue-length increments/decrements 'in-flight' when an I/O is +dispatched/completed respectively. +dm-queue-length selects a path with the minimum 'in-flight'. + + +Examples +======== +In case that 2 paths (sda and sdb) are used with repeat_count == 128. + +# echo "0 10 multipath 0 0 1 1 queue-length 0 2 1 8:0 128 8:16 128" \ + dmsetup create test +# +# dmsetup table +test: 0 10 multipath 0 0 1 1 queue-length 0 2 1 8:0 128 8:16 128 +# +# dmsetup status +test: 0 10 multipath 2 0 0 0 1 1 E 0 2 1 8:0 A 0 0 8:16 A 0 0 diff --git a/Documentation/device-mapper/dm-service-time.txt b/Documentation/device-mapper/dm-service-time.txt new file mode 100644 index 00000000000..7d00668e97b --- /dev/null +++ b/Documentation/device-mapper/dm-service-time.txt @@ -0,0 +1,91 @@ +dm-service-time +=============== + +dm-service-time is a path selector module for device-mapper targets, +which selects a path with the shortest estimated service time for +the incoming I/O. + +The service time for each path is estimated by dividing the total size +of in-flight I/Os on a path with the performance value of the path. +The performance value is a relative throughput value among all paths +in a path-group, and it can be specified as a table argument. + +The path selector name is 'service-time'. + +Table parameters for each path: [<repeat_count> [<relative_throughput>]] + <repeat_count>: The number of I/Os to dispatch using the selected + path before switching to the next path. + If not given, internal default is used. To check + the default value, see the activated table. + <relative_throughput>: The relative throughput value of the path + among all paths in the path-group. + The valid range is 0-100. + If not given, minimum value '1' is used. + If '0' is given, the path isn't selected while + other paths having a positive value are available. + +Status for each path: <status> <fail-count> <in-flight-size> \ + <relative_throughput> + <status>: 'A' if the path is active, 'F' if the path is failed. + <fail-count>: The number of path failures. + <in-flight-size>: The size of in-flight I/Os on the path. + <relative_throughput>: The relative throughput value of the path + among all paths in the path-group. + + +Algorithm +========= + +dm-service-time adds the I/O size to 'in-flight-size' when the I/O is +dispatched and substracts when completed. +Basically, dm-service-time selects a path having minimum service time +which is calculated by: + + ('in-flight-size' + 'size-of-incoming-io') / 'relative_throughput' + +However, some optimizations below are used to reduce the calculation +as much as possible. + + 1. If the paths have the same 'relative_throughput', skip + the division and just compare the 'in-flight-size'. + + 2. If the paths have the same 'in-flight-size', skip the division + and just compare the 'relative_throughput'. + + 3. If some paths have non-zero 'relative_throughput' and others + have zero 'relative_throughput', ignore those paths with zero + 'relative_throughput'. + +If such optimizations can't be applied, calculate service time, and +compare service time. +If calculated service time is equal, the path having maximum +'relative_throughput' may be better. So compare 'relative_throughput' +then. + + +Examples +======== +In case that 2 paths (sda and sdb) are used with repeat_count == 128 +and sda has an average throughput 1GB/s and sdb has 4GB/s, +'relative_throughput' value may be '1' for sda and '4' for sdb. + +# echo "0 10 multipath 0 0 1 1 service-time 0 2 2 8:0 128 1 8:16 128 4" \ + dmsetup create test +# +# dmsetup table +test: 0 10 multipath 0 0 1 1 service-time 0 2 2 8:0 128 1 8:16 128 4 +# +# dmsetup status +test: 0 10 multipath 2 0 0 0 1 1 E 0 2 2 8:0 A 0 0 1 8:16 A 0 0 4 + + +Or '2' for sda and '8' for sdb would be also true. + +# echo "0 10 multipath 0 0 1 1 service-time 0 2 2 8:0 128 2 8:16 128 8" \ + dmsetup create test +# +# dmsetup table +test: 0 10 multipath 0 0 1 1 service-time 0 2 2 8:0 128 2 8:16 128 8 +# +# dmsetup status +test: 0 10 multipath 2 0 0 0 1 1 E 0 2 2 8:0 A 0 0 2 8:16 A 0 0 8 diff --git a/Documentation/driver-model/driver.txt b/Documentation/driver-model/driver.txt index 82132169d47..60120fb3b96 100644 --- a/Documentation/driver-model/driver.txt +++ b/Documentation/driver-model/driver.txt @@ -207,8 +207,8 @@ Attributes ~~~~~~~~~~ struct driver_attribute { struct attribute attr; - ssize_t (*show)(struct device_driver *, char * buf, size_t count, loff_t off); - ssize_t (*store)(struct device_driver *, const char * buf, size_t count, loff_t off); + ssize_t (*show)(struct device_driver *driver, char *buf); + ssize_t (*store)(struct device_driver *, const char * buf, size_t count); }; Device drivers can export attributes via their sysfs directories. diff --git a/Documentation/dvb/get_dvb_firmware b/Documentation/dvb/get_dvb_firmware index a52adfc9a57..3d1b0ab70c8 100644 --- a/Documentation/dvb/get_dvb_firmware +++ b/Documentation/dvb/get_dvb_firmware @@ -25,7 +25,7 @@ use IO::Handle; "tda10046lifeview", "av7110", "dec2000t", "dec2540t", "dec3000s", "vp7041", "dibusb", "nxt2002", "nxt2004", "or51211", "or51132_qam", "or51132_vsb", "bluebird", - "opera1", "cx231xx", "cx18", "cx23885", "pvrusb2" ); + "opera1", "cx231xx", "cx18", "cx23885", "pvrusb2", "mpc718" ); # Check args syntax() if (scalar(@ARGV) != 1); @@ -381,6 +381,57 @@ sub cx18 { $allfiles; } +sub mpc718 { + my $archive = 'Yuan MPC718 TV Tuner Card 2.13.10.1016.zip'; + my $url = "ftp://ftp.work.acer-euro.com/desktop/aspire_idea510/vista/Drivers/$archive"; + my $fwfile = "dvb-cx18-mpc718-mt352.fw"; + my $tmpdir = tempdir(DIR => "/tmp", CLEANUP => 1); + + checkstandard(); + wgetfile($archive, $url); + unzip($archive, $tmpdir); + + my $sourcefile = "$tmpdir/Yuan MPC718 TV Tuner Card 2.13.10.1016/mpc718_32bit/yuanrap.sys"; + my $found = 0; + + open IN, '<', $sourcefile or die "Couldn't open $sourcefile to extract $fwfile data\n"; + binmode IN; + open OUT, '>', $fwfile; + binmode OUT; + { + # Block scope because we change the line terminator variable $/ + my $prevlen = 0; + my $currlen; + + # Buried in the data segment are 3 runs of almost identical + # register-value pairs that end in 0x5d 0x01 which is a "TUNER GO" + # command for the MT352. + # Pull out the middle run (because it's easy) of register-value + # pairs to make the "firmware" file. + + local $/ = "\x5d\x01"; # MT352 "TUNER GO" + + while (<IN>) { + $currlen = length($_); + if ($prevlen == $currlen && $currlen <= 64) { + chop; chop; # Get rid of "TUNER GO" + s/^\0\0//; # get rid of leading 00 00 if it's there + printf OUT "$_"; + $found = 1; + last; + } + $prevlen = $currlen; + } + } + close OUT; + close IN; + if (!$found) { + unlink $fwfile; + die "Couldn't find valid register-value sequence in $sourcefile for $fwfile\n"; + } + $fwfile; +} + sub cx23885 { my $url = "http://linuxtv.org/downloads/firmware/"; diff --git a/Documentation/feature-removal-schedule.txt b/Documentation/feature-removal-schedule.txt index f8cd450be9a..09e031c5588 100644 --- a/Documentation/feature-removal-schedule.txt +++ b/Documentation/feature-removal-schedule.txt @@ -458,3 +458,13 @@ Why: Remove the old legacy 32bit machine check code. This has been but the old version has been kept around for easier testing. Note this doesn't impact the old P5 and WinChip machine check handlers. Who: Andi Kleen <andi@firstfloor.org> + +---------------------------- + +What: lock_policy_rwsem_* and unlock_policy_rwsem_* will not be + exported interface anymore. +When: 2.6.33 +Why: cpu_policy_rwsem has a new cleaner definition making it local to + cpufreq core and contained inside cpufreq.c. Other dependent + drivers should not use it in order to safely avoid lockdep issues. +Who: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> diff --git a/Documentation/filesystems/Locking b/Documentation/filesystems/Locking index 229d7b7c50a..18b9d0ca063 100644 --- a/Documentation/filesystems/Locking +++ b/Documentation/filesystems/Locking @@ -109,27 +109,28 @@ prototypes: locking rules: All may block. - BKL s_lock s_umount -alloc_inode: no no no -destroy_inode: no -dirty_inode: no (must not sleep) -write_inode: no -drop_inode: no !!!inode_lock!!! -delete_inode: no -put_super: yes yes no -write_super: no yes read -sync_fs: no no read -freeze_fs: ? -unfreeze_fs: ? -statfs: no no no -remount_fs: yes yes maybe (see below) -clear_inode: no -umount_begin: yes no no -show_options: no (vfsmount->sem) -quota_read: no no no (see below) -quota_write: no no no (see below) - -->remount_fs() will have the s_umount lock if it's already mounted. + None have BKL + s_umount +alloc_inode: +destroy_inode: +dirty_inode: (must not sleep) +write_inode: +drop_inode: !!!inode_lock!!! +delete_inode: +put_super: write +write_super: read +sync_fs: read +freeze_fs: read +unfreeze_fs: read +statfs: no +remount_fs: maybe (see below) +clear_inode: +umount_begin: no +show_options: no (namespace_sem) +quota_read: no (see below) +quota_write: no (see below) + +->remount_fs() will have the s_umount exclusive lock if it's already mounted. When called from get_sb_single, it does NOT have the s_umount lock. ->quota_read() and ->quota_write() functions are both guaranteed to be the only ones operating on the quota file by the quota code (via diff --git a/Documentation/filesystems/sysfs.txt b/Documentation/filesystems/sysfs.txt index 7e81e37c0b1..b245d524d56 100644 --- a/Documentation/filesystems/sysfs.txt +++ b/Documentation/filesystems/sysfs.txt @@ -23,7 +23,8 @@ interface. Using sysfs ~~~~~~~~~~~ -sysfs is always compiled in. You can access it by doing: +sysfs is always compiled in if CONFIG_SYSFS is defined. You can access +it by doing: mount -t sysfs sysfs /sys diff --git a/Documentation/gcov.txt b/Documentation/gcov.txt index e716aadb3a3..40ec6335276 100644 --- a/Documentation/gcov.txt +++ b/Documentation/gcov.txt @@ -188,13 +188,18 @@ Solution: Exclude affected source files from profiling by specifying GCOV_PROFILE := n or GCOV_PROFILE_basename.o := n in the corresponding Makefile. +Problem: Files copied from sysfs appear empty or incomplete. +Cause: Due to the way seq_file works, some tools such as cp or tar + may not correctly copy files from sysfs. +Solution: Use 'cat' to r |