diff options
author | James Bottomley <jejb@mulgrave.il.steeleye.com> | 2006-06-10 13:47:26 -0500 |
---|---|---|
committer | James Bottomley <jejb@mulgrave.il.steeleye.com> | 2006-06-10 13:47:26 -0500 |
commit | f0cd91a68acdc9b49d7f6738b514a426da627649 (patch) | |
tree | 8ad73564015794197583b094217ae0a71e71e753 /Documentation | |
parent | 60eef25701d25e99c991dd0f4a9f3832a0c3ad3e (diff) | |
parent | 128e6ced247cda88f96fa9f2e4ba8b2c4a681560 (diff) |
Merge ../linux-2.6
Diffstat (limited to 'Documentation')
25 files changed, 834 insertions, 108 deletions
diff --git a/Documentation/DMA-API.txt b/Documentation/DMA-API.txt index 1af0f2d5022..2ffb0d62f0f 100644 --- a/Documentation/DMA-API.txt +++ b/Documentation/DMA-API.txt @@ -33,7 +33,9 @@ pci_alloc_consistent(struct pci_dev *dev, size_t size, Consistent memory is memory for which a write by either the device or the processor can immediately be read by the processor or device -without having to worry about caching effects. +without having to worry about caching effects. (You may however need +to make sure to flush the processor's write buffers before telling +devices to read that memory.) This routine allocates a region of <size> bytes of consistent memory. it also returns a <dma_handle> which may be cast to an unsigned @@ -304,12 +306,12 @@ dma address with dma_mapping_error(). A non zero return value means the mapping could not be created and the driver should take appropriate action (eg reduce current DMA mapping usage or delay and try again later). -int -dma_map_sg(struct device *dev, struct scatterlist *sg, int nents, - enum dma_data_direction direction) -int -pci_map_sg(struct pci_dev *hwdev, struct scatterlist *sg, - int nents, int direction) + int + dma_map_sg(struct device *dev, struct scatterlist *sg, + int nents, enum dma_data_direction direction) + int + pci_map_sg(struct pci_dev *hwdev, struct scatterlist *sg, + int nents, int direction) Maps a scatter gather list from the block layer. @@ -327,12 +329,33 @@ critical that the driver do something, in the case of a block driver aborting the request or even oopsing is better than doing nothing and corrupting the filesystem. -void -dma_unmap_sg(struct device *dev, struct scatterlist *sg, int nhwentries, - enum dma_data_direction direction) -void -pci_unmap_sg(struct pci_dev *hwdev, struct scatterlist *sg, - int nents, int direction) +With scatterlists, you use the resulting mapping like this: + + int i, count = dma_map_sg(dev, sglist, nents, direction); + struct scatterlist *sg; + + for (i = 0, sg = sglist; i < count; i++, sg++) { + hw_address[i] = sg_dma_address(sg); + hw_len[i] = sg_dma_len(sg); + } + +where nents is the number of entries in the sglist. + +The implementation is free to merge several consecutive sglist entries +into one (e.g. with an IOMMU, or if several pages just happen to be +physically contiguous) and returns the actual number of sg entries it +mapped them to. On failure 0, is returned. + +Then you should loop count times (note: this can be less than nents times) +and use sg_dma_address() and sg_dma_len() macros where you previously +accessed sg->address and sg->length as shown above. + + void + dma_unmap_sg(struct device *dev, struct scatterlist *sg, + int nhwentries, enum dma_data_direction direction) + void + pci_unmap_sg(struct pci_dev *hwdev, struct scatterlist *sg, + int nents, int direction) unmap the previously mapped scatter/gather list. All the parameters must be the same as those and passed in to the scatter/gather mapping diff --git a/Documentation/DMA-mapping.txt b/Documentation/DMA-mapping.txt index 10bf4deb96a..7c717699032 100644 --- a/Documentation/DMA-mapping.txt +++ b/Documentation/DMA-mapping.txt @@ -58,11 +58,15 @@ translating each of those pages back to a kernel address using something like __va(). [ EDIT: Update this when we integrate Gerd Knorr's generic code which does this. ] -This rule also means that you may not use kernel image addresses -(ie. items in the kernel's data/text/bss segment, or your driver's) -nor may you use kernel stack addresses for DMA. Both of these items -might be mapped somewhere entirely different than the rest of physical -memory. +This rule also means that you may use neither kernel image addresses +(items in data/text/bss segments), nor module image addresses, nor +stack addresses for DMA. These could all be mapped somewhere entirely +different than the rest of physical memory. Even if those classes of +memory could physically work with DMA, you'd need to ensure the I/O +buffers were cacheline-aligned. Without that, you'd see cacheline +sharing problems (data corruption) on CPUs with DMA-incoherent caches. +(The CPU could write to one word, DMA would write to a different one +in the same cache line, and one of them could be overwritten.) Also, this means that you cannot take the return of a kmap() call and DMA to/from that. This is similar to vmalloc(). @@ -284,6 +288,11 @@ There are two types of DMA mappings: in order to get correct behavior on all platforms. + Also, on some platforms your driver may need to flush CPU write + buffers in much the same way as it needs to flush write buffers + found in PCI bridges (such as by reading a register's value + after writing it). + - Streaming DMA mappings which are usually mapped for one DMA transfer, unmapped right after it (unless you use pci_dma_sync_* below) and for which hardware can optimize for sequential accesses. @@ -303,6 +312,9 @@ There are two types of DMA mappings: Neither type of DMA mapping has alignment restrictions that come from PCI, although some devices may have such restrictions. +Also, systems with caches that aren't DMA-coherent will work better +when the underlying buffers don't share cache lines with other data. + Using Consistent DMA mappings. diff --git a/Documentation/HOWTO b/Documentation/HOWTO index 6c9e746267d..915ae8c986c 100644 --- a/Documentation/HOWTO +++ b/Documentation/HOWTO @@ -603,7 +603,8 @@ start exactly where you are now. ---------- -Thanks to Paolo Ciarrocchi who allowed the "Development Process" section +Thanks to Paolo Ciarrocchi who allowed the "Development Process" +(http://linux.tar.bz/articles/2.6-development_process) section to be based on text he had written, and to Randy Dunlap and Gerrit Huizenga for some of the list of things you should and should not say. Also thanks to Pat Mochel, Hanna Linder, Randy Dunlap, Kay Sievers, diff --git a/Documentation/block/switching-sched.txt b/Documentation/block/switching-sched.txt new file mode 100644 index 00000000000..5fa130a6753 --- /dev/null +++ b/Documentation/block/switching-sched.txt @@ -0,0 +1,22 @@ +As of the Linux 2.6.10 kernel, it is now possible to change the +IO scheduler for a given block device on the fly (thus making it possible, +for instance, to set the CFQ scheduler for the system default, but +set a specific device to use the anticipatory or noop schedulers - which +can improve that device's throughput). + +To set a specific scheduler, simply do this: + +echo SCHEDNAME > /sys/block/DEV/queue/scheduler + +where SCHEDNAME is the name of a defined IO scheduler, and DEV is the +device name (hda, hdb, sga, or whatever you happen to have). + +The list of defined schedulers can be found by simply doing +a "cat /sys/block/DEV/queue/scheduler" - the list of valid names +will be displayed, with the currently selected scheduler in brackets: + +# cat /sys/block/hda/queue/scheduler +noop anticipatory deadline [cfq] +# echo anticipatory > /sys/block/hda/queue/scheduler +# cat /sys/block/hda/queue/scheduler +noop [anticipatory] deadline cfq diff --git a/Documentation/cpu-freq/index.txt b/Documentation/cpu-freq/index.txt index 5009805f937..ffdb5323df3 100644 --- a/Documentation/cpu-freq/index.txt +++ b/Documentation/cpu-freq/index.txt @@ -53,4 +53,4 @@ the CPUFreq Mailing list: * http://lists.linux.org.uk/mailman/listinfo/cpufreq Clock and voltage scaling for the SA-1100: -* http://www.lart.tudelft.nl/projects/scaling +* http://www.lartmaker.nl/projects/scaling diff --git a/Documentation/devices.txt b/Documentation/devices.txt index 3c406acd4df..b369a8c46a7 100644 --- a/Documentation/devices.txt +++ b/Documentation/devices.txt @@ -1721,11 +1721,6 @@ Your cooperation is appreciated. These devices support the same API as the generic SCSI devices. - 97 block Packet writing for CD/DVD devices - 0 = /dev/pktcdvd0 First packet-writing module - 1 = /dev/pktcdvd1 Second packet-writing module - ... - 98 char Control and Measurement Device (comedi) 0 = /dev/comedi0 First comedi device 1 = /dev/comedi1 Second comedi device diff --git a/Documentation/dvb/get_dvb_firmware b/Documentation/dvb/get_dvb_firmware index 15fc8fbef67..4820366b6ae 100644 --- a/Documentation/dvb/get_dvb_firmware +++ b/Documentation/dvb/get_dvb_firmware @@ -259,9 +259,9 @@ sub dibusb { } sub nxt2002 { - my $sourcefile = "Broadband4PC_4_2_11.zip"; + my $sourcefile = "Technisat_DVB-PC_4_4_COMPACT.zip"; my $url = "http://www.bbti.us/download/windows/$sourcefile"; - my $hash = "c6d2ea47a8f456d887ada0cfb718ff2a"; + my $hash = "476befae8c7c1bb9648954060b1eec1f"; my $outfile = "dvb-fe-nxt2002.fw"; my $tmpdir = tempdir(DIR => "/tmp", CLEANUP => 1); @@ -269,8 +269,8 @@ sub nxt2002 { wgetfile($sourcefile, $url); unzip($sourcefile, $tmpdir); - verify("$tmpdir/SkyNETU.sys", $hash); - extract("$tmpdir/SkyNETU.sys", 375832, 5908, $outfile); + verify("$tmpdir/SkyNET.sys", $hash); + extract("$tmpdir/SkyNET.sys", 331624, 5908, $outfile); $outfile; } diff --git a/Documentation/feature-removal-schedule.txt b/Documentation/feature-removal-schedule.txt index 293fed113df..43ab119963d 100644 --- a/Documentation/feature-removal-schedule.txt +++ b/Documentation/feature-removal-schedule.txt @@ -25,8 +25,9 @@ Who: Adrian Bunk <bunk@stusta.de> --------------------------- -What: drivers depending on OBSOLETE_OSS_DRIVER -When: January 2006 +What: drivers that were depending on OBSOLETE_OSS_DRIVER + (config options already removed) +When: before 2.6.19 Why: OSS drivers with ALSA replacements Who: Adrian Bunk <bunk@stusta.de> @@ -56,6 +57,15 @@ Who: Jody McIntyre <scjody@steamballoon.com> --------------------------- +What: sbp2: module parameter "force_inquiry_hack" +When: July 2006 +Why: Superceded by parameter "workarounds". Both parameters are meant to be + used ad-hoc and for single devices only, i.e. not in modprobe.conf, + therefore the impact of this feature replacement should be low. +Who: Stefan Richter <stefanr@s5r6.in-berlin.de> + +--------------------------- + What: Video4Linux API 1 ioctls and video_decoder.h from Video devices. When: July 2006 Why: V4L1 AP1 was replaced by V4L2 API. during migration from 2.4 to 2.6 diff --git a/Documentation/filesystems/sysfs.txt b/Documentation/filesystems/sysfs.txt index c8bce82ddca..89b1d196ca8 100644 --- a/Documentation/filesystems/sysfs.txt +++ b/Documentation/filesystems/sysfs.txt @@ -246,6 +246,7 @@ class/ devices/ firmware/ net/ +fs/ devices/ contains a filesystem representation of the device tree. It maps directly to the internal kernel device tree, which is a hierarchy of @@ -264,6 +265,10 @@ drivers/ contains a directory for each device driver that is loaded for devices on that particular bus (this assumes that drivers do not span multiple bus types). +fs/ contains a directory for some filesystems. Currently each +filesystem wanting to export attributes must create its own hierarchy +below fs/ (see ./fuse.txt for an example). + More information can driver-model specific features can be found in Documentation/driver-model/. diff --git a/Documentation/firmware_class/README b/Documentation/firmware_class/README index 43e836c07ae..e9cc8bb26f7 100644 --- a/Documentation/firmware_class/README +++ b/Documentation/firmware_class/README @@ -105,20 +105,3 @@ on the setup, so I think that the choice on what firmware to make persistent should be left to userspace. - - Why register_firmware()+__init can be useful: - - For boot devices needing firmware. - - To make the transition easier: - The firmware can be declared __init and register_firmware() - called on module_init. Then the firmware is warranted to be - there even if "firmware hotplug userspace" is not there yet or - it doesn't yet provide the needed firmware. - Once the firmware is widely available in userspace, it can be - removed from the kernel. Or made optional (CONFIG_.*_FIRMWARE). - - In either case, if firmware hotplug support is there, it can move the - firmware out of kernel memory into the real filesystem for later - usage. - - Note: If persistence is implemented on top of initramfs, - register_firmware() may not be appropriate. - diff --git a/Documentation/firmware_class/firmware_sample_driver.c b/Documentation/firmware_class/firmware_sample_driver.c index ad3edaba453..87feccdb5c9 100644 --- a/Documentation/firmware_class/firmware_sample_driver.c +++ b/Documentation/firmware_class/firmware_sample_driver.c @@ -5,8 +5,6 @@ * * Sample code on how to use request_firmware() from drivers. * - * Note that register_firmware() is currently useless. - * */ #include <linux/module.h> @@ -17,11 +15,6 @@ #include "linux/firmware.h" -#define WE_CAN_NEED_FIRMWARE_BEFORE_USERSPACE_IS_AVAILABLE -#ifdef WE_CAN_NEED_FIRMWARE_BEFORE_USERSPACE_IS_AVAILABLE -char __init inkernel_firmware[] = "let's say that this is firmware\n"; -#endif - static struct device ghost_device = { .bus_id = "ghost0", }; @@ -104,10 +97,6 @@ static void sample_probe_async(void) static int sample_init(void) { -#ifdef WE_CAN_NEED_FIRMWARE_BEFORE_USERSPACE_IS_AVAILABLE - register_firmware("sample_driver_fw", inkernel_firmware, - sizeof(inkernel_firmware)); -#endif device_initialize(&ghost_device); /* since there is no real hardware insertion I just call the * sample probe functions here */ diff --git a/Documentation/i2c/busses/i2c-parport b/Documentation/i2c/busses/i2c-parport index d9f23c0763f..77b995dfca2 100644 --- a/Documentation/i2c/busses/i2c-parport +++ b/Documentation/i2c/busses/i2c-parport @@ -12,18 +12,22 @@ meant as a replacement for the older, individual drivers: teletext adapters) It currently supports the following devices: - * Philips adapter - * home brew teletext adapter - * Velleman K8000 adapter - * ELV adapter - * Analog Devices evaluation boards (ADM1025, ADM1030, ADM1031, ADM1032) - * Barco LPT->DVI (K5800236) adapter + * (type=0) Philips adapter + * (type=1) home brew teletext adapter + * (type=2) Velleman K8000 adapter + * (type=3) ELV adapter + * (type=4) Analog Devices ADM1032 evaluation board + * (type=5) Analog Devices evaluation boards: ADM1025, ADM1030, ADM1031 + * (type=6) Barco LPT->DVI (K5800236) adapter These devices use different pinout configurations, so you have to tell the driver what you have, using the type module parameter. There is no way to autodetect the devices. Support for different pinout configurations can be easily added when needed. +Earlier kernels defaulted to type=0 (Philips). But now, if the type +parameter is missing, the driver will simply fail to initialize. + Building your own adapter ------------------------- diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt index 92f0056d928..c61d8b876fd 100644 --- a/Documentation/memory-barriers.txt +++ b/Documentation/memory-barriers.txt @@ -1031,7 +1031,7 @@ conflict on any particular lock. LOCKS VS MEMORY ACCESSES ------------------------ -Consider the following: the system has a pair of spinlocks (N) and (Q), and +Consider the following: the system has a pair of spinlocks (M) and (Q), and three CPUs; then should the following sequence of events occur: CPU 1 CPU 2 @@ -1678,7 +1678,7 @@ CPU's caches by some other cache event: smp_wmb(); <A:modify v=2> <C:busy> <C:queue v=2> - p = &b; q = p; + p = &v; q = p; <D:request p> <B:modify p=&v> <D:commit p=&v> <D:read p> diff --git a/Documentation/networking/operstates.txt b/Documentation/networking/operstates.txt new file mode 100644 index 00000000000..4a21d9bb836 --- /dev/null +++ b/Documentation/networking/operstates.txt @@ -0,0 +1,161 @@ + +1. Introduction + +Linux distinguishes between administrative and operational state of an +interface. Admininstrative state is the result of "ip link set dev +<dev> up or down" and reflects whether the administrator wants to use +the device for traffic. + +However, an interface is not usable just because the admin enabled it +- ethernet requires to be plugged into the switch and, depending on +a site's networking policy and configuration, an 802.1X authentication +to be performed before user data can be transferred. Operational state +shows the ability of an interface to transmit this user data. + +Thanks to 802.1X, userspace must be granted the possibility to +influence operational state. To accommodate this, operational state is +split into two parts: Two flags that can be set by the driver only, and +a RFC2863 compatible state that is derived from these flags, a policy, +and changeable from userspace under certain rules. + + +2. Querying from userspace + +Both admin and operational state can be queried via the netlink +operation RTM_GETLINK. It is also possible to subscribe to RTMGRP_LINK +to be notified of updates. This is important for setting from userspace. + +These values contain interface state: + +ifinfomsg::if_flags & IFF_UP: + Interface is admin up +ifinfomsg::if_flags & IFF_RUNNING: + Interface is in RFC2863 operational state UP or UNKNOWN. This is for + backward compatibility, routing daemons, dhcp clients can use this + flag to determine whether they should use the interface. +ifinfomsg::if_flags & IFF_LOWER_UP: + Driver has signaled netif_carrier_on() +ifinfomsg::if_flags & IFF_DORMANT: + Driver has signaled netif_dormant_on() + +These interface flags can also be queried without netlink using the +SIOCGIFFLAGS ioctl. + +TLV IFLA_OPERSTATE + +contains RFC2863 state of the interface in numeric representation: + +IF_OPER_UNKNOWN (0): + Interface is in unknown state, neither driver nor userspace has set + operational state. Interface must be considered for user data as + setting operational state has not been implemented in every driver. +IF_OPER_NOTPRESENT (1): + Unused in current kernel (notpresent interfaces normally disappear), + just a numerical placeholder. +IF_OPER_DOWN (2): + Interface is unable to transfer data on L1, f.e. ethernet is not + plugged or interface is ADMIN down. +IF_OPER_LOWERLAYERDOWN (3): + Interfaces stacked on an interface that is IF_OPER_DOWN show this + state (f.e. VLAN). +IF_OPER_TESTING (4): + Unused in current kernel. +IF_OPER_DORMANT (5): + Interface is L1 up, but waiting for an external event, f.e. for a + protocol to establish. (802.1X) +IF_OPER_UP (6): + Interface is operational up and can be used. + +This TLV can also be queried via sysfs. + +TLV IFLA_LINKMODE + +contains link policy. This is needed for userspace interaction +described below. + +This TLV can also be queried via sysfs. + + +3. Kernel driver API + +Kernel drivers have access to two flags that map to IFF_LOWER_UP and +IFF_DORMANT. These flags can be set from everywhere, even from +interrupts. It is guaranteed that only the driver has write access, +however, if different layers of the driver manipulate the same flag, +the driver has to provide the synchronisation needed. + +__LINK_STATE_NOCARRIER, maps to !IFF_LOWER_UP: + +The driver uses netif_carrier_on() to clear and netif_carrier_off() to +set this flag. On netif_carrier_off(), the scheduler stops sending +packets. The name 'carrier' and the inversion are historical, think of +it as lower layer. + +netif_carrier_ok() can be used to query that bit. + +__LINK_STATE_DORMANT, maps to IFF_DORMANT: + +Set by the driver to express that the device cannot yet be used +because some driver controlled protocol establishment has to +complete. Corresponding functions are netif_dormant_on() to set the +flag, netif_dormant_off() to clear it and netif_dormant() to query. + +On device allocation, networking core sets the flags equivalent to +netif_carrier_ok() and !netif_dormant(). + + +Whenever the driver CHANGES one of these flags, a workqueue event is +scheduled to translate the flag combination to IFLA_OPERSTATE as +follows: + +!netif_carrier_ok(): + IF_OPER_LOWERLAYERDOWN if the interface is stacked, IF_OPER_DOWN + otherwise. Kernel can recognise stacked interfaces because their + ifindex != iflink. + +netif_carrier_ok() && netif_dormant(): + IF_OPER_DORMANT + +netif_carrier_ok() && !netif_dormant(): + IF_OPER_UP if userspace interaction is disabled. Otherwise + IF_OPER_DORMANT with the possibility for userspace to initiate the + IF_OPER_UP transition afterwards. + + +4. Setting from userspace + +Applications have to use the netlink interface to influence the +RFC2863 operational state of an interface. Setting IFLA_LINKMODE to 1 +via RTM_SETLINK instructs the kernel that an interface should go to +IF_OPER_DORMANT instead of IF_OPER_UP when the combination +netif_carrier_ok() && !netif_dormant() is set by the +driver. Afterwards, the userspace application can set IFLA_OPERSTATE +to IF_OPER_DORMANT or IF_OPER_UP as long as the driver does not set +netif_carrier_off() or netif_dormant_on(). Changes made by userspace +are multicasted on the netlink group RTMGRP_LINK. + +So basically a 802.1X supplicant interacts with the kernel like this: + +-subscribe to RTMGRP_LINK +-set IFLA_LINKMODE to 1 via RTM_SETLINK +-query RTM_GETLINK once to get initial state +-if initial flags are not (IFF_LOWER_UP && !IFF_DORMANT), wait until + netlink multicast signals this state +-do 802.1X, eventually abort if flags go down again +-send RTM_SETLINK to set operstate to IF_OPER_UP if authentication + succeeds, IF_OPER_DORMANT otherwise +-see how operstate and IFF_RUNNING is echoed via netlink multicast +-set interface back to IF_OPER_DORMANT if 802.1X reauthentication + fails +-restart if kernel changes IFF_LOWER_UP or IFF_DORMANT flag + +if supplicant goes down, bring back IFLA_LINKMODE to 0 and +IFLA_OPERSTATE to a sane value. + +A routing daemon or dhcp client just needs to care for IFF_RUNNING or +waiting for operstate to go IF_OPER_UP/IF_OPER_UNKNOWN before +considering the interface / querying a DHCP address. + + +For technical questions and/or comments please e-mail to Stefan Rompf +(stefan at loplof.de). diff --git a/Documentation/networking/xfrm_sync.txt b/Documentation/networking/xfrm_sync.txt new file mode 100644 index 00000000000..8be626f7c0b --- /dev/null +++ b/Documentation/networking/xfrm_sync.txt @@ -0,0 +1,166 @@ + +The sync patches work is based on initial patches from +Krisztian <hidden@balabit.hu> and others and additional patches +from Jamal <hadi@cyberus.ca>. + +The end goal for syncing is to be able to insert attributes + generate +events so that the an SA can be safely moved from one machine to another +for HA purposes. +The idea is to synchronize the SA so that the takeover machine can do +the processing of the SA as accurate as possible if it has access to it. + +We already have the ability to generate SA add/del/upd events. +These patches add ability to sync and have accurate lifetime byte (to +ensure proper decay of SAs) and replay counters to avoid replay attacks +with as minimal loss at failover time. +This way a backup stays as closely uptodate as an active member. + +Because the above items change for every packet the SA receives, +it is possible for a lot of the events to be generated. +For this reason, we also add a nagle-like algorithm to restrict +the events. i.e we are going to set thresholds to say "let me +know if the replay sequence threshold is reached or 10 secs have passed" +These thresholds are set system-wide via sysctls or can be updated +per SA. + +The identified items that need to be synchronized are: +- the lifetime byte counter +note that: lifetime time limit is not important if you assume the failover +machine is known ahead of time since the decay of the time countdown +is not driven by packet arrival. +- the replay sequence for both inbound and outbound + +1) Message Structure +---------------------- + +nlmsghdr:aevent_id:optional-TLVs. + +The netlink message types are: + +XFRM_MSG_NEWAE and XFRM_MSG_GETAE. + +A XFRM_MSG_GETAE does not have TLVs. +A XFRM_MSG_NEWAE will have at least two TLVs (as is +discussed further below). + +aevent_id structure looks like: + + struct xfrm_aevent_id { + struct xfrm_usersa_id sa_id; + __u32 flags; + }; + +xfrm_usersa_id in this message layout identifies the SA. + +flags are used to indicate different things. The possible +flags are: + XFRM_AE_RTHR=1, /* replay threshold*/ + XFRM_AE_RVAL=2, /* replay value */ + XFRM_AE_LVAL=4, /* lifetime value */ + XFRM_AE_ETHR=8, /* expiry timer threshold */ + XFRM_AE_CR=16, /* Event cause is replay update */ + XFRM_AE_CE=32, /* Event cause is timer expiry */ + XFRM_AE_CU=64, /* Event cause is policy update */ + +How these flags are used is dependent on the direction of the +message (kernel<->user) as well the cause (config, query or event). +This is described below in the different messages. + +The pid will be set appropriately in netlink to recognize direction +(0 to the kernel and pid = processid that created the event +when going from kernel to user space) + +A program needs to subscribe to multicast group XFRMNLGRP_AEVENTS +to get notified of these events. + +2) TLVS reflect the different parameters: +----------------------------------------- + +a) byte value (XFRMA_LTIME_VAL) +This TLV carries the running/current counter for byte lifetime since +last event. + +b)replay value (XFRMA_REPLAY_VAL) +This TLV carries the running/current counter for replay sequence since +last event. + +c)replay threshold (XFRMA_REPLAY_THRESH) +This TLV carries the threshold being used by the kernel to trigger events +when the replay sequence is exceeded. + +d) expiry timer (XFRMA_ETIMER_THRESH) +This is a timer value in milliseconds which is used as the nagle +value to rate limit the events. + +3) Default configurations for the parameters: +---------------------------------------------- + +By default these events should be turned off unless there is +at least one listener registered to listen to the multicast +group XFRMNLGRP_AEVENTS. + +Programs installing SAs will need to specify the two thresholds, however, +in order to not change existing applications such as racoon +we also provide default threshold values for these different parameters +in case they are not specified. + +the two sysctls/proc entries are: +a) /proc/sys/net/core/sysctl_xfrm_aevent_etime +used to provide default values for the XFRMA_ETIMER_THRESH in incremental +units of time of 100ms. The default is 10 (1 second) + +b) /proc/sys/net/core/sysctl_xfrm_aevent_rseqth +used to provide default values for XFRMA_REPLAY_THRESH parameter +in incremental packet count. The default is two packets. + +4) Message types +---------------- + +a) XFRM_MSG_GETAE issued by user-->kernel. +XFRM_MSG_GETAE does not carry any TLVs. +The response is a XFRM_MSG_NEWAE which is formatted based on what +XFRM_MSG_GETAE queried for. +The response will always have XFRMA_LTIME_VAL and XFRMA_REPLAY_VAL TLVs. +*if XFRM_AE_RTHR flag is set, then XFRMA_REPLAY_THRESH is also retrieved +*if XFRM_AE_ETHR flag is set, then XFRMA_ETIMER_THRESH is also retrieved + +b) XFRM_MSG_NEWAE is issued by either user space to configure +or kernel to announce events or respond to a XFRM_MSG_GETAE. + +i) user --> kernel to configure a specific SA. +any of the values or threshold parameters can be updated by passing the +appropriate TLV. +A response is issued back to the sender in user space to indicate success +or failure. +In the case of success, additionally an event with +XFRM_MSG_NEWAE is also issued to any listeners as described in iii). + +ii) kernel->user direction as a response to XFRM_MSG_GETAE +The response will always have XFRMA_LTIME_VAL and XFRMA_REPLAY_VAL TLVs. +The threshold TLVs will be included if explicitly requested in +the XFRM_MSG_GETAE message. + +iii) kernel->user to report as event if someone sets any values or +thresholds for an SA using XFRM_MSG_NEWAE (as described in #i above). +In such a case XFRM_AE_CU flag is set to inform the user that +the change happened as a result of an update. +The message will always have XFRMA_LTIME_VAL and XFRMA_REPLAY_VAL TLVs. + +iv) kernel->user to report event when replay threshold or a timeout +is exceeded. +In such a case either XFRM_AE_CR (replay exceeded) or XFRM_AE_CE (timeout +happened) is set to inform the user what happened. +Note the two flags are mutually exclusive. +The message will always have XFRMA_LTIME_VAL and XFRMA_REPLAY_VAL TLVs. + +Exceptions to threshold settings +-------------------------------- + +If you have an SA that is getting hit by traffic in bursts such that +there is a period where the timer threshold expires with no packets +seen, then an odd behavior is seen as follows: +The first packet arrival after a timer expiry will trigger a timeout +aevent; i.e we dont wait for a timeout period or a packet threshold +to be reached. This is done for simplicity and efficiency reasons. + +-JHS diff --git a/Documentation/pci.txt b/Documentation/pci.txt index 711210b38f5..66bbbf1d1ef 100644 --- a/Documentation/pci.txt +++ b/Documentation/pci.txt @@ -259,7 +259,17 @@ on the bus need to be capable of doing it, so this is something which needs to be handled by platform and generic code, not individual drivers. -8. Obsolete functions +8. Vendor and device identifications |