author Linus Torvalds <torvalds@woody.linux-foundation.org> 2007-04-27 09:26:46 -0700
committer Linus Torvalds <torvalds@woody.linux-foundation.org> 2007-04-27 09:26:46 -0700
commit 15c54033964a943de7b0763efd3bd0ede7326395
tree 840b292612d1b5396d5bab5bde537a9013db3ceb /Documentation
parent ad5da3cf39a5b11a198929be1f2644e17ecd767e
parent 912a41a4ab935ce8c4308428ec13fc7f8b1f18f4
Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: (448 commits)
  [IPV4] nl_fib_lookup: Initialise res.r before fib_res_put(&res)
  [IPV6]: Fix thinko in ipv6_rthdr_rcv() changes.
  [IPV4]: Add multipath cached to feature-removal-schedule.txt
  [WIRELESS] cfg80211: Clarify locking comment.
  [WIRELESS] cfg80211: Fix locking in wiphy_new.
  [WEXT] net_device: Don't include wext bits if not required.
  [WEXT]: Misc code cleanups.
  [WEXT]: Reduce inline abuse.
  [WEXT]: Move EXPORT_SYMBOL statements where they belong.
  [WEXT]: Cleanup early ioctl call path.
  [WEXT]: Remove options.
  [WEXT]: Remove dead debug code.
  [WEXT]: Clean up how wext is called.
  [WEXT]: Move to net/wireless
  [AFS]: Eliminate cmpxchg() usage in vlocation code.
  [RXRPC]: Fix pointers passed to bitops.
  [RXRPC]: Remove bogus atomic_* overrides.
  [AFS]: Fix u64 printing in debug logging.
  [AFS]: Add "directory write" support.
  [AFS]: Implement the CB.InitCallBackState3 operation.
  ...
Diffstat (limited to 'Documentation')
-rw-r--r--  Documentation/feature-removal-schedule.txt   40
-rw-r--r--  Documentation/filesystems/afs.txt           214
-rw-r--r--  Documentation/filesystems/proc.txt            9
-rw-r--r--  Documentation/keys.txt                       12
-rw-r--r--  Documentation/networking/bonding.txt         35
-rw-r--r--  Documentation/networking/dccp.txt            10
-rw-r--r--  Documentation/networking/ip-sysctl.txt       31
-rw-r--r--  Documentation/networking/rxrpc.txt          859
-rw-r--r--  Documentation/networking/wan-router.txt       1
9 files changed, 1093 insertions, 118 deletions
diff --git a/Documentation/feature-removal-schedule.txt b/Documentation/feature-removal-schedule.txt
index 19b4c96b2a4..6da663607f7 100644
--- a/Documentation/feature-removal-schedule.txt
+++ b/Documentation/feature-removal-schedule.txt
@@ -211,15 +211,6 @@ Who: Adrian Bunk <bunk@stusta.de>
---------------------------
-What: IPv4 only connection tracking/NAT/helpers
-When: 2.6.22
-Why: The new layer 3 independent connection tracking replaces the old
- IPv4 only version. After some stabilization of the new code the
- old one will be removed.
-Who: Patrick McHardy <kaber@trash.net>
-
----------------------------
-
What: ACPI hooks (X86_SPEEDSTEP_CENTRINO_ACPI) in speedstep-centrino driver
When: December 2006
Why: Speedstep-centrino driver with ACPI hooks and acpi-cpufreq driver are
@@ -294,18 +285,6 @@ Who: Richard Purdie <rpurdie@rpsys.net>
---------------------------
-What: Wireless extensions over netlink (CONFIG_NET_WIRELESS_RTNETLINK)
-When: with the merge of wireless-dev, 2.6.22 or later
-Why: The option/code is
- * not enabled on most kernels
- * not required by any userspace tools (except an experimental one,
- and even there only for some parts, others use ioctl)
- * pointless since wext is no longer evolving and the ioctl
- interface needs to be kept
-Who: Johannes Berg <johannes@sipsolutions.net>
-
----------------------------
-
What: i8xx_tco watchdog driver
When: in 2.6.22
Why: the i8xx_tco watchdog driver has been replaced by the iTCO_wdt
@@ -313,3 +292,22 @@ Why: the i8xx_tco watchdog driver has been replaced by the iTCO_wdt
Who: Wim Van Sebroeck <wim@iguana.be>
---------------------------
+
+What: Multipath cached routing support in ipv4
+When: in 2.6.23
+Why: Code was merged, then submitter immediately disappeared leaving
+ us with no maintainer and lots of bugs. The code should not have
+ been merged in the first place, and many aspects of its
+ implementation are blocking more critical core networking
+ development. It's marked EXPERIMENTAL and no distribution
+ enables it because it causes obscure crashes due to unfixable bugs
+ (interfaces don't return errors so memory allocation can't be
+ handled, calling contexts of these interfaces make handling
+ errors impossible too because they get called after we've
+ totally committed to creating a route object, for example).
+ This problem has existed for years and no forward progress
+ has ever been made, and nobody steps up to try and salvage
+ this code, so we're going to finally just get rid of it.
+Who: David S. Miller <davem@davemloft.net>
+
+---------------------------
diff --git a/Documentation/filesystems/afs.txt b/Documentation/filesystems/afs.txt
index 2f4237dfb8c..12ad6c7f4e5 100644
--- a/Documentation/filesystems/afs.txt
+++ b/Documentation/filesystems/afs.txt
@@ -1,31 +1,82 @@
+ ====================
kAFS: AFS FILESYSTEM
====================
-ABOUT
-=====
+Contents:
+
+ - Overview.
+ - Usage.
+ - Mountpoints.
+ - Proc filesystem.
+ - The cell database.
+ - Security.
+ - Examples.
+
+
+========
+OVERVIEW
+========
-This filesystem provides a fairly simple AFS filesystem driver. It is under
-development and only provides very basic facilities. It does not yet support
-the following AFS features:
+This filesystem provides a fairly simple secure AFS filesystem driver. It is
+under development and does not yet provide the full feature set. The features
+it does support include:
- (*) Write support.
- (*) Communications security.
- (*) Local caching.
- (*) pioctl() system call.
- (*) Automatic mounting of embedded mountpoints.
+ (*) Security (currently only AFS kaserver and KerberosIV tickets).
+ (*) File reading.
+ (*) Automounting.
+
+It does not yet support the following AFS features:
+
+ (*) Write support.
+
+ (*) Local caching.
+
+ (*) pioctl() system call.
+
+
+===========
+COMPILATION
+===========
+
+The filesystem should be enabled by turning on the kernel configuration
+options:
+
+ CONFIG_AF_RXRPC - The RxRPC protocol transport
+ CONFIG_RXKAD - The RxRPC Kerberos security handler
+ CONFIG_AFS - The AFS filesystem
+
+Additionally, the following can be turned on to aid debugging:
+
+ CONFIG_AF_RXRPC_DEBUG - Permit AF_RXRPC debugging to be enabled
+ CONFIG_AFS_DEBUG - Permit AFS debugging to be enabled
+
+They permit the debugging messages to be turned on dynamically by manipulating
+the masks in the following files:
+
+ /sys/module/af_rxrpc/parameters/debug
+ /sys/module/afs/parameters/debug
+
+
+=====
USAGE
=====
When inserting the driver modules the root cell must be specified along with a
list of volume location server IP addresses:
- insmod rxrpc.o
+ insmod af_rxrpc.o
+ insmod rxkad.o
insmod kafs.o rootcell=cambridge.redhat.com:172.16.18.73:172.16.18.91
-The first module is a driver for the RxRPC remote operation protocol, and the
-second is the actual filesystem driver for the AFS filesystem.
+The first module is the AF_RXRPC network protocol driver. This provides the
+RxRPC remote operation protocol and may also be accessed from userspace. See:
+
+ Documentation/networking/rxrpc.txt
+
+The second module is the kerberos RxRPC security driver, and the third module
+is the actual filesystem driver for the AFS filesystem.
Once the module has been loaded, more cells can be added by the following
procedure:
@@ -33,7 +84,7 @@ procedure:
echo add grand.central.org 18.7.14.88:128.2.191.224 >/proc/fs/afs/cells
Where the parameters to the "add" command are the name of a cell and a list of
-volume location servers within that cell.
+volume location servers within that cell, with the latter separated by colons.
Filesystems can be mounted anywhere by commands similar to the following:
@@ -42,11 +93,6 @@ Filesystems can be mounted anywhere by commands similar to the following:
mount -t afs "#root.afs." /afs
mount -t afs "#root.cell." /afs/cambridge
- NB: When using this on Linux 2.4, the mount command has to be different,
- since the filesystem doesn't have access to the device name argument:
-
- mount -t afs none /afs -ovol="#root.afs."
-
Where the initial character is either a hash or a percent symbol depending on
whether you definitely want a R/W volume (hash) or whether you'd prefer a R/O
volume, but are willing to use a R/W volume instead (percent).
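The device-name convention described above (a leading hash or percent, an optional cell name followed by a colon, then the volume name) can be sketched as a small parser. This is only an illustration: the names `AfsDevice` and `parse_afs_device` are hypothetical, and treating the trailing dot purely as a terminator of the volume name is an assumption, not something the text above states.

```python
from typing import NamedTuple, Optional


class AfsDevice(NamedTuple):
    rw_required: bool       # True for '#', False for '%'
    cell: Optional[str]     # None means "use the cell given at insmod"
    volume: str


def parse_afs_device(name: str) -> AfsDevice:
    """Split a kAFS mount "device name" such as '#cell:root.cell.'."""
    if len(name) < 2 or name[0] not in "#%":
        raise ValueError("device name must start with '#' or '%'")
    body = name[1:]
    # Assumption: the trailing dot only terminates the volume name.
    if body.endswith("."):
        body = body[:-1]
    cell, sep, volume = body.partition(":")
    if not sep:
        # No colon present: the whole body is the volume name.
        cell, volume = "", body
    if not volume:
        raise ValueError("missing volume name")
    return AfsDevice(name[0] == "#", cell or None, volume)
```

For instance, `parse_afs_device("#root.afs.")` yields a R/W-only request for volume `root.afs` with no explicit cell, so the lookup would fall back to the root cell.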
@@ -60,55 +106,66 @@ named volume will be looked up in the cell specified during insmod.
Additional cells can be added through /proc (see later section).
+===========
MOUNTPOINTS
===========
-AFS has a concept of mountpoints. These are specially formatted symbolic links
-(of the same form as the "device name" passed to mount). kAFS presents these
-to the user as directories that have special properties:
+AFS has a concept of mountpoints. In AFS terms, these are specially formatted
+symbolic links (of the same form as the "device name" passed to mount). kAFS
+presents these to the user as directories that have a follow-link capability
+(ie: symbolic link semantics). If anyone attempts to access them, they will
+automatically cause the target volume to be mounted (if possible) on that site.
- (*) They cannot be listed. Running a program like "ls" on them will incur an
- EREMOTE error (Object is remote).
+Automatically mounted filesystems will be automatically unmounted approximately
+twenty minutes after they were last used. Alternatively they can be unmounted
+directly with the umount() system call.
- (*) Other objects can't be looked up inside of them. This also incurs an
- EREMOTE error.
+Manually unmounting an AFS volume will cause any idle submounts upon it to be
+culled first. If all are culled, then the requested volume will also be
+unmounted, otherwise error EBUSY will be returned.
- (*) They can be queried with the readlink() system call, which will return
- the name of the mountpoint to which they point. The "readlink" program
- will also work.
+This can be used by the administrator to attempt to unmount the whole AFS tree
+mounted on /afs in one go by doing:
- (*) They can be mounted on (which symbolic links can't).
+ umount /afs
+===============
PROC FILESYSTEM
===============
-The rxrpc module creates a number of files in various places in the /proc
-filesystem:
-
- (*) Firstly, some information files are made available in a directory called
- "/proc/net/rxrpc/". These list the extant transport endpoint, peer,
- connection and call records.
-
- (*) Secondly, some control files are made available in a directory called
- "/proc/sys/rxrpc/". Currently, all these files can be used for is to
- turn on various levels of tracing.
-
The AFS module creates a "/proc/fs/afs/" directory and populates it:
- (*) A "cells" file that lists cells currently known to the afs module.
+ (*) A "cells" file that lists cells currently known to the afs module and
+ their usage counts:
+
+ [root@andromeda ~]# cat /proc/fs/afs/cells
+ USE NAME
+ 3 cambridge.redhat.com
(*) A directory per cell that contains files that list volume location
servers, volumes, and active servers known within that cell.
+ [root@andromeda ~]# cat /proc/fs/afs/cambridge.redhat.com/servers
+ USE ADDR STATE
+ 4 172.16.18.91 0
+ [root@andromeda ~]# cat /proc/fs/afs/cambridge.redhat.com/vlservers
+ ADDRESS
+ 172.16.18.91
+ [root@andromeda ~]# cat /proc/fs/afs/cambridge.redhat.com/volumes
+ USE STT VLID[0] VLID[1] VLID[2] NAME
+ 1 Val 20000000 20000001 20000002 root.afs
+
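The sample listings above are whitespace-separated tables with a single header row, so a monitoring script could read them with a few lines of Python. The helper name `parse_proc_table` is hypothetical, and the sketch assumes that no column value contains whitespace, which holds for the samples shown but is not guaranteed by the text.

```python
from typing import Dict, List


def parse_proc_table(text: str) -> List[Dict[str, str]]:
    """Turn a header-plus-rows listing into a list of dicts keyed by column."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        return []
    header = lines[0].split()
    # Each row is split on whitespace and paired with the header names.
    return [dict(zip(header, ln.split())) for ln in lines[1:]]


sample = """\
USE STT VLID[0] VLID[1] VLID[2] NAME
1 Val 20000000 20000001 20000002 root.afs
"""
volumes = parse_proc_table(sample)
```

On a live system the text would come from reading a file such as /proc/fs/afs/cambridge.redhat.com/volumes rather than from the inline sample.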
+=================
THE CELL DATABASE
=================
-The filesystem maintains an internal database of all the cells it knows and
-the IP addresses of the volume location servers for those cells. The cell to
-which the computer belongs is added to the database when insmod is performed
-by the "rootcell=" argument.
+The filesystem maintains an internal database of all the cells it knows and the
+IP addresses of the volume location servers for those cells. The cell to which
+the system belongs is added to the database when insmod is performed by the
+"rootcell=" argument or, if compiled in, using a "kafs.rootcell=" argument on
+the kernel command line.
Further cells can be added by commands similar to the following:
@@ -118,20 +175,65 @@ Further cells can be added by commands similar to the following:
No other cell database operations are available at this time.
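As a rough sketch of the "add" command format (cell name, then a colon-separated list of volume location server addresses), the following builds and optionally writes the command. The function names are hypothetical, and restricting the addresses to IPv4 is an assumption drawn from the examples in this document.

```python
import ipaddress
from typing import Iterable


def format_cell_add(cell: str, vl_servers: Iterable[str]) -> str:
    """Build the text written to /proc/fs/afs/cells to add a cell."""
    if not cell or any(ch.isspace() for ch in cell):
        raise ValueError("cell name must be non-empty and contain no spaces")
    # IPv4Address raises a ValueError subclass on malformed input.
    addrs = [str(ipaddress.IPv4Address(a)) for a in vl_servers]
    if not addrs:
        raise ValueError("at least one volume location server is needed")
    return "add {} {}".format(cell, ":".join(addrs))


def add_cell(cell: str, vl_servers: Iterable[str],
             path: str = "/proc/fs/afs/cells") -> None:
    """Write the command to the cells file (needs root and a loaded kafs)."""
    with open(path, "w") as f:
        f.write(format_cell_add(cell, vl_servers) + "\n")
```

Validating the addresses before writing keeps a typo from reaching the kernel interface; `format_cell_add` can be exercised without any AFS module loaded.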
+========
+SECURITY
+========
+
+Secure operations are initiated by acquiring a key using the klog program. A
+very primitive klog program is available at:
+
+ http://people.redhat.com/~dhowells/rxrpc/klog.c
+
+This should be compiled by:
+
+ make klog LDLIBS="-lcrypto -lcrypt -lkrb4 -lkeyutils"
+
+And then run as:
+
+ ./klog
+
+Assuming it's successful, this adds a key of type RxRPC, named for the service
+and cell, eg: "afs@<cellname>". This can be viewed with the keyctl program or
+by cat'ing /proc/keys:
+
+ [root@andromeda ~]# keyctl show
+ Session Keyring
+ -3 --alswrv 0 0 keyring: _ses.3268
+ 2 --alswrv 0 0 \_ keyring: _uid.0
+ 111416553 --als--v 0 0 \_ rxrpc: afs@CAMBRIDGE.REDHAT.COM
+
+Currently the username, realm, password and proposed ticket lifetime are
+compiled in to the program.
+
+It is not required to acquire a key before using AFS facilities, but if one is
+not acquired then all operations will be governed by the anonymous user parts
+of the ACLs.
+
+If a key is acquired, then all AFS operations, including mounts and automounts,
+made by a possessor of that key will be secured with that key.
+
+If a file is opened with a particular key and then the file descriptor is
+passed to a process that doesn't have that key (perhaps over an AF_UNIX
+socket), then the operations on the file will be made with the key that was
+used to
+open the file.
+
+
+========
EXAMPLES
========
-Here's what I use to test this. Some of the names and IP addresses are local
-to my internal DNS. My "root.afs" partition has a mount point within it for
+Here's what I use to test this. Some of the names and IP addresses are local
+to my internal DNS. My "root.afs" partition has a mount point within it for
some public volumes.
-insmod -S /tmp/rxrpc.o
-insmod -S /tmp/kafs.o rootcell=cambridge.redhat.com:172.16.18.73:172.16.18.91
+insmod /tmp/rxrpc.o
+insmod /tmp/rxkad.o
+insmod /tmp/kafs.o rootcell=cambridge.redhat.com:172.16.18.91
mount -t afs \%root.afs. /afs
mount -t afs \%cambridge.redhat.com:root.cell. /afs/cambridge.redhat.com/
-echo add grand.central.org 18.7.14.88:128.2.191.224 > /proc/fs/afs/cells
+echo add grand.central.org 18.7.14.88:128.2.191.224 > /proc/fs/afs/cells
mount -t afs "#grand.central.org:root.cell." /afs/grand.central.org/
mount -t afs "#grand.central.org:root.archive." /afs/grand.central.org/archive
mount -t afs "#grand.central.org:root.contrib." /afs/grand.central.org/contrib
@@ -141,15 +243,7 @@ mount -t afs "#grand.central.org:root.service." /afs/grand.central.org/service
mount -t afs "#grand.central.org:root.software." /afs/grand.central.org/software
mount -t afs "#grand.central.org:root.user." /afs/grand.central.org/user
-umount /afs/grand.central.org/user
-umount /afs/grand.central.org/software
-umount /afs/grand.central.org/service
-umount /afs/grand.central.org/project
-umount /afs/grand.central.org/doc
-umount /afs/grand.central.org/contrib
-umount /afs/grand.central.org/archive
-umount /afs/grand.central.org
-umount /afs/cambridge.redhat.com
umount /afs
rmmod kafs
+rmmod rxkad
rmmod rxrpc
diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
index 5484ab5efd4..7aaf09b86a5 100644
--- a/Documentation/filesystems/proc.txt
+++ b/Documentation/filesystems/proc.txt
@@ -1421,6 +1421,15 @@ fewer messages that will be written. Message_burst controls when messages will
be dropped. The default settings limit warning messages to one every five
seconds.
+warnings
+--------
+
+This controls console messages from the networking stack that can occur because
+of problems on the network like duplicate address or bad checksums. Normally,
+this should be enabled, but if the problem persists the messages can be
+disabled.
+
+
netdev_max_backlog
------------------
diff --git a/Documentation/keys.txt b/Documentation/keys.txt
index 60c665d9cfa..81d9aa09729 100644
--- a/Documentation/keys.txt
+++ b/Documentation/keys.txt
@@ -859,6 +859,18 @@ payload contents" for more information.
void unregister_key_type(struct key_type *type);
+Under some circumstances, it may be desirable to deal with a
+bundle of keys. The facility provides access to the keyring type for managing
+such a bundle:
+
+ struct key_type key_type_keyring;
+
+This can be used with a function such as request_key() to find a specific
+keyring in a process's keyrings. A keyring thus found can then be searched
+with keyring_search(). Note that it is not possible to use request_key() to
+search a specific keyring, so using keyrings in this way is of limited utility.
+
+
===================================
NOTES ON ACCESSING PAYLOAD CONTENTS
===================================
diff --git a/Documentation/networking/bonding.txt b/Documentation/networking/bonding.txt
index de809e58092..1da56663083 100644
--- a/Documentation/networking/bonding.txt
+++ b/Documentation/networking/bonding.txt
@@ -920,40 +920,9 @@ options, you may wish to use the "max_bonds" module parameter,
documented above.
To create multiple bonding devices with differing options, it
-is necessary to load the bonding driver multiple times. Note that
-current versions of the sysconfig network initialization scripts
-handle this automatically; if your distro uses these scripts, no
-special action is needed. See the section Configuring Bonding
-Devices, above, if you're not sure about your network initialization
-scripts.
-
- To load multiple instances of the module, it is necessary to
-specify a different name for each instance (the module loading system
-requires that every loaded module, even multiple instances of the same
-module, have a unique name). This is accomplished by supplying
-multiple sets of bonding options in /etc/modprobe.conf, for example:
-
-alias bond0 bonding
-options bond0 -o bond0 mode=balance-rr miimon=100
-
-alias bond1 bonding
-options bond1 -o bond1 mode=balance-alb miimon=50
-
- will load the bonding module two times. The first instance is
-named "bond0" and creates the bond0 device in balance-rr mode with an
-miimon of 100. The second instance is named "bond1" and creates the
-bond1 device in balance-alb mode with an miimon of 50.
-
- In some circumstances (typically with older distributions),
-the above does not work, and the second bonding instance never sees
-its options. In that case, the second options line can be substituted
-as follows:
-
-install bond1 /sbin/modprobe --ignore-install bonding -o bond1 \
- mode=balance-alb miimon=50
+is necessary to use bonding parameters exported by sysfs, documented
+in the section below.
- This may be repeated any number of times, specifying a new and
-unique name in place of bond1 for each subsequent instance.
3.4 Configuring Bonding Manually via Sysfs
------------------------------------------
diff --git a/Documentation/networking/dccp.txt b/Documentation/networking/dccp.txt
index 387482e46c4..4504cc59e40 100644
--- a/Documentation/networking/dccp.txt
+++ b/Documentation/networking/dccp.txt
@@ -57,6 +57,16 @@ DCCP_SOCKOPT_SEND_CSCOV is for the receiver and has a different meaning: it
coverage value are also acceptable. The higher the number, the more
restrictive this setting (see [RFC 4340, sec. 9.2.1]).
+The following two options apply to CCID 3 exclusively and are getsockopt()-only.
+In either case, a TFRC info struct (defined in <linux/tfrc.h>) is returned.
+DCCP_SOCKOPT_CCID_RX_INFO
+ Returns a `struct tfrc_rx_info' in optval; the buffer for optval and
+ optlen must be set to at least sizeof(struct tfrc_rx_info).
+DCCP_SOCKOPT_CCID_TX_INFO
+ Returns a `struct tfrc_tx_info' in optval; the buffer for optval and
+ optlen must be set to at least sizeof(struct tfrc_tx_info).
+
+
Sysctl variables
================
Several DCCP default parameters can be managed by the following sysctls
diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt
index 702d1d8dd04..af6a63ab902 100644
--- a/Documentation/networking/ip-sysctl.txt
+++ b/Documentation/networking/ip-sysctl.txt
@@ -179,11 +179,31 @@ tcp_fin_timeout - INTEGER
because they eat maximum 1.5K of memory, but they tend
to live longer. Cf. tcp_max_orphans.
-tcp_frto - BOOLEAN
+tcp_frto - INTEGER
Enables F-RTO, an enhanced recovery algorithm for TCP retransmission
timeouts. It is particularly beneficial in wireless environments
where packet loss is typically due to random radio interference
- rather than intermediate router congestion.
+ rather than intermediate router congestion. If set to 1, basic
+ version is enabled. 2 enables SACK enhanced F-RTO, which is
+ EXPERIMENTAL. The basic version can be used also when SACK is
+ enabled for a flow through tcp_sack sysctl.
+
+tcp_frto_response - INTEGER
+ When F-RTO has detected that a TCP retransmission timeout was
+ spurious (i.e., the timeout would have been avoided had TCP set a
+ longer retransmission timeout), TCP has several options for what to
+ do next. Possible values are:
+ 0 Rate halving based; a smooth and conservative response,
+ results in halved cwnd and ssthresh after one RTT
+ 1 Very conservative response; not recommended because even
+ though being valid, it interacts poorly with the rest of
+ Linux TCP, halves cwnd and ssthresh immediately
+ 2 Aggressive response; undoes congestion control measures
+ that are now known to be unnecessary (ignoring the
+ possibility of a lost retransmission that would require
+ TCP to be more cautious), cwnd and ssthresh are restored
+ to the values prior to the timeout
+ Default: 0 (rate halving based)
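The two sysctls above combine as follows; this lookup-table sketch only restates the documented values in code. All names are hypothetical, treating tcp_frto=0 as "disabled" is an assumption (the text defines only 1 and 2), and the timing of response 2 is left as None because the text does not state it.

```python
from typing import NamedTuple, Optional


class FrtoResponse(NamedTuple):
    label: str
    windows: str             # what happens to cwnd and ssthresh
    timing: Optional[str]    # None where the text does not say


TCP_FRTO = {
    0: "F-RTO disabled",     # assumption: 0 turns the feature off
    1: "basic F-RTO",
    2: "SACK-enhanced F-RTO (EXPERIMENTAL)",
}

TCP_FRTO_RESPONSE = {
    0: FrtoResponse("rate halving", "halved", "after one RTT"),
    1: FrtoResponse("very conservative", "halved", "immediately"),
    2: FrtoResponse("aggressive", "restored to pre-timeout values", None),
}


def describe(frto: int, response: int = 0) -> str:
    """Summarise a tcp_frto / tcp_frto_response pair in one sentence."""
    mode = TCP_FRTO[frto]
    if frto == 0:
        return mode
    r = TCP_FRTO_RESPONSE[response]
    when = " " + r.timing if r.timing else ""
    return "{}; on spurious RTO: {}, cwnd and ssthresh {}{}".format(
        mode, r.label, r.windows, when)
```

An out-of-range value raises KeyError, which is a reasonable way for a configuration checker to flag a bad setting.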
tcp_keepalive_time - INTEGER
How often TCP sends out keepalive messages when keepalive is enabled.
@@ -995,7 +1015,12 @@ bridge-nf-call-ip6tables - BOOLEAN
Default: 1
bridge-nf-filter-vlan-tagged - BOOLEAN
- 1 : pass bridged vlan-tagged ARP/IP traffic to arptables/iptables.
+ 1 : pass bridged vlan-tagged ARP/IP/IPv6 traffic to {arp,ip,ip6}tables.
+ 0 : disable this.
+ Default: 1
+
+bridge-nf-filter-pppoe-tagged - BOOLEAN
+ 1 : pass bridged pppoe-tagged IP/IPv6 traffic to {ip,ip6}tables.
0 : disable this.
Default: 1
diff --git a/Documentation/networking/rxrpc.txt b/Documentation/networking/rxrpc.txt
new file mode 100644
index 00000000000..cae231b1c13
--- /dev/null
+++ b/Documentation/networking/rxrpc.txt
@@ -0,0 +1,859 @@
+ ======================
+ RxRPC NETWORK PROTOCOL
+ ======================
+
+The RxRPC protocol driver provides a reliable two-phase transport on top of UDP
+that can be used to perform RxRPC remote operations. This is done over sockets
+of AF_RXRPC family, using sendmsg() and recvmsg() with control data to send and
+receive data, aborts and errors.
+
+Contents of this document:
+
+ (*) Overview.
+
+ (*) RxRPC protocol summary.
+
+ (*) AF_RXRPC driver model.
+
+ (*) Control messages.
+
+ (*) Socket options.
+
+ (*) Security.
+
+ (*) Example client usage.
+
+ (*) Example server usage.
+
+ (*) AF_RXRPC kernel interface.
+
+
+========
+OVERVIEW
+========
+
+RxRPC is a two-layer protocol. There is a session layer which provides
+reliable virtual connections using UDP over IPv4 (or IPv6) as the transport
+layer, but implements a real network protocol; and there's the presentation
+layer which renders structured data to binary blobs and back again using XDR
+(as does SunRPC):
+
+ +-------------+
+ | Application |
+ +-------------+
+ | XDR | Presentation
+ +-------------+
+ | RxRPC | Session
+ +-------------+
+ | UDP | Transport
+ +-------------+
+
+
+AF_RXRPC provides:
+
+ (1) Part of an RxRPC facility for both kernel and userspace applications by
+ making the session part of it a Linux network protocol (AF_RXRPC).
+
+ (2) A two-phase protocol. The client transmits a blob (the request) and then
+ receives a blob (the reply), and the server receives the request and then
+ transmits the reply.
+
+ (3) Retention of the reusable bits of the transport system set up for one call
+ to speed up subsequent calls.
+
+ (4) A secure protocol, using the Linux kernel's key retention facility to
+ manage security on the client end. The server end must of necessity be
+ more active in security negotiations.
+
+AF_RXRPC does not provide XDR marshalling/presentation facilities. That is
+left to the application. AF_RXRPC only deals in blobs. Even the operation ID
+is just the first four bytes of the request blob, and as such is beyond the
+kernel's interest.
+
+
+Sockets of AF_RXRPC family are:
+
+ (1) created as type SOCK_DGRAM;
+
+ (2) provided with a protocol of the type of underlying transport they're going
+ to use - currently only PF_INET is supported.
+
+
+The Andrew File System (AFS) is an example of an application that uses this and
+that has both kernel (filesystem) and userspace (utility) components.
+
+
+======================
+RXRPC PROTOCOL SUMMARY
+======================
+
+An overview of the RxRPC protocol:
+
+ (*) RxRPC sits on top of another networking protocol (UDP is the only option
+ currently), and uses this to provide network transport. UDP ports, for
+ example, provide transport endpoints.
+
+ (*) RxRPC supports multiple virtual "connections" from any given transport
+ endpoint, thus allowing the endpoints to be shared, even to the same
+ remote endpoint.
+
+ (*) Each connection goes to a particular "service". A connection may not go
+ to multiple services. A service may be considered the RxRPC equivalent of
+ a port number. AF_RXRPC permits multiple services to share an endpoint.
+
+ (*) Client-originating packets are marked, thus a transport endpoint can be
+ shared between client and server connections (connections have a
+ direction).
+
+ (*) Up to a billion connections may be supported concurrently between one
+ local transport endpoint and one service on one remote endpoint. An RxRPC
+ connection is described by seven numbers:
+
+ Local address }
+ Local port } Transport (UDP) address
+ Remote address }
+ Remote port }
+ Direction
+ Connection ID
+ Service ID
+
+ (*) Each RxRPC operation is a "call". A connection may make up to four
+ billion calls, but only up to four calls may be in progress on a
+ connection at any one time.
+
+ (*) Calls are two-phase and asymmetric: the client sends its request data,
+ which the service receives; then the service transmits the reply data
+ which the client receives.
+
+ (*) The data blobs are of indefinite size, the end of a phase is marked with a
+ flag in the packet. The number of packets of data making up one blob may
+ not exceed 4 billion, however, as this would cause the sequence number to
+ wrap.
+
+ (*) The first four bytes of the request data are the service operation ID.
+
+ (*) Security is negotiated on a per-connection basis. The connection is
+ initiated by the first data packet on it arriving. If security is
+ requested, the server then issues a "challenge" and then the client
+ replies with a "response". If the response is successful, the security is
+ set for the lifetime of that connection, and all subsequent calls made
+ upon it use that same security. In the event that the server lets a
+ connection lapse before the client, the security will be renegotiated if
+ the client uses the connection again.
+
+ (*) Calls use ACK packets to handle reliability. Data packets are also
+ explicitly sequenced per call.
+
+ (*) There are two types of positive acknowledgement: hard-ACKs and soft-ACKs.
+ A hard-ACK indicates to the far side that all the data received to a point
+ has been received and processed; a soft-ACK indicates that the data has
+ been received but may yet be discarded and re-requested. The sender may
+ not discard any transmittable packets until they've been hard-ACK'd.
+
+ (*) Reception of a reply data packet implicitly hard-ACK's all the data
+ packets that make up the request.
+
+ (*) A call is complete when the request has been sent, the reply has been
+ received and the final hard-ACK on the last packet of the reply has
+ reached the server.
+
+ (*) A call may be aborted by either end at any time up to its completion.
+
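To make the summary above concrete, here is a toy model of the seven numbers that describe a connection and of the limit of four calls in progress at once. It is a sketch of the rules as stated, not of the kernel's data structures: the class names are hypothetical and numbering calls from 1 is an assumption made for illustration.

```python
from dataclasses import dataclass, field
from typing import NamedTuple, Set

MAX_CALLS_PER_CONNECTION = 4   # "only up to four calls may be in progress"


class ConnectionKey(NamedTuple):
    local_addr: str
    local_port: int
    remote_addr: str
    remote_port: int
    is_client: bool        # the "direction" of the connection
    connection_id: int
    service_id: int


@dataclass
class Connection:
    key: ConnectionKey
    active_calls: Set[int] = field(default_factory=set)
    next_call_number: int = 1      # assumption: calls are numbered from 1

    def begin_call(self) -> int:
        """Start a call, refusing a fifth concurrent one."""
        if len(self.active_calls) >= MAX_CALLS_PER_CONNECTION:
            raise RuntimeError("connection already carries four calls")
        call = self.next_call_number
        self.next_call_number += 1
        self.active_calls.add(call)
        return call

    def end_call(self, call: int) -> None:
        """Mark a call as completed or aborted, freeing its slot."""
        self.active_calls.remove(call)
```

Because `ConnectionKey` is a tuple, two connections that differ only in direction or service ID compare as distinct, which mirrors how one transport endpoint can be shared.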
+
+=====================
+AF_RXRPC DRIVER MODEL
+=====================
+
+About the AF_RXRPC driver:
+
+ (*) The AF_RXRPC protocol transparently uses internal sockets of the transport
+ protocol to represent transport endpoints.
+
+ (*) AF_RXRPC sockets map onto RxRPC connection bundles. Actual RxRPC
+ connections are handled transparently. One client socket may be used to
+ make multiple simultaneous calls to the same service. One server socket
+ may handle calls from many clients.
+
+ (*) Additional parallel client connections will be initiated to support extra
+ concurrent calls, up to a tunable limit.
+
+ (*) Each connection is retained for a certain amount of time [tunable] after
+ the last call currently using it has completed in case a new call is made
+ that could reuse it.
+
+ (*) Each internal UDP socket is retained for a certain amount of time
+ [tunable] after the last connection using it is discarded, in case a new
+ connection is made that could use it.
+
+ (*) A client-side connection is only shared between calls if they have
+ the same key struct describing their security (and assuming the calls
+ would otherwise share the connection). Non-secured calls would also be
+ able to share connections with each other.
+
+ (*) A server-side connection is shared if the client says it is.
+
+ (*) ACK'ing is handled by the protocol driver automatically, including ping
+ replying.
+
+ (*) SO_KEEPALIVE automatically pings the other side to keep the connection
+ alive [TODO].
+
+ (*) If an ICMP error is received, all calls affected by that error will be
+ aborted with an appropriate network error passed through recvmsg().
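The bundling behaviour listed above (reuse a connection with a free call slot, otherwise open another parallel connection up to a limit) might be modelled like this. The `Bundle` class and its parameter names are hypothetical, and the policy of filling the first connection with room is an assumption; the text does not say how the driver chooses among connections.

```python
from typing import List


class Bundle:
    """Toy model of the connections serving one (peer, service, key) triple."""

    def __init__(self, max_connections: int, calls_per_connection: int = 4):
        self.max_connections = max_connections          # the tunable limit
        self.calls_per_connection = calls_per_connection
        self.connections: List[int] = []   # calls in progress per connection

    def begin_call(self) -> int:
        """Return the index of the connection that will carry a new call."""
        for i, in_progress in enumerate(self.connections):
            if in_progress < self.calls_per_connection:
                self.connections[i] += 1
                return i
        if len(self.connections) >= self.max_connections:
            raise RuntimeError("bundle is at its connection limit")
        # Every existing connection is full, so start a parallel one.
        self.connections.append(1)
        return len(self.connections) - 1

    def end_call(self, index: int) -> None:
        """Release one call slot on the given connection."""
        self.connections[index] -= 1
```

Keeping one bundle per security key matches the rule that a client-side connection is shared only between calls that use the same key.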
+
+
+Interaction with the user of the RxRPC socket:
+
+ (*) A socket is made into a server socket by binding an address with a
+ non-zero service ID.
+
+ (*) In the client, sending a request is achieved with one or more sendmsgs,
+ followed by the reply being received with one or more recvmsgs.
+
+ (*) The first sendmsg for a request to be sent from a client contains a tag to
+ be used in all other sendmsgs or recvmsgs associated with that call. The
+ tag is carried in the control data.
+
+ (*) connect() is used to supply a default destination address for a client
+ socket. This may be overridden by supplying an alternate address to the
+ first sendmsg() of a call (struct msghdr::msg_name).
+
+ (*) If connect() is called on an unbound client, a random local port will
+ be bound before the operation takes place.
+
+ (*) A server socket may also be used to make client calls. To do this, the
+ first sendmsg() of the call must specify the target address. The server's
+ transport endpoint is used to send the packets.
+
+ (*) Once the application has received the last message associated with a call,
+ the tag is guaranteed not to be seen again, and so it can be used to pin
+ client resources. A new call can then be initiated with the same tag
+ without fear of interference.
+
+ (*) In the server, a request is received with one or more recvmsgs, then the
+ reply is transmitted with one or more sendmsgs, and then the final ACK
+ is received with a last recvmsg.
+
+ (*) When sending data for a call, sendmsg is given MSG_MORE if there's more
+ data to come on that call.
+
+ (*) When receiving data for a call, recvmsg flags MSG_MORE if there's more
+ data to come for that call.
+
+ (*) When receiving data or messages for a call, MSG_EOR is flagged by recvmsg
+ to indicate the terminal message for that call.
+
+ (*) A call may be aborted by adding an abort control message to the control
+ data. Issuing an abort terminates the kernel's use of that call's tag.
+ Any messages waiting in the receive queue for that call will be discarded.
+
+ (*) Aborts, busy notifications and challenge packets are delivered by recvmsg,
+ and control data messages will be set to indicate the context. Receiving
+ an abort or a busy message terminates the kernel's use of that call's tag.
+
+ (*) The control data part of the msghdr struct is used for a number of things:
+
+ (*) The tag of the intended or affected call.
+
+ (*) Sending or receiving errors, aborts and busy notifications.
+
+ (*) Notifications of incoming calls.
+
+ (*) Sending debug requests and receiving debug replies [TODO].
+
+ (*) When the kernel has received and set up an incoming call, it sends a
+ message to the server application to let it know there's a new call awaiting
+ its acceptance [recvmsg reports a special control message]. The server
+ application then uses sendmsg to assign a tag to the new call. Once that
+ is done, the first part of the request data will be delivered by recvmsg.
+
+ (*) The server application has to provide the server socket with a keyring of
+ secret keys corresponding to the security types it permits. When a secure
+ connection is being set up, the kernel looks up the appropriate secret key
+ in the keyring and then sends a challenge packet to the client and
+ receives a response packet. The kernel then checks the authorisation of
+ the packet and either aborts the connection or sets up the security.
+
+ (*) The name of the key a client will use to secure its communications is
+ nominated by a socket option.
+
+
+Notes on recvmsg:
+
+ (1) If there's a sequence of data messages belonging to a particular call on
+ the receive queue, then recvmsg will keep working through them until:
+
+ (a) it meets the end of that call's received data,
+
+ (b) it meets a non-data message,
+
+ (c) it meets a message belonging to a different call, or
+
+ (d) it fills the user buffer.
+
+ If recvmsg is called in blocking mode, it will keep sleeping, awaiting the
+ reception of further data, until one of the above four conditions is met.
+
+ (2) MSG_PEEK operates similarly, but will return immediately if it has put any
+ data in the buffer rather than sleeping until it can fill the buffer.
+
+ (3) If a data message is only partially consumed in filling a user buffer,
+ then the remainder of that message will be left on the front of the queue
+ for the next taker. MSG_TRUNC will never be flagged.
+
+ (4) If there is more data to be had on a call (it hasn't copied the last byte
+ of the last data message in that phase yet), then MSG_MORE will be
+ flagged.
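The stopping rules in the notes above can be illustrated with a small non-blocking simulation of the receive queue. It is a model of the documented behaviour only: `QueuedMessage` and `recv_data` are hypothetical names, the sleeping behaviour of blocking mode is not modelled, and a non-data message is represented simply by `data=None`.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, Optional, Tuple


@dataclass
class QueuedMessage:
    call_id: int
    data: Optional[bytes]    # None marks a non-data message (abort, ACK, ...)
    last: bool = False       # True on the final data message of a phase


def recv_data(queue: Deque[QueuedMessage],
              bufsize: int) -> Tuple[int, bytes, bool]:
    """Return (call_id, data, more) for the call at the head of the queue."""
    if not queue or queue[0].data is None:
        raise ValueError("head of queue is not a data message")
    call_id = queue[0].call_id
    out = bytearray()
    more = True                                   # models MSG_MORE
    while queue and len(out) < bufsize:           # (d) user buffer is full
        msg = queue[0]
        if msg.data is None or msg.call_id != call_id:
            break                                 # (b) non-data, (c) other call
        room = bufsize - len(out)
        out += msg.data[:room]
        if len(msg.data) > room:
            msg.data = msg.data[room:]            # remainder stays at the front
            break
        queue.popleft()
        if msg.last:                              # (a) end of the call's data
            more = False
            break
    return call_id, bytes(out), more
```

A partially consumed message is trimmed in place rather than dropped, which corresponds to the remainder being left on the front of the queue for the next taker.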
+
+
+================
+CONTROL MESSAGES
+================
+
+AF_RXRPC makes use of control messages in sendmsg() and recvmsg() to multiplex
+calls, to invoke certain actions and to report certain conditions. These are:
+
+ MESSAGE ID              SRT DATA        MEANING
+ ======================= === =========== ===============================
+ RXRPC_USER_CALL_ID      sr- User ID     App's call specifier
+ RXRPC_ABORT             srt Abort code  Abort code to issue/received
+ RXRPC_ACK               -rt n/a         Final ACK received
+ RXRPC_NET_ERROR         -rt error num   Network error on call
+ RXRPC_BUSY              -rt n/a         Call rejected (server busy)
+ RXRPC_LOCAL_ERROR       -rt error num   Local error encountered
+ RXRPC_NEW_CALL          -r- n/a         New call received
+ RXRPC_ACCEPT            s-- n/a         Accept new call