aboutsummaryrefslogtreecommitdiff
path: root/Documentation
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation')
-rw-r--r--Documentation/feature-removal-schedule.txt9
-rw-r--r--Documentation/infiniband/ipoib.txt12
-rw-r--r--Documentation/kernel-parameters.txt9
-rw-r--r--Documentation/memory-barriers.txt348
-rw-r--r--Documentation/networking/README.ipw220010
-rw-r--r--Documentation/networking/bonding.txt323
-rw-r--r--Documentation/networking/ip-sysctl.txt7
-rw-r--r--Documentation/networking/netdevices.txt8
8 files changed, 532 insertions, 194 deletions
diff --git a/Documentation/feature-removal-schedule.txt b/Documentation/feature-removal-schedule.txt
index 43ab119963d..f50cf8fac3f 100644
--- a/Documentation/feature-removal-schedule.txt
+++ b/Documentation/feature-removal-schedule.txt
@@ -212,15 +212,6 @@ Who: Greg Kroah-Hartman <gregkh@suse.de>
---------------------------
-What: Support for NEC DDB5074 and DDB5476 evaluation boards.
-When: June 2006
-Why: Board specific code doesn't build anymore since ~2.6.0 and no
- users have complained indicating there is no more need for these
- boards. This should really be considered a last call.
-Who: Ralf Baechle <ralf@linux-mips.org>
-
----------------------------
-
What: USB driver API moves to EXPORT_SYMBOL_GPL
When: Febuary 2008
Files: include/linux/usb.h, drivers/usb/core/driver.c
diff --git a/Documentation/infiniband/ipoib.txt b/Documentation/infiniband/ipoib.txt
index 5c5a4ccce76..187035560d7 100644
--- a/Documentation/infiniband/ipoib.txt
+++ b/Documentation/infiniband/ipoib.txt
@@ -1,10 +1,10 @@
IP OVER INFINIBAND
The ib_ipoib driver is an implementation of the IP over InfiniBand
- protocol as specified by the latest Internet-Drafts issued by the
- IETF ipoib working group. It is a "native" implementation in the
- sense of setting the interface type to ARPHRD_INFINIBAND and the
- hardware address length to 20 (earlier proprietary implementations
+ protocol as specified by RFC 4391 and 4392, issued by the IETF ipoib
+ working group. It is a "native" implementation in the sense of
+ setting the interface type to ARPHRD_INFINIBAND and the hardware
+ address length to 20 (earlier proprietary implementations
masqueraded to the kernel as ethernet interfaces).
Partitions and P_Keys
@@ -53,3 +53,7 @@ References
IETF IP over InfiniBand (ipoib) Working Group
http://ietf.org/html.charters/ipoib-charter.html
+ Transmission of IP over InfiniBand (IPoIB) (RFC 4391)
+ http://ietf.org/rfc/rfc4391.txt
+ IP over InfiniBand (IPoIB) Architecture (RFC 4392)
+ http://ietf.org/rfc/rfc4392.txt
diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index b3a6187e530..a9d3a1794b2 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -1402,6 +1402,15 @@ running once the system is up.
If enabled at boot time, /selinux/disable can be used
later to disable prior to initial policy load.
+ selinux_compat_net =
+ [SELINUX] Set initial selinux_compat_net flag value.
+ Format: { "0" | "1" }
+ 0 -- use new secmark-based packet controls
+ 1 -- use legacy packet controls
+ Default value is 0 (preferred).
+ Value can be changed at runtime via
+ /selinux/compat_net.
+
serialnumber [BUGS=IA-32]
sg_def_reserved_size= [SCSI]
diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
index c61d8b876fd..4710845dbac 100644
--- a/Documentation/memory-barriers.txt
+++ b/Documentation/memory-barriers.txt
@@ -19,6 +19,7 @@ Contents:
- Control dependencies.
- SMP barrier pairing.
- Examples of memory barrier sequences.
+ - Read memory barriers vs load speculation.
(*) Explicit kernel barriers.
@@ -248,7 +249,7 @@ And there are a number of things that _must_ or _must_not_ be assumed:
we may get either of:
STORE *A = X; Y = LOAD *A;
- STORE *A = Y;
+ STORE *A = Y = X;
=========================
@@ -344,9 +345,12 @@ Memory barriers come in four basic varieties:
(4) General memory barriers.
- A general memory barrier is a combination of both a read memory barrier
- and a write memory barrier. It is a partial ordering over both loads and
- stores.
+ A general memory barrier gives a guarantee that all the LOAD and STORE
+ operations specified before the barrier will appear to happen before all
+ the LOAD and STORE operations specified after the barrier with respect to
+ the other components of the system.
+
+ A general memory barrier is a partial ordering over both loads and stores.
General memory barriers imply both read and write memory barriers, and so
can substitute for either.
@@ -546,9 +550,9 @@ write barrier, though, again, a general barrier is viable:
=============== ===============
a = 1;
<write barrier>
- b = 2; x = a;
+ b = 2; x = b;
<read barrier>
- y = b;
+ y = a;
Or:
@@ -563,6 +567,18 @@ Or:
Basically, the read barrier always has to be there, even though it can be of
the "weaker" type.
+[!] Note that the stores before the write barrier would normally be expected to
+match the loads after the read barrier or data dependency barrier, and vice
+versa:
+
+ CPU 1 CPU 2
+ =============== ===============
+ a = 1; }---- --->{ v = c
+ b = 2; } \ / { w = d
+ <write barrier> \ <read barrier>
+ c = 3; } / \ { x = a;
+ d = 4; }---- --->{ y = b;
+
EXAMPLES OF MEMORY BARRIER SEQUENCES
------------------------------------
@@ -600,8 +616,8 @@ STORE B, STORE C } all occuring before the unordered set of { STORE D, STORE E
| | +------+
+-------+ : :
|
- | Sequence in which stores committed to memory system
- | by CPU 1
+ | Sequence in which stores are committed to the
+ | memory system by CPU 1
V
@@ -683,14 +699,12 @@ then the following will occur:
| : : | |
| : : | CPU 2 |
| +-------+ | |
- \ | X->9 |------>| |
- \ +-------+ | |
- ----->| B->2 | | |
- +-------+ | |
- Makes sure all effects ---> ddddddddddddddddd | |
- prior to the store of C +-------+ | |
- are perceptible to | B->2 |------>| |
- successive loads +-------+ | |
+ | | X->9 |------>| |
+ | +-------+ | |
+ Makes sure all effects ---> \ ddddddddddddddddd | |
+ prior to the store of C \ +-------+ | |
+ are perceptible to ----->| B->2 |------>| |
+ subsequent loads +-------+ | |
: : +-------+
@@ -699,73 +713,239 @@ following sequence of events:
CPU 1 CPU 2
======================= =======================
+ { A = 0, B = 9 }
STORE A=1
- STORE B=2
- STORE C=3
<write barrier>
- STORE D=4
- STORE E=5
- LOAD A
+ STORE B=2
LOAD B
- LOAD C
- LOAD D
- LOAD E
+ LOAD A
Without intervention, CPU 2 may then choose to perceive the events on CPU 1 in
some effectively random order, despite the write barrier issued by CPU 1:
- +-------+ : :
- | | +------+
- | |------>| C=3 | }
- | | : +------+ }
- | | : | A=1 | }
- | | : +------+ }
- | CPU 1 | : | B=2 | }---
- | | +------+ } \
- | | wwwwwwwwwwwww} \
- | | +------+ } \ : : +-------+
- | | : | E=5 | } \ +-------+ | |
- | | : +------+ } \ { | C->3 |------>| |
- | |------>| D=4 | } \ { +-------+ : | |
- | | +------+ \ { | E->5 | : | |
- +-------+ : : \ { +-------+ : | |
- Transfer -->{ | A->1 | : | CPU 2 |
- from CPU 1 { +-------+ : | |
- to CPU 2 { | D->4 | : | |
- { +-------+ : | |
- { | B->2 |------>| |
- +-------+ | |
- : : +-------+
-
-
-If, however, a read barrier were to be placed between the load of C and the
-load of D on CPU 2, then the partial ordering imposed by CPU 1 will be
-perceived correctly by CPU 2.
+ +-------+ : : : :
+ | | +------+ +-------+
+ | |------>| A=1 |------ --->| A->0 |
+ | | +------+ \ +-------+
+ | CPU 1 | wwwwwwwwwwwwwwww \ --->| B->9 |
+ | | +------+ | +-------+
+ | |------>| B=2 |--- | : :
+ | | +------+ \ | : : +-------+
+ +-------+ : : \ | +-------+ | |
+ ---------->| B->2 |------>| |
+ | +-------+ | CPU 2 |
+ | | A->0 |------>| |
+ | +-------+ | |
+ | : : +-------+
+ \ : :
+ \ +-------+
+ ---->| A->1 |
+ +-------+
+ : :
- +-------+ : :
- | | +------+
- | |------>| C=3 | }
- | | : +------+ }
- | | : | A=1 | }---
- | | : +------+ } \
- | CPU 1 | : | B=2 | } \
- | | +------+ \
- | | wwwwwwwwwwwwwwww \
- | | +------+ \ : : +-------+
- | | : | E=5 | } \ +-------+ | |
- | | : +------+ }--- \ { | C->3 |------>| |
- | |------>| D=4 | } \ \ { +-------+ : | |
- | | +------+ \ -->{ | B->2 | : | |
- +-------+ : : \ { +-------+ : | |
- \ { | A->1 | : | CPU 2 |
- \ +-------+ | |
- At this point the read ----> \ rrrrrrrrrrrrrrrrr | |
- barrier causes all effects \ +-------+ | |
- prior to the storage of C \ { | E->5 | : | |
- to be perceptible to CPU 2 -->{ +-------+ : | |
- { | D->4 |------>| |
- +-------+ | |
- : : +-------+
+
+If, however, a read barrier were to be placed between the load of E and the
+load of A on CPU 2:
+
+ CPU 1 CPU 2
+ ======================= =======================
+ { A = 0, B = 9 }
+ STORE A=1
+ <write barrier>
+ STORE B=2
+ LOAD B
+ <read barrier>
+ LOAD A
+
+then the partial ordering imposed by CPU 1 will be perceived correctly by CPU
+2:
+
+ +-------+ : : : :
+ | | +------+ +-------+
+ | |------>| A=1 |------ --->| A->0 |
+ | | +------+ \ +-------+
+ | CPU 1 | wwwwwwwwwwwwwwww \ --->| B->9 |
+ | | +------+ | +-------+
+ | |------>| B=2 |--- | : :
+ | | +------+ \ | : : +-------+
+ +-------+ : : \ | +-------+ | |
+ ---------->| B->2 |------>| |
+ | +-------+ | CPU 2 |
+ | : : | |
+ | : : | |
+ At this point the read ----> \ rrrrrrrrrrrrrrrrr | |
+ barrier causes all effects \ +-------+ | |
+ prior to the storage of B ---->| A->1 |------>| |
+ to be perceptible to CPU 2 +-------+ | |
+ : : +-------+
+
+
+To illustrate this more completely, consider what could happen if the code
+contained a load of A either side of the read barrier:
+
+ CPU 1 CPU 2
+ ======================= =======================
+ { A = 0, B = 9 }
+ STORE A=1
+ <write barrier>
+ STORE B=2
+ LOAD B
+ LOAD A [first load of A]
+ <read barrier>
+ LOAD A [second load of A]
+
+Even though the two loads of A both occur after the load of B, they may both
+come up with different values:
+
+ +-------+ : : : :
+ | | +------+ +-------+
+ | |------>| A=1 |------ --->| A->0 |
+ | | +------+ \ +-------+
+ | CPU 1 | wwwwwwwwwwwwwwww \ --->| B->9 |
+ | | +------+ | +-------+
+ | |------>| B=2 |--- | : :
+ | | +------+ \ | : : +-------+
+ +-------+ : : \ | +-------+ | |
+ ---------->| B->2 |------>| |
+ | +-------+ | CPU 2 |
+ | : : | |
+ | : : | |
+ | +-------+ | |
+ | | A->0 |------>| 1st |
+ | +-------+ | |
+ At this point the read ----> \ rrrrrrrrrrrrrrrrr | |
+ barrier causes all effects \ +-------+ | |
+ prior to the storage of B ---->| A->1 |------>| 2nd |
+ to be perceptible to CPU 2 +-------+ | |
+ : : +-------+
+
+
+But it may be that the update to A from CPU 1 becomes perceptible to CPU 2
+before the read barrier completes anyway:
+
+ +-------+ : : : :
+ | | +------+ +-------+
+ | |------>| A=1 |------ --->| A->0 |
+ | | +------+ \ +-------+
+ | CPU 1 | wwwwwwwwwwwwwwww \ --->| B->9 |
+ | | +------+ | +-------+
+ | |------>| B=2 |--- | : :
+ | | +------+ \ | : : +-------+
+ +-------+ : : \ | +-------+ | |
+ ---------->| B->2 |------>| |
+ | +-------+ | CPU 2 |
+ | : : | |
+ \ : : | |
+ \ +-------+ | |
+ ---->| A->1 |------>| 1st |
+ +-------+ | |
+ rrrrrrrrrrrrrrrrr | |
+ +-------+ | |
+ | A->1 |------>| 2nd |
+ +-------+ | |
+ : : +-------+
+
+
+The guarantee is that the second load will always come up with A == 1 if the
+load of B came up with B == 2. No such guarantee exists for the first load of
+A; that may come up with either A == 0 or A == 1.
+
+
+READ MEMORY BARRIERS VS LOAD SPECULATION
+----------------------------------------
+
+Many CPUs speculate with loads: that is they see that they will need to load an
+item from memory, and they find a time where they're not using the bus for any
+other loads, and so do the load in advance - even though they haven't actually
+got to that point in the instruction execution flow yet. This permits the
+actual load instruction to potentially complete immediately because the CPU
+already has the value to hand.
+
+It may turn out that the CPU didn't actually need the value - perhaps because a
+branch circumvented the load - in which case it can discard the value or just
+cache it for later use.
+
+Consider:
+
+ CPU 1 CPU 2
+ ======================= =======================
+ LOAD B
+ DIVIDE } Divide instructions generally
+ DIVIDE } take a long time to perform
+ LOAD A
+
+Which might appear as this:
+
+ : : +-------+
+ +-------+ | |
+ --->| B->2 |------>| |
+ +-------+ | CPU 2 |
+ : :DIVIDE | |
+ +-------+ | |
+ The CPU being busy doing a ---> --->| A->0 |~~~~ | |
+ division speculates on the +-------+ ~ | |
+ LOAD of A : : ~ | |
+ : :DIVIDE | |
+ : : ~ | |
+ Once the divisions are complete --> : : ~-->| |
+ the CPU can then perform the : : | |
+ LOAD with immediate effect : : +-------+
+
+
+Placing a read barrier or a data dependency barrier just before the second
+load:
+
+ CPU 1 CPU 2
+ ======================= =======================
+ LOAD B
+ DIVIDE
+ DIVIDE
+ <read barrier>
+ LOAD A
+
+will force any value speculatively obtained to be reconsidered to an extent
+dependent on the type of barrier used. If there was no change made to the
+speculated memory location, then the speculated value will just be used:
+
+ : : +-------+
+ +-------+ | |
+ --->| B->2 |------>| |
+ +-------+ | CPU 2 |
+ : :DIVIDE | |
+ +-------+ | |
+ The CPU being busy doing a ---> --->| A->0 |~~~~ | |
+ division speculates on the +-------+ ~ | |
+ LOAD of A : : ~ | |
+ : :DIVIDE | |
+ : : ~ | |
+ : : ~ | |
+ rrrrrrrrrrrrrrrr~ | |
+ : : ~ | |
+ : : ~-->| |
+ : : | |
+ : : +-------+
+
+
+but if there was an update or an invalidation from another CPU pending, then
+the speculation will be cancelled and the value reloaded:
+
+ : : +-------+
+ +-------+ | |
+ --->| B->2 |------>| |
+ +-------+ | CPU 2 |
+ : :DIVIDE | |
+ +-------+ | |
+ The CPU being busy doing a ---> --->| A->0 |~~~~ | |
+ division speculates on the +-------+ ~ | |
+ LOAD of A : : ~ | |
+ : :DIVIDE | |
+ : : ~ | |
+ : : ~ | |
+ rrrrrrrrrrrrrrrrr | |
+ +-------+ | |
+ The speculation is discarded ---> --->| A->1 |------>| |
+ and an updated value is +-------+ | |
+ retrieved : : +-------+
========================
@@ -901,7 +1081,7 @@ IMPLICIT KERNEL MEMORY BARRIERS
===============================
Some of the other functions in the linux kernel imply memory barriers, amongst
-which are locking, scheduling and memory allocation functions.
+which are locking and scheduling functions.
This specification is a _minimum_ guarantee; any particular architecture may
provide more substantial guarantees, but these may not be relied upon outside
@@ -966,6 +1146,20 @@ equivalent to a full barrier, but a LOCK followed by an UNLOCK is not.
barriers is that the effects instructions outside of a critical section may
seep into the inside of the critical section.
+A LOCK followed by an UNLOCK may not be assumed to be full memory barrier
+because it is possible for an access preceding the LOCK to happen after the
+LOCK, and an access following the UNLOCK to happen before the UNLOCK, and the
+two accesses can themselves then cross:
+
+ *A = a;
+ LOCK
+ UNLOCK
+ *B = b;
+
+may occur as:
+
+ LOCK, STORE *B, STORE *A, UNLOCK
+
Locks and semaphores may not provide any guarantee of ordering on UP compiled
systems, and so cannot be counted on in such a situation to actually achieve
anything at all - especially with respect to I/O accesses - unless combined
@@ -1016,8 +1210,6 @@ Other functions that imply barriers:
(*) schedule() and similar imply full memory barriers.
- (*) Memory allocation and release functions imply full memory barriers.
-
=================================
INTER-CPU LOCKING BARRIER EFFECTS
diff --git a/Documentation/networking/README.ipw2200 b/Documentation/networking/README.ipw2200
index acb30c5dcff..4f2a40f1dbc 100644
--- a/Documentation/networking/README.ipw2200
+++ b/Documentation/networking/README.ipw2200
@@ -14,8 +14,8 @@ Copyright (C) 2004-2006, Intel Corporation
README.ipw2200
-Version: 1.0.8
-Date : October 20, 2005
+Version: 1.1.2
+Date : March 30, 2006
Index
@@ -103,7 +103,7 @@ file.
1.1. Overview of Features
-----------------------------------------------
-The current release (1.0.8) supports the following features:
+The current release (1.1.2) supports the following features:
+ BSS mode (Infrastructure, Managed)
+ IBSS mode (Ad-Hoc)
@@ -247,8 +247,8 @@ and can set the contents via echo. For example:
% cat /sys/bus/pci/drivers/ipw2200/debug_level
Will report the current debug level of the driver's logging subsystem
-(only available if CONFIG_IPW_DEBUG was configured when the driver was
-built).
+(only available if CONFIG_IPW2200_DEBUG was configured when the driver
+was built).
You can set the debug level via:
diff --git a/Documentation/networking/bonding.txt b/Documentation/networking/bonding.txt
index 8d8b4e5ea18..afac780445c 100644
--- a/Documentation/networking/bonding.txt
+++ b/Documentation/networking/bonding.txt
@@ -1,7 +1,7 @@
Linux Ethernet Bonding Driver HOWTO
- Latest update: 21 June 2005
+ Latest update: 24 April 2006
Initial release : Thomas Davis <tadavis at lbl.gov>
Corrections, HA extensions : 2000/10/03-15 :
@@ -12,6 +12,8 @@ Corrections, HA extensions : 2000/10/03-15 :
- Jay Vosburgh <fubar at us dot ibm dot com>
Reorganized and updated Feb 2005 by Jay Vosburgh
+Added Sysfs information: 2006/04/24
+ - Mitch Williams <mitch.a.williams at intel.com>
Introduction
============
@@ -38,61 +40,62 @@ Table of Contents
2. Bonding Driver Options
3. Configuring Bonding Devices
-3.1 Configuration with sysconfig support
-3.1.1 Using DHCP with sysconfig
-3.1.2 Configuring Multiple Bonds with sysconfig
-3.2 Configuration with initscripts support
-3.2.1 Using DHCP with initscripts
-3.2.2 Configuring Multiple Bonds with initscripts
-3.3 Configuring Bonding Manually
+3.1 Configuration with Sysconfig Support
+3.1.1 Using DHCP with Sysconfig
+3.1.2 Configuring Multiple Bonds with Sysconfig
+3.2 Configuration with Initscripts Support
+3.2.1 Using DHCP with Initscripts
+3.2.2 Configuring Multiple Bonds with Initscripts
+3.3 Configuring Bonding Manually with Ifenslave
3.3.1 Configuring Multiple Bonds Manually
+3.4 Configuring Bonding Manually via Sysfs
-5. Querying Bonding Configuration
-5.1 Bonding Configuration
-5.2 Network Configuration
+4. Querying Bonding Configuration
+4.1 Bonding Configuration
+4.2 Network Configuration
-6. Switch Configuration
+5. Switch Configuration
-7. 802.1q VLAN Support
+6. 802.1q VLAN Support
-8. Link Monitoring
-8.1 ARP Monitor Operation
-8.2 Configuring Multiple ARP Targets
-8.3 MII Monitor Operation
+7. Link Monitoring
+7.1 ARP Monitor Operation
+7.2 Configuring Multiple ARP Targets
+7.3 MII Monitor Operation
-9. Potential Trouble Sources
-9.1 Adventures in Routing
-9.2 Ethernet Device Renaming
-9.3 Painfully Slow Or No Failed Link Detection By Miimon
+8. Potential Trouble Sources
+8.1 Adventures in Routing
+8.2 Ethernet Device Renaming
+8.3 Painfully Slow Or No Failed Link Detection By Miimon
-10. SNMP agents
+9. SNMP agents
-11. Promiscuous mode
+10. Promiscuous mode
-12. Configuring Bonding for High Availability
-12.1 High Availability in a Single Switch Topology
-12.2 High Availability in a Multiple Switch Topology
-12.2.1 HA Bonding Mode Selection for Multiple Switch Topology
-12.2.2 HA Link Monitoring for Multiple Switch Topology
+11. Configuring Bonding for High Availability
+11.1 High Availability in a Single Switch Topology
+11.2 High Availability in a Multiple Switch Topology
+11.2.1 HA Bonding Mode Selection for Multiple Switch Topology
+11.2.2 HA Link Monitoring for Multiple Switch Topology
-13. Configuring Bonding for Maximum Throughput
-13.1 Maximum Throughput in a Single Switch Topology
-13.1.1 MT Bonding Mode Selection for Single Switch Topology
-13.1.2 MT Link Monitoring for Single Switch Topology
-13.2 Maximum Throughput in a Multiple Switch Topology
-13.2.1 MT Bonding Mode Selection for Multiple Switch Topology
-13.2.2 MT Link Monitoring for Multiple Switch Topology
+12. Configuring Bonding for Maximum Throughput
+12.1 Maximum Throughput in a Single Switch Topology
+12.1.1 MT Bonding Mode Selection for Single Switch Topology
+12.1.2 MT Link Monitoring for Single Switch Topology
+12.2 Maximum Throughput in a Multiple Switch Topology
+12.2.1 MT Bonding Mode Selection for Multiple Switch Topology
+12.2.2 MT Link Monitoring for Multiple Switch Topology
-14. Switch Behavior Issues
-14.1 Link Establishment and Failover Delays
-14.2 Duplicated Incoming Packets
+13. Switch Behavior Issues
+13.1 Link Establishment and Failover Delays
+13.2 Duplicated Incoming Packets
-15. Hardware Specific Considerations
-15.1 IBM BladeCenter
+14. Hardware Specific Considerations
+14.1 IBM BladeCenter
-16. Frequently Asked Questions
+15. Frequently Asked Questions
-17. Resources and Links
+16. Resources and Links
1. Bonding Driver Installation
@@ -156,6 +159,9 @@ you're trying to build it for. Some distros (e.g., Red Hat from 7.1
onwards) do not have /usr/include/linux symbolically linked to the
default kernel source include directory.
+SECOND IMPORTANT NOTE:
+ If you plan to configure bonding using sysfs, you do not need
+to use ifenslave.
2. Bonding Driver Options
=========================
@@ -270,7 +276,7 @@ mode
In bonding version 2.6.2 or later, when a failover
occurs in active-backup mode, bonding will issue one
or more gratuitous ARPs on the newly active slave.
- One gratutious ARP is issued for the bonding master
+ One gratuitous ARP is issued for the bonding master
interface and each VLAN interfaces configured above
it, provided that the interface has at least one IP
address configured. Gratuitous ARPs issued for VLAN
@@ -377,7 +383,7 @@ mode
When a link is reconnected or a new slave joins the
bond the receive traffic is redistributed among all
active slaves in the bond by initiating ARP Replies
- with the selected mac address to each of the
+ with the selected MAC address to each of the
clients. The updelay parameter (detailed below) must
be set to a value equal or greater than the switch's
forwarding delay so that the ARP Replies sent to the
@@ -498,11 +504,12 @@ not exist, and the layer2 policy is the only policy.
3. Configuring Bonding Devices
==============================
- There are, essentially, two methods for configuring bonding:
-with support from the distro's network initialization scripts, and
-without. Distros generally use one of two packages for the network
-initialization scripts: initscripts or sysconfig. Recent versions of
-these packages have support for bonding, while older versions do not.
+ You can configure bonding using either your distro's network
+initialization scripts, or manually using either ifenslave or the
+sysfs interface. Distros generally use one of two packages for the
+network initialization scripts: initscripts or sysconfig. Recent
+versions of these packages have support for bonding, while older
+versions do not.
We will first describe the options for configuring bonding for
distros using versions of initscripts and sysconfig with full or
@@ -530,7 +537,7 @@ $ grep ifenslave /sbin/ifup
If this returns any matches, then your initscripts or
sysconfig has support for bonding.
-3.1 Configuration with sysconfig support
+3.1 Configuration with Sysconfig Support
----------------------------------------
This section applies to distros using a version of sysconfig
@@ -538,7 +545,7 @@ with bonding support, for example, SuSE Linux Enterprise Server 9.
SuSE SLES 9's networking configuration system does support
bonding, however, at this writing, the YaST system configuration
-frontend does not provide any means to work with bonding devices.
+front end does not provide any means to work with bonding devices.
Bonding devices can be managed by hand, however, as follows.
First, if they have not already been configured, configure the
@@ -660,7 +667,7 @@ format can be found in an example ifcfg template file:
Note that the template does not document the various BONDING_
settings described above, but does describe many of the other options.
-3.1.1 Using DHCP with sysconfig
+3.1.1 Using DHCP with Sysconfig
-------------------------------
Under sysconfig, configuring a device with BOOTPROTO='dhcp'
@@ -670,7 +677,7 @@ attempt to obtain the device address from DHCP prior to adding any of
the slave devices. Without active slaves, the DHCP requests are not
sent to the network.
-3.1.2 Configuring Multiple Bonds with sysconfig
+3.1.2 Configuring Multiple Bonds with Sysconfig
-----------------------------------------------
The sysconfig network initialization system is capable of
@@ -685,7 +692,7 @@ ifcfg-bondX files.
options in the ifcfg-bondX file, it is not necessary to add them to
the system /etc/modules.conf or /etc/modprobe.conf configuration file.
-3.2 Configuration with initscripts support
+3.2 Configuration with Initscripts Support
------------------------------------------
This section applies to distros using a version of initscripts
@@ -756,7 +763,7 @@ options for your configuration.
will restart the networking subsystem and your bond link should be now
up and running.
-3.2.1 Using DHCP with initscripts
+3.2.1 Using DHCP with Initscripts
---------------------------------
Recent versions of initscripts (the version supplied with
@@ -768,7 +775,7 @@ above, except replace the line "BOOTPROTO=none" with "BOOTPROTO=dhcp"
and add a line consisting of "TYPE=Bonding". Note that the TYPE value
is case sensitive.
-3.2.2 Configuring Multiple Bonds with initscripts
+3.2.2 Configuring Multiple Bonds with Initscripts
-------------------------------------------------
At this writing, the initscripts package does not directly
@@ -784,8 +791,8 @@ Fedora Core kernels, and has been seen on RHEL 4 as well. On kernels
exhibiting this problem, it will be impossible to configure multiple
bonds with differing parameters.
-3.3 Configuring Bonding Manually
---------------------------------
+3.3 Configuring Bonding Manually with Ifenslave
+-----------------------------------------------
This section applies to distros whose network initialization
scripts (the sysconfig or initscripts package) do not have specific
@@ -889,11 +896,139 @@ install bond1 /sbin/modprobe --ignore-install bonding -o bond1 \
This may be repeated any number of times, specifying a new and
unique name in place of bond1 for each subsequent instance.
+3.4 Configuring Bonding Manually via Sysfs
+------------------------------------------
+
+ Starting with version 3.0, Channel Bonding may be configured
+via the sysfs interface. This interface allows dynamic configuration
+of all bonds in the system without unloading the module. It also
+allows for adding and removing bonds at runtime. Ifenslave is no
+longer required, though it is still supported.
+
+ Use of the sysfs interface allows you to use multiple bonds
+with different configurations without having to reload the module.
+It also allows you to use multiple, differently configured bonds when
+bonding is compiled into the kernel.
+
+ You must have the sysfs filesystem mounted to configure
+bonding this way. The examples in this document assume that you
+are using the standard mount point for sysfs, e.g. /sys. If your
+sysfs filesystem is mounted elsewhere, you will need to adjust the
+example paths accordingly.
+
+Creating and Destroying Bonds
+-----------------------------
+To add a new bond foo:
+# echo +foo > /sys/class/net/bonding_masters
+
+To remove an existing bond bar:
+# echo -bar > /sys/class/net/bonding_masters
+
+To show all existing bonds:
+# cat /sys/class/net/bonding_masters
+
+NOTE: due to 4K size limitation of sysfs files, this list may be
+truncated if you have more than a few hundred bonds. This is unlikely
+to occur under normal operating conditions.
+
+Adding and Removing Slaves
+--------------------------
+ Interfaces may be enslaved to a bond using the file
+/sys/class/net/<bond>/bonding/slaves. The semantics for this file
+are the same as for the bonding_masters file.
+
+To enslave interface eth0 to bond bond0:
+# ifconfig bond0 up
+# echo +eth0 > /sys/class/net/bond0/bonding/slaves
+
+To free slave eth0 from bond bond0:
+# echo -eth0 > /sys/class/net/bond0/bonding/slaves
+
+ NOTE: The bond must be up before slaves can be added. All
+slaves are freed when the interface is brought down.
+
+ When an interface is enslaved to a bond, symlinks between the
+two are created in the sysfs filesystem. In this case, you would get
+/sys/class/net/bond0/slave_eth0 pointing to /sys/class/net/eth0, and
+/sys/class/net/eth0/master pointing to /sys/class/net/bond0.
+
+ This means that you can tell quickly whether or not an
+interface is enslaved by looking for the master symlink. Thus:
+# echo -eth0 > /sys/class/net/eth0/master/bonding/slaves
+will free eth0 from whatever bond it is enslaved to, regardless of
+the name of the bond interface.
+
+Changing a Bond's Configuration
+-------------------------------
+ Each bond may be configured individually by manipulating the
+files located in /sys/class/net/<bond name>/bonding
+
+ The names of these files correspond directly with the command-
+line parameters described elsewhere in in this file, and, with the
+exception of arp_ip_target, they accept the same values. To see the
+current setting, simply cat the appropriate file.
+
+ A few examples will be given here; for specific usage
+guidelines for each parameter, see the appropriate section in this
+document.
+
+To configure bond0 for balance-alb mode:
+# ifconfig bond0 down
+# echo 6 > /sys/class/net/bond0/bonding/mode
+ - or -
+# echo balance-alb > /sys/class/net/bond0/bonding/mode