📄 bonding.txt
# example options for ARP monitoring with one target
alias bond0 bonding
options bond0 arp_interval=60 arp_ip_target=192.168.0.100


7.3 MII Monitor Operation
-------------------------

        The MII monitor monitors only the carrier state of the local
network interface. It accomplishes this in one of three ways: by
depending upon the device driver to maintain its carrier state, by
querying the device's MII registers, or by making an ethtool query to
the device.

        If the use_carrier module parameter is 1 (the default value),
then the MII monitor will rely on the driver for carrier state
information (via the netif_carrier subsystem). As explained in the
use_carrier parameter information, above, if the MII monitor fails to
detect carrier loss on the device (e.g., when the cable is physically
disconnected), it may be that the driver does not support
netif_carrier.

        If use_carrier is 0, then the MII monitor will first query the
device's MII registers (via ioctl) and check the link state. If that
request fails (not just that it returns carrier down), then the MII
monitor will make an ethtool ETHTOOL_GLINK request to attempt to obtain
the same information. If both methods fail (i.e., the driver either
does not support or had some error in processing both the MII register
and ethtool requests), then the MII monitor will assume the link is
up.

8. Potential Sources of Trouble
===============================

8.1 Adventures in Routing
-------------------------

        When bonding is configured, it is important that the slave
devices not have routes that supersede routes of the master (or,
generally, not have routes at all).
For example, suppose the bonding device bond0 has two slaves, eth0 and
eth1, and the routing table is as follows:

Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
10.0.0.0        0.0.0.0         255.255.0.0     U        40 0          0 eth0
10.0.0.0        0.0.0.0         255.255.0.0     U        40 0          0 eth1
10.0.0.0        0.0.0.0         255.255.0.0     U        40 0          0 bond0
127.0.0.0       0.0.0.0         255.0.0.0       U        40 0          0 lo

        This routing configuration will likely still update the
receive/transmit times in the driver (needed by the ARP monitor), but
may bypass the bonding driver (because outgoing traffic to, in this
case, another host on network 10 would use eth0 or eth1 before bond0).

        The ARP monitor (and ARP itself) may become confused by this
configuration, because ARP requests (generated by the ARP monitor)
will be sent on one interface (bond0), but the corresponding reply
will arrive on a different interface (eth0). This reply looks to ARP
like an unsolicited ARP reply (because ARP matches replies on a
per-interface basis), and is discarded. The MII monitor is not affected
by the state of the routing table.

        The solution here is simply to ensure that slaves do not have
routes of their own, and if for some reason they must, that those
routes do not supersede routes of their master. This should generally
be the case, but unusual configurations or errant manual or automatic
static route additions may cause trouble.

8.2 Ethernet Device Renaming
----------------------------

        On systems with network configuration scripts that do not
associate physical devices directly with network interface names (so
that the same physical device always has the same "ethX" name), it may
be necessary to add some special logic to either /etc/modules.conf or
/etc/modprobe.conf (depending upon which is installed on the system).
For example, given a modules.conf containing the following:

alias bond0 bonding
options bond0 mode=some-mode miimon=50
alias eth0 tg3
alias eth1 tg3
alias eth2 e1000
alias eth3 e1000

        If neither eth0 nor eth1 is a slave to bond0, then when the
bond0 interface comes up, the devices may end up reordered. This
happens because bonding is loaded first, then its slave devices'
drivers are loaded next. Since no other drivers have been loaded,
when the e1000 driver loads, it will receive eth0 and eth1 for its
devices, but the bonding configuration tries to enslave eth2 and eth3
(which may later be assigned to the tg3 devices).

        Adding the following:

add above bonding e1000 tg3

        causes modprobe to load e1000 then tg3, in that order, when
bonding is loaded. This command is fully documented in the
modules.conf manual page.

        On systems utilizing modprobe.conf (or modprobe.conf.local),
an equivalent problem can occur. In this case, the following can be
added to modprobe.conf (or modprobe.conf.local, as appropriate), as
follows (all on one line; it has been split here for clarity):

install bonding /sbin/modprobe tg3; /sbin/modprobe e1000;
        /sbin/modprobe --ignore-install bonding

        When the bonding module is loaded, this executes the provided
command instead of performing the normal action. The command loads the
device drivers in the order needed, then calls modprobe with
--ignore-install to cause the normal action to then take place. Full
documentation on this can be found in the modprobe.conf and modprobe
manual pages.

8.3 Painfully Slow Or No Failed Link Detection By Miimon
--------------------------------------------------------

        By default, bonding enables the use_carrier option, which
instructs bonding to trust the driver to maintain carrier state.

        As discussed in the options section, above, some drivers do
not support the netif_carrier_on/_off link state tracking system.
With use_carrier enabled, bonding will always see these links as up,
regardless of their actual state.
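
        The interaction between use_carrier and the probing order from
section 7.3 can be modeled in a few lines. What follows is only an
illustrative Python sketch, not driver code: the function name
mii_link_up and its parameters (netif_carrier, mii_ioctl,
ethtool_glink) are hypothetical stand-ins for the driver's carrier
state, the MII register ioctl, and the ethtool ETHTOOL_GLINK request.
A probe that returns None is assumed to mean "the request itself
failed", as opposed to "carrier down".

```python
from typing import Callable, Optional

# Each probe returns True (link up), False (link down), or None when the
# request itself failed or is unsupported by the driver.
Probe = Callable[[], Optional[bool]]


def mii_link_up(use_carrier: int,
                netif_carrier: Callable[[], bool],
                mii_ioctl: Probe,
                ethtool_glink: Probe) -> bool:
    """Model of the link-state decision described in section 7.3."""
    if use_carrier == 1:
        # Trust whatever the driver reports via netif_carrier.
        return netif_carrier()
    state = mii_ioctl()
    if state is None:
        # Fall back only when the MII request failed outright, not when
        # it succeeded and reported carrier down.
        state = ethtool_glink()
    if state is None:
        # Both methods failed: the monitor assumes the link is up.
        return True
    return state


# Hypothetical driver that never updates netif_carrier, so it always
# looks "up" even though its MII registers report the unplugged cable.
def stuck_driver() -> bool:
    return True


print(mii_link_up(1, stuck_driver, lambda: False, lambda: None))  # True
print(mii_link_up(0, stuck_driver, lambda: False, lambda: None))  # False
print(mii_link_up(0, stuck_driver, lambda: None, lambda: None))   # True
```

        The first call shows the failure mode described above: with
use_carrier=1 the stale driver state hides the link loss, while
use_carrier=0 lets the register query report it.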

        Additionally, other drivers do support netif_carrier, but do
not maintain it in real time, e.g., only polling the link state at
some fixed interval. In this case, miimon will detect failures, but
only after some long period of time has expired. If it appears that
miimon is very slow in detecting link failures, try specifying
use_carrier=0 to see if that improves the failure detection time. If
it does, then it may be that the driver checks the carrier state at a
fixed interval, but does not cache the MII register values (so the
use_carrier=0 method of querying the registers directly works). If
use_carrier=0 does not improve the failover, then the driver may cache
the registers, or the problem may be elsewhere.

        Also, remember that miimon only checks for the device's
carrier state. It has no way to determine the state of devices on or
beyond other ports of a switch, or if a switch is refusing to pass
traffic while still maintaining carrier on.

9. SNMP agents
==============

        If running SNMP agents, the bonding driver should be loaded
before any network drivers participating in a bond. This requirement
is due to the interface index (ipAdEntIfIndex) being associated with
the first interface found with a given IP address. That is, there is
only one ipAdEntIfIndex for each IP address. For example, if eth0 and
eth1 are slaves of bond0 and the driver for eth0 is loaded before the
bonding driver, the interface for the IP address will be associated
with the eth0 interface. This configuration is shown below; the IP
address 192.168.1.1 has an interface index of 2, which indexes to eth0
in the ifDescr table (ifDescr.2).

     interfaces.ifTable.ifEntry.ifDescr.1 = lo
     interfaces.ifTable.ifEntry.ifDescr.2 = eth0
     interfaces.ifTable.ifEntry.ifDescr.3 = eth1
     interfaces.ifTable.ifEntry.ifDescr.4 = eth2
     interfaces.ifTable.ifEntry.ifDescr.5 = eth3
     interfaces.ifTable.ifEntry.ifDescr.6 = bond0
     ip.ipAddrTable.ipAddrEntry.ipAdEntIfIndex.10.10.10.10 = 5
     ip.ipAddrTable.ipAddrEntry.ipAdEntIfIndex.192.168.1.1 = 2
     ip.ipAddrTable.ipAddrEntry.ipAdEntIfIndex.10.74.20.94 = 4
     ip.ipAddrTable.ipAddrEntry.ipAdEntIfIndex.127.0.0.1 = 1

        This problem is avoided by loading the bonding driver before
any network drivers participating in a bond. Below is an example of
loading the bonding driver first; the IP address 192.168.1.1 is
correctly associated with ifDescr.2.

     interfaces.ifTable.ifEntry.ifDescr.1 = lo
     interfaces.ifTable.ifEntry.ifDescr.2 = bond0
     interfaces.ifTable.ifEntry.ifDescr.3 = eth0
     interfaces.ifTable.ifEntry.ifDescr.4 = eth1
     interfaces.ifTable.ifEntry.ifDescr.5 = eth2
     interfaces.ifTable.ifEntry.ifDescr.6 = eth3
     ip.ipAddrTable.ipAddrEntry.ipAdEntIfIndex.10.10.10.10 = 6
     ip.ipAddrTable.ipAddrEntry.ipAdEntIfIndex.192.168.1.1 = 2
     ip.ipAddrTable.ipAddrEntry.ipAdEntIfIndex.10.74.20.94 = 5
     ip.ipAddrTable.ipAddrEntry.ipAdEntIfIndex.127.0.0.1 = 1

        While some distributions may not report the interface name in
ifDescr, the association between the IP address and IfIndex remains
and SNMP functions such as Interface_Scan_Next will report that
association.

10. Promiscuous mode
====================

        When running network monitoring tools, e.g., tcpdump, it is
common to enable promiscuous mode on the device, so that all traffic
is seen (instead of seeing only traffic destined for the local host).
The bonding driver handles promiscuous mode changes to the bonding
master device (e.g., bond0), and propagates the setting to the slave
devices.

        For the balance-rr, balance-xor, broadcast, and 802.3ad modes,
the promiscuous mode setting is propagated to all slaves.
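
        As a rough illustration of the propagation rules in this
section (including the active-slave rule for the remaining modes,
covered next), the sketch below maps a bonding mode to the slaves that
would receive the promiscuous setting. It is a plain Python model, not
the driver's implementation; the function name promisc_targets and its
arguments are hypothetical.

```python
# Modes that propagate promiscuous mode to every slave.
ALL_SLAVES_MODES = {"balance-rr", "balance-xor", "broadcast", "802.3ad"}
# Modes that propagate it only to the currently active slave.
ACTIVE_ONLY_MODES = {"active-backup", "balance-tlb", "balance-alb"}


def promisc_targets(mode, slaves, active_slave=None):
    """Return the slaves that receive the bond's promiscuous setting."""
    if mode in ALL_SLAVES_MODES:
        return list(slaves)
    if mode in ACTIVE_ONLY_MODES:
        # Only the active slave is set; if it changes (e.g., after a
        # link failure), calling this again yields the new active slave.
        return [active_slave] if active_slave in slaves else []
    raise ValueError(f"unknown bonding mode: {mode}")


print(promisc_targets("802.3ad", ["eth0", "eth1"]))
print(promisc_targets("active-backup", ["eth0", "eth1"], "eth1"))
```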

        For the active-backup, balance-tlb and balance-alb modes, the
promiscuous mode setting is propagated only to the active slave.

        For balance-tlb mode, the active slave is the slave currently
receiving inbound traffic.

        For balance-alb mode, the active slave is the slave used as a
"primary." This slave is used for mode-specific control traffic, for
sending to peers that are unassigned or if the load is unbalanced.

        For the active-backup, balance-tlb and balance-alb modes, when
the active slave changes (e.g., due to a link failure), the
promiscuous setting will be propagated to the new active slave.

11. Configuring Bonding for High Availability
=============================================

        High Availability refers to configurations that provide
maximum network availability by having redundant or backup devices,
links or switches between the host and the rest of the world. The
goal is to provide the maximum availability of network connectivity
(i.e., the network always works), even though other configurations
could provide higher throughput.

11.1 High Availability in a Single Switch Topology
--------------------------------------------------

        If two hosts (or a host and a single switch) are directly
connected via multiple physical links, then there is no availability
penalty to optimizing for maximum bandwidth. In this case, there is
only one switch (or peer), so if it fails, there is no alternative
access to fail over to. Additionally, the bonding load balance modes
support link monitoring of their members, so if individual links fail,
the load will be rebalanced across the remaining devices.

        See Section 12, "Configuring Bonding for Maximum Throughput"
for information on configuring bonding with one peer device.

11.2 High Availability in a Multiple Switch Topology
----------------------------------------------------

        With multiple switches, the configuration of bonding and the
network changes dramatically.
In multiple switch topologies, there is a trade off between network
availability and usable bandwidth.

        Below is a sample network, configured to maximize the
availability of the network:

                |                                     |
                |port3                           port3|
          +-----+----+                          +-----+----+
          |          |port2       ISL      port2|          |
          | switch A +--------------------------+ switch B |
          |          |                          |          |
          +-----+----+                          +-----+----+
                |port1                           port1|
                |             +-------+               |
                +-------------+ host1 +---------------+
                         eth0 +-------+ eth1

        In this configuration, there is a link between the two
switches (ISL, or inter switch link), and multiple ports connecting to
the outside world ("port3" on each switch). There is no technical
reason that this could not be extended to a third switch.

11.2.1 HA Bonding Mode Selection for Multiple Switch Topology
-------------------------------------------------------------

        In a topology such as the example above, the active-backup and
broadcast modes are the only useful bonding modes when optimizing for
availability; the other modes require all links to terminate on the
same peer for them to behave rationally.

active-backup: This is generally the preferred mode, particularly if
        the switches have an ISL and play together well. If the
        network configuration is such that one switch is specifically
        a backup switch (e.g., has lower capacity, higher cost, etc),
        then the primary option can be used to ensure that the
        preferred link is always used when it is available.

broadcast: This mode is really a special purpose mode, and is suitable
        only for very specific needs. One example is when the two
        switches are not connected (no ISL) and the networks beyond
        them are totally independent. In this case, if it is
        necessary for some specific one-way traffic to reach both
        independent networks, then the broadcast mode may be suitable.

11.2.2 HA Link Monitoring Selection for Multiple Switch Topology
----------------------------------------------------------------

        The choice of link monitoring ultimately depends upon your
switch.
If the switch can reliably fail ports in response to other failures,
then either the MII or ARP monitors should work. For instance, in the
example above, if the "port3" link fails at the remote end, the MII
monitor has no direct means to detect this. The ARP monitor could be
configured with a target at the remote end of port3, thus detecting
that failure without switch support.

        In general, however, in a multiple switch topology, the ARP
monitor can provide a higher level of reliability in detecting end to
end connectivity failures (which may be caused by the failure of any
individual component to pass traffic for any reason). Additionally,
the ARP monitor should be configured with multiple targets (at least
one for each switch in the network). This will ensure that,
regardless of which switch is active, the ARP monitor has a suitable
target to query.

12. Configuring Bonding for Maximum Throughput
==============================================

12.1 Maximizing Throughput in a Single Switch Topology
------------------------------------------------------

        In a single switch configuration, the best method to maximize
throughput depends upon the application and network environment. The
various load balancing modes each have strengths and weaknesses in
different environments, as detailed below.

        For this discussion, we will break down the topologies into
two categories. Depending upon the destination of most