draft-ietf-dhc-failover-07.txt
        addresses be manually or permanently divided between servers.

   9.   Continue to meet the goals and objectives of this protocol in
        the event of server failure or network partition.

   10.  Provide graceful reintegration of full protocol service after
        server failure or network partition.

   11.  Allow for one computer to act as a secondary server for
        multiple primary servers.  The protocol must allow failover
        primary and secondary configuration choices to be made at a
        granularity smaller than "all of the subnets served by a single
        server", though individual implementations may choose not to
        allow such flexibility.

   12.  Ensure that an existing client can keep its existing IP address
        binding if it can communicate with either the primary or
        secondary DHCP server implementing this protocol, not just
        whichever server originally offered it the binding.

   13.  Ensure that a new client can get an IP address from some
        server.

        Ensure that in the face of partition, where servers continue to
        run but cannot communicate with each other, the above goals and
        requirements may be met.  In addition, when the partition
        condition is removed, allow graceful automatic re-integration
        without requiring human intervention.

   14.  If either primary or secondary server loses all of the
        information that it has stored in stable storage, ensure that
        it be able to refresh its stable storage from the other server.

   15.  Support load balancing between the primary and secondary
        servers, and allow configuration of the percentage of the
        client population served by each with a moderately fine
        granularity.

4.2.  Limitations of this protocol

   The following are explicit limitations of this protocol.

   1.   This protocol provides only one level of redundancy through a
        single secondary server for each primary server.

Droms, et. al.            Expires January 2001                 [Page 16]

Internet Draft          DHCP Failover Protocol                 July 2000

   2.   A subset of the address pool is reserved for secondary server
        use.

        In order to handle the failure case where both servers are able
        to communicate with DHCP clients, but unable to communicate
        with each other, a subset of the IP address pool must be set
        aside as a private address pool for the secondary server.  The
        secondary can use these to service newly arrived DHCP clients
        during such a period.  The required size of this private pool
        is based only on the arrival rate of new DHCP clients and the
        length of expected downtime, and is not influenced in any way
        by the total number of DHCP clients supported by the server
        pair.

        The failover protocol can be used in a mode where both the
        primary and secondary servers can share the load between them
        when both are operating.  In this load balancing mode, the
        addresses allocated by the primary server to the secondary
        server are not unused, but are used instead to service the
        portion of the client base to which the secondary server is
        required to respond.  See section 5.3 for more information on
        load balancing.

   3.   The primary and secondary servers do not respond to client
        requests at all while recovering from a failure that could have
        resulted in duplicate IP assignments (that is, while
        synchronizing in POTENTIAL-CONFLICT state).

5.  Protocol Overview

   This section will discuss the failover protocol at a relatively high
   level.  In the event that a description in this section conflicts
   (or appears to conflict due to the overview nature of this section)
   with information in later sections of this draft, the information in
   the later sections should be considered authoritative.

5.1.  Messages and States

   This protocol is centered around the message exchange used by one
   server to update the other server with binding database changes
   resulting from DHCP client activity:

   o Communication of binding database changes

      The binding update (BNDUPD) message is used to send the binding
      database changes to the partner server, and the partner server
      responds with a binding acknowledgement (BNDACK) message when it
      has successfully committed those changes to its own stable
      storage.

   All of the other messages involve ancillary issues:

   o Management of available IP addresses

      The pool request (POOLREQ) is used by the secondary server to
      request an allocation of IP addresses from the primary server.
      The pool response (POOLRESP) is used by the primary server to
      inform the secondary server how many IP addresses were allocated
      to the secondary server as the result of the pool request.

   o Synchronization of the binding databases between the servers after
     they've been out of communications

      The update request (UPDREQ) message is used by one server to
      request that its partner send it all binding database information
      that it has not already seen.

      The update request all (UPDREQALL) message is used by one server
      to request that all binding database information be sent in order
      to recover from a total loss of its binding database by the
      requesting server.

      The update done (UPDDONE) message is used by the responding
      server to indicate that all requested updates have been sent by
      the responding server and acked by the requesting server.

   o Connection establishment

      The connect (CONNECT) message is used by the primary server to
      establish a high level connection with the other server, and to
      transmit several important configuration data items between the
      servers.
      The connect acknowledgement message (CONNECTACK) is used by the
      secondary server to respond to a CONNECT message from the primary
      server.

      The disconnect (DISCONNECT) message is used by either server when
      closing a connection.

   o Server synchronization

      The state change (STATE) message is used by either server to
      inform the other server of a change of failover state.

   o Connection integrity management

      The contact (CONTACT) message is used by either server to ensure
      that the other server continues to see the connection as
      operational.  It MUST be transmitted periodically over every
      established connection if other message traffic is not flowing,
      and it MAY be sent at any time.

5.1.1.  Failover endpoints

   The proper operation of the failover protocol requires more than the
   transmission of messages between one server and the other.  Each
   endpoint might seem to be a single DHCP server, but in fact there
   are many situations where additional flexibility in configuration is
   useful.

   For instance, there might be several servers which are each primary
   for a distinct set of address pools, and one server which is
   secondary for all of those address pools.  The situation with the
   primaries is straightforward, but the secondary will need to
   maintain a separate failover state, partner state, and
   communications up/down status for each of the separate primary
   servers for which it is acting as a secondary.

   The failover protocol calls for there to be a unique failover
   endpoint per partner per role (where role is primary or secondary).
   This failover endpoint can take actions and hold unique states.
   There are thus a maximum of two failover endpoints per partner (one
   for the partner as a primary and one for that same partner as a
   secondary).
   Thus, in the case where there are two primary servers A and B each
   backed up by a single common secondary server C, there is one
   failover endpoint on each of A and B, and two different failover
   endpoints on C.  The two different failover endpoints on C each have
   unique states and independent TCP connections.

   This document frequently describes the behavior of the protocol in
   terms of primary and secondary servers, not primary and secondary
   failover endpoints.  However, it is important to remember that every
   'server' described in this document is in reality a failover
   endpoint that resides in a particular process, and that many
   failover endpoints may reside in the same process.

   It is not the case that there is a unique failover endpoint for each
   subnet address pool that participates in a failover relationship.
   On one server, there is one failover endpoint per partner per role,
   regardless of how many subnet address pools are managed by that
   combination of partner and role.  Conversely, on a particular
   server, any given subnet address pool will be associated with
   exactly one failover endpoint.

   When a connection is received from the partner, the unique failover
   endpoint to which the message is directed is determined solely by
   the IP address of the partner and the port to which the connection
   is directed by the partner.  See section 8.2.

5.2.  Fundamental guarantees

   There are several fundamental restrictions this protocol places on
   what one server can do in the absence of knowledge of the other
   server.  Operating within these restrictions allows certain
   guarantees to be made to the partner server, and these are key to
   the correct operation of the protocol.

5.2.1.  Control of lease time

   The key problem with lazy update is that when a server fails after
   updating a client with a particular lease time and before updating
   its partner, the partner will believe that a lease has expired even
   though the client still retains a valid lease on that IP address.

   In order to handle this problem, a period of time known as the
   "Maximum Client Lead Time" (MCLT) is defined and must be known to
   both the primary and secondary servers.  Proper use of this time
   interval places an upper bound on the difference allowed between the
   lease time provided to a DHCP client by a server and the lease time
   known by that server's partner.

   However, the MCLT is typically much less than the lease time that a
   server has been configured to offer a client, and so some strategy
   must exist to allow a server to offer the configured lease time to a
   client.  During a lazy update the updating server typically updates
   its partner with a potential expiration time which is longer than
   the lease time previously given to the client and which is longer
   than the lease time that the server has been configured to give a
   client.  This allows that server to give a longer lease time to the
   client the next time the client renews its lease, since the time
   that it will give to the client will not exceed the MCLT beyond the
   potential expiration time acknowledged by its partner.

   The PARTNER-DOWN state exists so that a server can be sure that its
   partner is, indeed, down.  Correct operation while in that state
   requires (generally) that the server wait the MCLT after anything
   that happened prior to its transition into PARTNER-DOWN state (or,
   more accurately, when the other server went down if that is known).
   Thus, the server MUST wait the MCLT after the partner server went
   down before allocating any of the partner's addresses which were
   available for allocation.
   In the event the partner was not in communication prior to going
   down, it might have allocated one or more of its FREE addresses to a
   DHCP client and been unable to inform the server entering
   PARTNER-DOWN prior to going down itself.  By waiting the MCLT after
   the time the partner went down, the server in PARTNER-DOWN state
   ensures that any clients which have a lease on one of the partner's
   FREE addresses will either time out or contact the server in
   PARTNER-DOWN by the time that period ends.

   In addition, once a server has transitioned to PARTNER-DOWN state,
   it MUST NOT reallocate an IP address from one client to another
   client until an additional MCLT interval after the lease by the
   original client expires.  (Actually, until the maximum client lead
   time after what it believes to be the lease expiration time of the
   client.)  Some optimizations exist for this restriction, in that it
   only
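
   The MCLT rules described in this section can be illustrated with a
   minimal sketch.  It is not part of the protocol specification.  The
   function names, the use of plain numeric timestamps in seconds, and
   the handling of a binding for which the partner has acknowledged
   nothing (or only a time already in the past) are assumptions made
   for illustration only.

```python
from typing import Optional


def latest_client_expiry(now: float,
                         acked_potential_expiry: Optional[float],
                         mclt: float) -> float:
    """Latest lease expiration a server may hand to a client.

    The lease given to the client must not run more than the MCLT
    beyond the potential expiration time acknowledged by the partner.
    """
    # Assumption (not stated in this section): if the partner has
    # acknowledged nothing, or only a time already in the past, the
    # MCLT is measured from the current time.
    base = now
    if acked_potential_expiry is not None:
        base = max(now, acked_potential_expiry)
    return base + mclt


def may_allocate_partner_free_address(now: float,
                                      partner_down_time: float,
                                      mclt: float) -> bool:
    """In PARTNER-DOWN: the partner's available addresses may be
    allocated only once the MCLT has passed since the partner went down."""
    return now >= partner_down_time + mclt


def may_reallocate_address(now: float,
                           believed_lease_expiry: float,
                           mclt: float) -> bool:
    """In PARTNER-DOWN: an address may move to another client only once
    an additional MCLT has passed after the lease expiration time this
    server believes the original client holds."""
    return now >= believed_lease_expiry + mclt
```

   For example, with an MCLT of 3600 seconds and no acknowledged
   potential expiration time, a client asking for a lease at time 1000
   could be given an expiration no later than 4600 under these
   assumptions, whatever lease time the server is configured to offer.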