📄 rfc677.txt
字号:
Network Working Group Paul R. Johnson (BBN-TENEX)RFC # 677 Robert H. Thomas (BBN-TENEX)NIC # 31507 January 27, 1975 The Maintenance of Duplicate DatabasesPreface:This RFC is a working paper on the problem of maintaining duplicateddatabases in an ARPA-like network. It briefly discusses the generalduplicate database problem, and then outlines in some detail a solutionfor a particular type of duplicate database. The concepts developedhere were used in the design of the User Identification Database for theTIP user authentication and accounting system. We believe that theseconcepts are generally applicable to distributed database problems.Johnson & Thomas [Page 1]RFC 677 The Maintenance of Duplicate Databases January 1975Introduction There are a number of motivations for maintaining redundant,duplicate copies of databases in a distributed network environment. Twoimportant motivations are: - to increase reliability of data access. The accessibility of critical data can be increased by redundantly maintaining it. The database used for TIP login and accounting is redundantly distributed to achieve highly reliable access. - to insure efficiency of data access. Data can be more quickly and efficiently accessed when it is "near" the accessing process. A copy of the TIP user ID database is maintained at each site supporting the TIP login service to insure rapid, efficient access. (Reliability considerations dictate that this database be redundantly maintained, and efficiency considerations dictate that a copy be maintained at each authentication site.) The design of a system to maintain redundant, duplicate databases isa challenging task because of the inherent communication delay betweencopies of the database, as well as the real world constraints of systemcrashes, operator error, communication channel failure, etc. This paperdiscusses some of the problems we encountered in designing such asystem, and outlines a system design for maintaining a particular typeof database which solves those problems.The Model A system for supporting duplicate copies of a database can bemodeled by a group of independent database management processes (DBMPs)each maintaining its own copy of the database. These processescommunicate with each other over network communication paths. Each DBMPhas complete control over its copy of the database. It handles allaccesses to and modifications of the database in response to requestsfrom other processes. Though the DBMPs act only upon requests, in thefollowing they will often be said to be actually causing or originatingthe modifications. An important design consideration is that the communication pathsbetween the DBMPs are subject to failures. Thus one DBMP may have itsinteractions with other DBMPs interrupted and/or have to wait untilcommunication paths are re-established before it can communicate withother DBMPs. An assumption made in this paper about the communicationJohnson & Thomas [Page 2]RFC 677 The Maintenance of Duplicate Databases January 1975paths is that messages from one process to another are delivered in thesame order that they are sent. This is true of the ARPANET. For networksthat make no such guarantee, communication protocols between the DBMPscan be used to correctly order the messages. In order to proceed further, it is necessary to be more preciseabout the nature of the duplicated database and the operations allowedon it. A constant, read-only database is at one extreme. The task of theDBMPs is trivial in this case. They simply respond to value retrievalrequests. At the other extreme is a general shared database wherefunctional modification requests (such as "X := f(X,Y,Z)") are allowedand/or where it is necessary to completely restrict access to entrieswhile they are being modified. In this case all the problems of shareddatabases on a single computer system arise (e.g., the need forsynchronization mechanisms and the resulting potential deadlocksituations), as well as those unique to having multiple database copiesdistributed among independent computers. For example, a completelygeneral system must deal with the possibility of communication failureswhich cause the network to become partitioned into two or more sub-networks. Any solution which relies on locking an element of thedatabase for synchronized modification must cope with the possibility ofprocesses in non-communicating sub-networks attempting to lock the sameelement. Either they both must be allowed to do so (which violates thelock discipline), or they both must wait till the partition ceases(which may take arbitrarily long), or some form of centralized orhierarchical control must be used, with a resulting dependency of someDBMPs on others for all modifications and perhaps accesses as well.The Database The type of database to be examined in this paper can be representedas a collection of entries which are (Selector,Value) pairs. Eachselector is unique and the values are atomic entities as far as theDBMPs are concerned. The mechanisms to be presented may be extended tohandle databases with greater structure - where the values maythemselves be collections of (selector,value) pairs - but this extensionwill not be considered further here.Four operations are to be allowed on the database: 1) Selection - given a selector, the current associated value is returned. 2) Assignment - a selector and a value are given and the given value replaces the old value associated with the selector.Johnson & Thomas [Page 3]RFC 677 The Maintenance of Duplicate Databases January 1975 3) Creation - a new selector and an initial value are given and a new (selector,initial value) entry is added to the database. 4) Deletion- a selector is given and the existing (selector,value) entry is removed from the database.Note that value modification is limited to assignment. Functionalmodification requests - such as "Change X to be Factorial(X)" - arespecifically ruled out. Allowing them would force the use of system widesynchronization interlocks.Consistency The extent to which the copies of the database can be kept"identical" must be examined. Because of the inherent delay incommunications between DBMPs, it is impossible to guarantee that thedata bases are identical at all times. Rather, our goal is to guaranteethat the copies are "consistent" with each other. By this we mean thatgiven a cessation of update activity to any entry, and enough time foreach DBMP to communicate with all other DBMPs, then the state of thatentry (its existence and value) will be identical in all copies of thedatabase.Timestamps We permit modifications to the database to originate at any of theDBMPs maintaining it. These changes must, of course, be communicated tothe other DBMPs. To insure consistency, all of the DBMPs must make thesame decision as to which modification to a particular entry is to beconsidered "final". It is desirable to select the "most recent" change.However, since there is no way to absolutely determine the time sequenceof events in a distributed system without a universal, always accessiblesequence number generator (a network time standard should suffice),"most recent" can only be approximated. We accomplish the approximationby associating a timestamp with each modification and with each entry,the latter being the timestamp of the modification which set its currentvalue.(1) Since the uniqueness of timestamps given out at more than one----------(1) Time is useful in this context because it has the desired propertiesof being monotonically increasing, and of being available with areasonable degree of accuracy. Any other sequence numbering scheme withthese properties can be used, "time-of-day" was chosen because it issimple to obtain. Its main faults are that it is often manually set (andthus prone to error), and it may stop during system serviceJohnson & Thomas [Page 4]RFC 677 The Maintenance of Duplicate Databases January 1975interruptions. As computer systems learn to adapt to a networkenvironment, the use of an independent time source should become morecommon. This is beginning to happen with the TENEX sites on the ARPANET.DBMP can not be guaranteed, a "DBMP of origin" is included as part ofeach timestamp. By (arbitrarily) ordering the DBMPs, we thus have ameans of uniquely ordering timestamps. Each timestamp is a pair (T,D): T is a time, D is a DBMP identifier. For twotimestamps (T1,D1) and (T2,D2) we have the following: (T1,D1)>(T2,D2) <=> (T1>T2) or (T1=T2 and D1>D2) (T1,D1) is said to be "more recent" than (T2,D2) If D1=D2 and T1=T2 it is assumed that the two modifications are really two copies of the same modification request. In order to insure the uniqueness of timestamps, it is necessarythat each individual DBMP associate different times with differentmodifications. This is certainly possible to do, though the fineness ofthe unit of time may restrict the frequency of modifications at a singleDBMP site. Each entry in the data base is now a triple: E ::= (S,V,T), where S is the selector V is the associated value T is the timestamp (a Time,DBMP pair) of the last change to the entry The task of each DBMP is to keep its copy of the database up-to-date, given the information on modifications that it has received sofar. At the same time it must insure that each of its entries staysconsistent with those of all the other DBMPs. This must be done despitethe fact that the order in which it receives modifications may be verydifferent from the order in which they are received by other DBMPs. Inthe remainder of this paper we examine the allowable database operationswith respect to their effect on DBMP operation.Assignment Consider the case of assignment to an existing entry. When theassignment is initiated (by a person or process) the DBMP involved makesthe change locally, and creates a copy of the modified entry and anassociated list of DBMPs to which the change must be sent. As themodification is delivered to the other DBMPs, they are removed from thelist until no DBMPs remain. The copy of the modification is thendeleted. This distribution mechanism must be error free (i.e., receiptJohnson & Thomas [Page 5]
⌨️ 快捷键说明
复制代码
Ctrl + C
搜索代码
Ctrl + F
全屏模式
F11
切换主题
Ctrl + Shift + D
显示快捷键
?
增大字号
Ctrl + =
减小字号
Ctrl + -