📄 rfc528.txt

   but the problem is easily understood to be of a general nature.  In
   fact, we recently had another network-wide failure that was traced to
   a hardware error that resulted in erroneous routing messages, after
   we had installed a software checksum on all inter-IMP transmissions.
   The problem we had was due to a single broken instruction in the
   part of the IMP program that builds the routing message.  As a
   result, the routing messages from that IMP were random data, and the
   neighboring IMPs interpreted these messages as routing update
   information.  When this happened, traffic flow through the Network
   was completely disrupted and no useful work could be done until the
   failed IMP was halted.

   This kind of problem, the introduction of incorrect routing
   information into the Network, can happen in three ways:

      *  The routing message is changed in transmission.  The inter-IMP
         checksum should catch this.  The bad routing messages we saw in
         the Network had good checksums.

      *  The routing message is changed as it is constructed, say by a
         memory or processor failure, or before it is transmitted.  This
         is what we termed above an intra-IMP failure.

McQuillan                                                       [Page 5]

RFC 528             SOFTWARE CHECKSUMMING IN THE IMP        20 June 1973

      *  The routing program is incorrect for hardware or software
         reasons.

   We have attempted to solve the last two kinds of problems by
   extending the concept of software checksums.  The routing program has
   been modified to build a software checksum for the routing message as
   it builds the message, just as if it came from a Host.  It is
   important that this checksum refer to the intended contents of the
   routing message, not the actual contents.
   That is, the program which generates the routing message builds its
   own software checksum as it proceeds, not by reading what has been
   stored in the routing message area, but by adding up the intended
   contents for each entry as it computes them.  The process which sends
   out routing messages then always verifies the checksum before
   transmitting them.  This scheme should detect all intra-IMP failures.

   Finally, the routing program itself can be checksummed to detect any
   changes in the code.  The programs which copy in received routing
   messages, compute new routing tables, and send out routing messages
   each calculate the checksum of the code before executing it.  If the
   program finds a discrepancy in the checksum of the program it is
   about to run, it immediately requests a program reload from an
   adjacent IMP.  These checksums include the checksum computation
   itself, the routing program and any constants referenced.  This
   modification should prevent a hardware failure at one IMP from
   affecting the Network at large by stopping the IMP before it does any
   damage in terms of spreading bad routing.  A version of the IMP
   program with this added protection for routing was released on
   May 22.

   In the first few months of 1973, there have been several other
   efforts aimed at improving the reliability of the Network, in
   addition to software checksumming in the IMPs.  At the same time that
   we were discovering inter-IMP failures with the software checksum
   packets, we began to notice a different kind of problem with
   intra-IMP failures.  In these cases we were primarily faced with
   memory problems, and they often affected the IMP program itself,
   rather than the packets flowing through the IMP.  Our first attack on
   this problem was to build a PDP-1 program to verify the running IMP
   and TIP programs at a site against the correct core images held at
   the PDP-1.
   The program interrogates the IMP with DDT messages, and prints out a
   list of discrepancies.  Using this program, we have already found
   memory failures at one site.

4. TIP Modifications

   The hardware difficulties which we began to experience during the
   first few months of 1973 had two effects on Host-to-Host
   communication.  First, the intermittent modem interface failures, of
   the type seen at Belvoir, Aberdeen, and ETAC, meant that messages
   were occasionally lost by the network.  This loss is reported to the
   transmitting Host by the "Incomplete Transmission" message generated
   by the source IMP; the Host must then decide whether to retransmit or
   to take some other action.  Second, the higher-than-normal incidence
   of machine failures meant that the network sometimes "partitioned" so
   that there was no path between the two communicating Hosts.  (It
   should be noted that, contrary to the original design, two sites are
   currently connected to the network by only a single path; other
   similar connections are planned.  For any such sites, any failure
   along the single path will be seen as a partition.)  Since a TIP acts
   as a Host for its users, its resilience when these types of failures
   occur has a major effect on user satisfaction.

   Prior to this time the TIP program "aborted" the user's connection if
   it received an Incomplete Transmission indication from the IMP
   program.  In March the TIP program (and the programs of several other
   Hosts) was changed to retransmit messages for which the Incomplete
   Transmission indication was returned; some Hosts (e.g. MULTICS) have
   done this from the start.  This modification has turned out to be
   relatively simple, and we urge other Hosts to consider implementing
   some sort of error recovery software.
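
   The retransmission change described above can be sketched as
   follows.  This is a minimal illustration, not the TIP program's
   actual logic: the Status values, the send_with_retry and
   make_flaky_link helpers, the transmit callback, and the retry limit
   are all hypothetical names introduced here.  The text only states
   that messages are retransmitted when the Incomplete Transmission
   indication is returned.

```python
from enum import Enum
from typing import Callable, Tuple


class Status(Enum):
    """Hypothetical outcomes reported back to the sending Host."""
    DELIVERED = "delivered"
    INCOMPLETE_TRANSMISSION = "incomplete transmission"


def send_with_retry(transmit: Callable[[bytes], Status],
                    message: bytes,
                    max_attempts: int = 5) -> Tuple[Status, int]:
    """Send `message`, retransmitting while the network reports
    Incomplete Transmission.

    Returns the final status and the number of attempts made.
    `max_attempts` is an assumed limit; the RFC does not say how
    often the TIP retries.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    status = Status.INCOMPLETE_TRANSMISSION
    attempts = 0
    while attempts < max_attempts:
        attempts += 1
        status = transmit(message)
        if status is not Status.INCOMPLETE_TRANSMISSION:
            break  # delivered, so nothing is left to retransmit
    return status, attempts


def make_flaky_link(losses: int) -> Callable[[bytes], Status]:
    """Simulated network that loses the first `losses` messages."""
    remaining = [losses]

    def transmit(_message: bytes) -> Status:
        if remaining[0] > 0:
            remaining[0] -= 1
            return Status.INCOMPLETE_TRANSMISSION
        return Status.DELIVERED

    return transmit
```

   With a simulated link that loses the first two messages,
   send_with_retry(make_flaky_link(2), b"hello") makes three attempts
   before reporting delivery.
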
   On the other hand, it has not seemed reasonable to continue
   attempting to transmit when the program receives a "Destination
   Unreachable" indication, since this could arise either from a network
   partition or from a failure at the destination site.  The interactive
   user is, of course, free to try again manually.

   A different situation pertains to tape transfers involving TIPs with
   the magnetic tape option.  In these cases, the user would like to
   start the process and then ignore it until the transfer is finished.
   Network partitions, even if infrequent, may occur when tape transfers
   many hours in length are in progress.  Therefore, we made a
   significant modification to the TIP magnetic tape option to include a
   sequencing mechanism in the tape transfer protocol which permits
   automatic recovery and transmission continuation after most kinds of
   network transients.  With this mechanism in effect, and assuming a
   tape is mounted at the "other end", the complete transfer of a tape
   is possible with a single command given at either end.  If the
   connection goes dead in mid-transfer, the TIP magnetic tape software
   will attempt to reopen the connection until successful and then
   continue the transfer from where it was left off.  In addition to
   modifying the TIP magnetic tape option as specified above, we also
   modified the TENEX program which is able to communicate with the TIP
   magnetic tape option so that it remained compatible.  These changes
   were installed in April.

5. Future Plans

   We have been considering some of the issues of network reliability
   discussed above in connection with the development of the new High
   Speed Modular IMP.
   This design effort and the experiences with the current IMP system
   are, of course, linked together, and we have already decided on
   several approaches to be taken in the new line of IMPs:

      *  The IMP will have a hardware CRC checksum generator which
         returns the checksum on a specified range of memory.

      *  The IMP will use this facility to generate and check an
         end-to-end checksum on messages.  This checksum will therefore
         be more comprehensive and better for error detection than the
         current software checksum.  It will ensure a high degree of
         reliability for Host transmissions.

      *  In addition, the IMP will perform a verification of a packet
         checksum at each hop to provide diagnostic information.  This
         check will be on an optional basis, whenever the system has
         available resources for the check.

      *  The code for the new IMP system will be read-only (this is
         impractical for the present 516 and 316 IMPs), and the program
         will periodically checksum itself using the hardware CRC
         generator.  We hope to design the program so that it can be
         reloaded in segments in the event of a detected error in the
         code, with no service interruption.

      *  Finally, we are looking into the structure of an optional
         IMP-Host/Host-IMP checksum to complete the Host/Host end-to-end
         checksum.  Under such an arrangement, the IMP and Host could
         agree to verify the checksums on the messages transferred over
         the interface between them, and the appropriate signalling
         mechanisms would be provided to handle errors.
         With this technique in effect, two Hosts could be certain that
         their messages were delivered error-free or else they would be
         notified of an error, and could then retransmit their message
         if desired.  More details on any such modifications to the IMP
         and to the IMP-Host interface will be published when
         appropriate.

             [This RFC was put into machine readable form for entry]
               [into the online RFC archives by Via Genie 12/1999]
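
   The self-checking arrangement planned in Section 5 (a CRC over a
   specified range of memory, used by the program to checksum itself so
   that only damaged segments need to be reloaded) can be sketched as
   below.  This is an illustration under stated assumptions, not the
   High Speed Modular IMP design: CRC-32 from Python's zlib module
   stands in for the hardware CRC, whose polynomial is not given here,
   and the segment size and the names crc_of_range,
   reference_checksums, and damaged_segments are hypothetical.

```python
import zlib
from typing import List

SEGMENT_SIZE = 256  # assumed segment length in bytes; not from the RFC


def crc_of_range(memory: bytes, start: int, end: int) -> int:
    """Checksum of memory[start:end].

    CRC-32 is a stand-in for the hardware CRC generator, whose
    polynomial the RFC does not specify.
    """
    return zlib.crc32(memory[start:end])


def reference_checksums(image: bytes) -> List[int]:
    """Per-segment checksums of a known-good program image."""
    return [crc_of_range(image, start, start + SEGMENT_SIZE)
            for start in range(0, len(image), SEGMENT_SIZE)]


def damaged_segments(memory: bytes, expected: List[int]) -> List[int]:
    """Indices of segments whose current checksum differs from the
    reference, i.e. the segments that would need to be reloaded."""
    return [index
            for index, want in enumerate(expected)
            if crc_of_range(memory,
                            index * SEGMENT_SIZE,
                            (index + 1) * SEGMENT_SIZE) != want]
```

   Flipping a single bit at offset 600 of a 1024-byte image makes
   damaged_segments report segment 2 only, so just that segment would
   be reloaded.
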