📄 rfc992.txt

📁 RFC 的详细文档！
💻 TXT
📖 第 1 页 / 共 4 页
字号:
上一页 1 2 34
   at all clear how to use ethernet style broadcast mechanisms to optim-
   ize the performance of this sort of protocol, although it should be
   possible.  We view this as an interesting area for research.

   A forthcoming paper will describe higher level software that we are
   building on top of the basic fault-tolerant process group mechanism
   described above.

9. Conclusions

   The experience of implementing a substantial fault-tolerant system
   left us with insights into the properties to be desired from a com-
   munication subsystem.  In particular, we became convinced that to
   build a reliable distributed system, one must start with a reliable
   communication subsystem.  The multicast primitives described in this
   memo present a simple interface, achieve a high level of concurrency,
   can be used in both local and wide area networks, and are applicable
   to software ranging from distributed database systems to the fault-
   tolerant objects and bulletin boards provided by ISIS.  Because they
   are integrated with failure handling mechanisms and respect desired
   event orderings, they introduce a desirable form of determinism into



Birman & Joseph                                                [Page 14]

RFC 992                                                    November 1986


   distributed computation without compromising efficiency.  A conse-
   quence is that high-level algorithms are greatly simplified, reducing
   the probability of error.  We believe that this is a very promising
   and practical approach to building large fault-tolerant distributed
   systems, and it is the only one we know of that leads to a rigorous
   form of confidence in the resulting software.

NOTES:

   <1> A problem arises if a process p fails without receiving some mes-
   sage after that message has already been delivered to some other pro-
   cess q: q's VIEW when it received the message would show p to be
   operational; hence, q will assume that p received the message,
   although p is physically incapable of doing so.  However, the state
   of the system is now equivalent to one in which p did receive the
   message, but failed before acting on it.  In effect, there exists an
   interpretation of the actual system state that is consistent with q's
   assumption.  Thus, GBCAST satisfies the sort of logical delivery pro-
   perty cited in the introduction.



































Birman & Joseph                                                [Page 15]

RFC 992                                                    November 1986


10. References

[RFC966] Deering, S. and Cheriton, D.  Host groups: A multicast exten-
      sion to the internet protocol.  Stanford University, December
      1985.

[RFC988] Deering, S.  Host extensions for IP multicasting.  Stanford
      University, July 1986.

[Allchin] Allchin, J., McKendry, M.  Synchronization and recovery of
      actions.  Proc. 2nd ACM SIGACT/SIGOPS Principles of Distributed
      Computing, Montreal, Canada, 1983.

[Babaoglu] Babaoglu, O., Drummond, R.  The streets of Byzantium: Network
      architectures for fast reliable multicast.  IEEE Trans. on
      Software Engineering TSE-11, 6 (June 1985).

[Bernstein] Bernstein, P., Goodman, N.  Concurrency control algorithms
      for replicated database systems.  ACM Computing Surveys 13, 2
      (June 1981), 185-222.

[Birman-a] Birman, K.  Replication and fault-tolerance in the ISIS sys-
      tem.  Proc. 10th ACM SIGOPS Symposium on Operating Systems Princi-
      ples.  Orcas Island, Washington, Dec. 1985, 79-86.

[Birman-b] Birman, K., Joseph, T.  Reliable communication in the pres-
      ence of failures.  Dept. of Computer Science, Cornell Univ., TR
      85-694, Aug. 1985.  To appear in ACM TOCS (Feb. 1987).

[Birman-c] Birman, K., Joseph, T., Stephenson, P.  Programming with
      fault tolerant bulletin boards in asynchronous distributed sys-
      tems.  Dept. of Computer Science, Cornell Univ., TR 85-788, Aug.
      1986.

[Birrell] Birrell, A., Nelson, B.  Implementing remote procedure calls.
      ACM Transactions on Computer Systems 2, 1 (Feb. 1984), 39-59.

[Chang] Chang, J., Maxemchuck, M. Reliable multicast protocols.  ACM
      TOCS 2, 3 (Aug. 1984), 251-273.

[Cheriton] Cheriton, D. The V Kernel: A software base for distributed
      systems.  IEEE Software 1 12, (1984), 19-43.

[Cooper] Cooper, E. Replicated procedure call.  Proc. 3rd ACM Symposium
      on Principles of Distributed Computing., August 1984, 220-232.
      (May 1985).

[Cristian] Cristian, F. et al Atomic multicast: From simple diffusion to
      Byzantine agreement.  IBM Technical Report RJ 4540 (48668), Oct.
      1984.




Birman & Joseph                                                [Page 16]

RFC 992                                                    November 1986


[Eswaren] Eswaren, K.P., et al The notion of consistency and predicate
      locks in a database system.  Comm. ACM 19, 11 (Nov. 1976), 624-
      633.

[Hadzilacos] Hadzilacos, V.  Byzantine agreement under restricted types
      of failures (not telling the truth is different from telling of
      lies).  Tech. ARep. TR-19-83, Aiken Comp. Lab., Harvard University
      (June 1983).

[Halpern] Halpern, J., and Moses, Y.  Knowledge and common knowledge in
      a distributed environment.  Tech. Report RJ-4421, IBM San Jose
      Research Laboratory, 1984.

[Joseph-a] Joseph, T.  Low cost management of replicated data.  Ph.D.
      dissertation, Dept. of Computer Science, Cornell Univ., Ithaca
      (Dec. 1985).

[Joseph-b] Joseph, T., Birman, K.  Low cost management of replicated
      data in fault-tolerant distributed systems.  ACM TOCS 4, 1 (Feb
      1986), 54-70.

[Lamport] Lamport, L.  Time, clocks, and the ordering of events in a
      distributed system.  CACM 21, 7, July 1978, 558-565.

[Lazowska] Lazowska, E. et al The architecture of the EDEN system.
      Proc. 8th Symposium on Operating Systems Principles, Dec. 1981,
      148-159.

[Liskov] Liskov, B., Scheifler, R. Guardians and actions: Linguistic
      support for robust, distributed programs.  ACM TOPLAS 5, 3 (July
      1983), 381-404.

[Moss] Moss, E.  Nested transactions: An approach to reliable, distri-
      buted computing.  Ph.D. thesis, MIT Dept of EECS, TR 260, April
      1981.

[Papadimitrou] Papadimitrou, C.  The serializability of concurrent data-
      base updates.  JACM 26, 4 (Oct. 1979), 631-653.

[Popek] Popek, G. et al.  Locus: A network transparent, high reliability
      distributed system.  Proc. 8th Symposium on Operating Systems
      Principles, Dec. 1981, 169-177.

[Schlicting] Schlicting, R, Schneider, F.  Fail-stop processors: An
      approach to designing fault-tolerant distributed computing sys-
      tems.  ACM TOCS 1, 3, August 1983, 222-238.

[Schneider] Schneider, F., Gries, D., Schlicting, R.  Reliable multicast
      protocols.  Science of computer programming 3, 2 (March 1984).

[Skeen-a] Skeen, D.  Determining the last process to fail.  ACM TOCS 3,



Birman & Joseph                                                [Page 17]

RFC 992                                                    November 1986


      1, Feb. 1985, 15-30.

[Skeen-b] Skeen, D.  A reliable multicast protocol.  Unpublished.

[Spector] Spector, A., et al  Distributed transactions for reliable sys-
      tems.  Proc. 10th ACM SIGOPS Symposium on Operating Systems Prin-
      ciples, Dec. 1985, 127-146.

[Strong] Strong, H.R., Dolev, D. Byzantine agreement. Digest of papers,
      Spring Compcon 83, San Francisco, CA, March 1983, 77-81.












































Birman & Joseph                                                [Page 18]
上一页 1 2 34
⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -