📄 rfc956.txt
字号:
RFC 956 September 1985Algorithms for Synchronizing Network Clocks3. Clustering Algorithms Another method for developing a maximum-likelihood estimator is through the use of clustering algorithms. These algorithms operate to associate points in a sample set with clusters on the basis of stochastic properties and are most useful when large numbers of samples are available. One such algorithm operates on a sample set to selectively discard points presumed outside the cluster as follows: 1. Start with a sample set of n observations {x(1),x(2),...,x(n) 2. Compute the mean of the n observations in the sample set. Discard the single sample x(i) with value furthest from the mean, leaving n-1 observations in the set. 3. Continue with step 2 until only a single observation is left, at which point declare its value the maximum-likelihood estimator. This algorithm will usually (but not necessarily) converge to the desired result if the majority of observations are the result of "good" clocks, which by hypothesis are clustered about zero offset relative to the reference clock, with the remainder scattered randomly over the observation interval. The following Table 3 summarizes the results of this algorithm applied to the UDP data shown in Appendix A, which represents the measured clock offsets of 163 host clocks in the Internet system. These data were assembled using the UDP Time protocol [7], in which time is represented to a precision of one second. Each line of the table represents the result of step 2 above along with the size of the sample set and its (unweighted) mean and variance. The "Discard" column shows the value of the observation discarded at that step.Mills [Page 6]RFC 956 September 1985Algorithms for Synchronizing Network Clocks Size Mean Var Discard ------------------------------- 163 -210 9.1E+6 -38486 162 26 172289 3728 161 3 87727 3658 160 -20 4280 -566 150 -17 1272 88 100 -18 247 -44 50 -4 35 8 20 -1 0 -2 19 -1 0 -2 18 -1 0 -2 17 -1 0 1 16 -1 0 -1 15 -1 0 -1 14 -1 0 -1 13 0 0 0 1 0 0 0 Table 3. Clustering Algorithm using UDP Time Protocol Data In Table 3 only a few of the 163 steps are shown, including those near the beginning which illustrate a rapid convergence as the relatively few outliers are discarded. The large outlier discarded in the first step is almost certainly due to equipment or operator failure. The two outliers close to one hour discarded in the next two steps are probably simple operator mistakes like entering summer time instead of standard time. By the time only 50 samples are left, the error has shrunk to about 4 sec and the largest outlier is within 12 sec of the estimate. By the time only 20 samples are left, the error has shrunk to about a second and the variance has vanished for practical purposes. The following Table 4 summarizes the results of the clustering algorithm applied to the ICMP data shown in Appendix A, which represents the measured clock offsets of 504 host clocks in the Internet system. These data were assembled using ICMP Timestamp messages [6], in which time is represented to a precision of one millisecond. The columns in Table 4 should be interpreted in the same way as in Table 3, except that the data in Table 4 are in milliseconds, while the data in Table 3 are in seconds.Mills [Page 7]RFC 956 September 1985Algorithms for Synchronizing Network Clocks Size Mean Var Discard ------------------------------- 504 -3.0E+6 3.2E+14 8.6E+7 500 -3.3E+6 2.9E+14 8.6E+7 450 -1.6E+6 3.0E+13 -2.5E+7 400 29450 2.2E+11 3.6E+6 350 -3291 4.1E+9 -185934 300 3611 1.6E+9 -95445 250 2967 6.8E+8 66743 200 4047 2.3E+8 39288 150 1717 8.6E+7 21346 100 803 1.9E+7 10518 80 1123 8.4E+6 -4863 60 1119 3.1E+6 4677 50 502 1.5E+6 -2222 40 432 728856 2152 30 84 204651 -987 20 30 12810 338 15 28 2446 122 10 7 454 49 8 -2 196 24 6 -9 23 0 4 -10 5 -13 2 -8 0 -8 Table 4. Clustering Algorithm using ICMP Timestamp Data As in Table 3 above, only some of the 504 steps are shown in Table 4. The distinguishing feature of the data in Table 4 is that the raw data are much more noisy - only some 30 host clocks are closer than one second from the reference clock, while half were further than one minute and over 100 further than one hour from it. The fact that the algorithm converged to within 8 msec of the reference time under these conditions should be considered fairly remarkable in view of the probability that many of the outliers discarded are almost certainly due to defective protocol implementations. Additional information on these experiments is presented in Appendix A.Mills [Page 8]RFC 956 September 1985Algorithms for Synchronizing Network Clocks4. Application to Time-Synchronization Data A variation of the above algorithms can be used to improve the offset estimates from a single clock by discarding noise samples produced by occasional retransmissions in the network, for example. A set of n independent samples is obtained by polling the clock. Then, a majority-subset table is used to compute the m(j) and s(j) statistics and the maximum-likelihood estimate determined as above. For this purpose the majority-subset table could include larger subsets as well. In this manner the maximum-likelihood estimates from each of several clocks can be determined and used in the algorithm above. In order to test the effectiveness of this algorithm, a set of experiments was performed using two WIDEBAND/EISN gateways equipped with WWVB radio clocks and connected to the ARPANET. These experiments were designed to determine the limits of accuracy when comparing these clocks via ARPANET paths. One of the gateways (ISI-MCON-GW) is located at the Information Sciences Institute near Los Angeles, while the other (LL-GW) is located at Lincoln Laboratories near Boston. Both gateways consist of PDP11/44 computers running the EPOS operating system and clock-interface boards with oscillators phase-locked to the WWVB clock. The clock indications of the WIDEBAND/EISN gateways were compared with the DCNet WWVB reference clock using ICMP Timestamp messages, which record the individual timestamps with a precision of a millisecond. However, the path over the ARPANET between these gateways and the measurement host can introduce occasional measurement errors as large as several seconds. In principle the effect of these errors can be minimized by using a large sample population; however, use of the above algorithms allows higher accuracies to be obtained with far fewer samples. Measurements were made separately with each of the two gateways by sending an ICMP Timestamp Request message from the ARPANET address of DCN1 to the ARPANET address of the gateway and computing the round-trip delay and clock offset from the ICMP Timestamp Reply message. This process was continued for 1000 message exchanges, which took from seven minutes to several hours, depending on the sample interval selected. The tables below summarize the results of both the majority-subset and clustering algorithms applied to the data from three experiments, one with ISI-MCON-GW and two with LL-GW. The ISI-MCON-GW and LL-GW (a) experiments were designed to determine the limits of accuracy when using a continuous sequence of request/reply volleys, whichMills [Page 9]RFC 956 September 1985Algorithms for Synchronizing Network Clocks resulted in over two samples per second. The remaining LL-GW (b) experiment was designed to determine the limits of accuracy using a much lower rate of about one sample every ten seconds. For each of the three experiments two tables are shown, one using the majority-subset algorithm and the other the clustering algorithm. The two rows of the majority-subset tables show the statistics derived both from the raw data and from the filtered data processed by a C(5,3) majority-subset algorithm. In all cases the extrema and variance are dramatically less for the filtered data than the raw data, lending credence to the conjecture that the mean statistic for the filtered data is probably a good maximum-likelihood estimator of the true offset. Mean Var Max Min -------------------------------------------- Raw data 637 2.1E+7 32751 -1096 C(5,3) -15 389 53 -103 Table 5. ISI-MCON-GW Majority-Subset Algorithm Size Mean Var Discard ------------------------------- 1000 637 2.1E+7 32751 990 313 1.0E+7 32732 981 15 1.0E+6 32649 980 -18 2713 -1096 970 -15 521 -122 960 -15 433 -97 940 -15 332 -75 900 -15 239 26 800 -15 141 12 700 -16 87 5 600 -17 54 -31 500 -16 33 -5 400 -18 18 -9 300 -19 7 -12 200 -19 2 -21 100 -18 0 -19 1 -17 0 -17 Table 6. ISI-MCON-GW Clustering AlgorithmMills [Page 10]RFC 956 September 1985
⌨️ 快捷键说明
复制代码
Ctrl + C
搜索代码
Ctrl + F
全屏模式
F11
切换主题
Ctrl + Shift + D
显示快捷键
?
增大字号
Ctrl + =
减小字号
Ctrl + -