more important, as it can induce systematic trends in packet transit times measured by comparing timestamps produced by the two clocks.

These distinctions arise because for Internet measurement what is often most important are differences in time as computed by comparing the output of two clocks. The process of computing the difference removes any error due to clock inaccuracies with respect to true time; but it is crucial that the differences themselves accurately reflect differences in true time.

Measurement methodologies will often begin with the step of assuring that two clocks are synchronized and have minimal skew and drift. {Comment: An effective way to assure these conditions (and also clock accuracy) is by using clocks that derive their notion of time from an external source, rather than only from the host computer's clock. (The latter are often subject to large errors.) It is further preferable that the clocks directly derive their time, for example by having immediate access to a GPS (Global Positioning System) unit.}

Two important concerns arise if the clocks indirectly derive their time using a network time synchronization protocol such as NTP:

+  First, NTP's accuracy depends in part on the properties (particularly delay) of the Internet paths used by the NTP peers, and these might be exactly the properties that we wish to measure, so it would be unsound to use NTP to calibrate such measurements.

+  Second, NTP focuses on clock accuracy, which can come at the expense of short-term clock skew and drift. For example, when a host's clock is indirectly synchronized to a time source, if the synchronization intervals occur infrequently, then the host will sometimes be faced with the problem of how to reconcile its current, incorrect time, Ti, with a considerably different, more accurate time it has just learned, Ta. Two general ways in which this is done are to either immediately set the current time to Ta, or to adjust the local clock's update frequency (hence, its skew) so that at some point in the future the local time Ti' will agree with the more accurate time Ta'. The first mechanism introduces discontinuities and can also violate the common assumption that timestamps are monotone increasing. If the host's clock is set backward in time, this can sometimes be easily detected; if the clock is set forward in time, this can be harder to detect. The skew induced by the second mechanism can lead to considerable inaccuracies when computing differences in time, as discussed above.

To illustrate why skew is a crucial concern, consider samples of one-way delays between two Internet hosts made at one minute intervals. The true transmission delay between the hosts might plausibly be on the order of 50 ms for a transcontinental path. If the skew between the two clocks is 0.01%, that is, 1 part in 10,000, then after 10 minutes of observation the error introduced into the measurement is 60 ms. Unless corrected, this error is enough to completely wipe out any accuracy in the transmission delay measurement.

Finally, we note that assessing skew errors between unsynchronized network clocks is an open research area. (See [Pa97] for a discussion of detecting and compensating for these sorts of errors.) This shortcoming makes the use of a solid, independent clock source such as GPS especially desirable.
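To make the arithmetic of this example concrete, the following minimal Python sketch shows how a constant relative skew accumulates into measurement error. (The values are those from the example above; nothing here is prescribed by the framework.)

   # Minimal sketch: error in measured one-way delay induced by a
   # constant relative skew between the sender's and receiver's clocks.
   # Values are taken from the example in the text above.

   true_delay = 0.050   # true one-way delay, in seconds (50 ms)
   skew = 1e-4          # relative skew: 0.01%, i.e., 1 part in 10,000

   for minutes in (1, 5, 10):
       elapsed = minutes * 60            # seconds since the clocks agreed
       error = skew * elapsed            # accumulated timestamp error
       measured = true_delay + error     # apparent one-way delay
       print(f"{minutes:2d} min: error = {error * 1000:5.1f} ms, "
             f"measured delay = {measured * 1000:6.1f} ms")

After 10 minutes the accumulated error (60 ms) exceeds the true 50 ms delay itself, which is why uncorrected skew can wipe out the measurement's accuracy.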
10.2. The Notion of "Wire Time"

Internet measurement is often complicated by the use of Internet hosts themselves to perform the measurement. These hosts can introduce delays, bottlenecks, and the like that are due to hardware or operating system effects and have nothing to do with the network behavior we would like to measure. This problem is particularly acute when timestamping of network events occurs at the application level.

In order to provide a general way of talking about these effects, we introduce two notions of "wire time". These notions are defined only in terms of an Internet host H observing an Internet link L at a particular location:

+  For a given packet P, the 'wire arrival time' of P at H on L is the first time T at which any bit of P has appeared at H's observational position on L.

+  For a given packet P, the 'wire exit time' of P at H on L is the first time T at which all the bits of P have appeared at H's observational position on L.

Note that intrinsic to the definition is the notion of where on the link we are observing. This distinction is important because for large-latency links, we may obtain very different times depending on exactly where we observe the link. We could allow the observational position to be an arbitrary location along the link; however, we define it in terms of an Internet host because we anticipate in practice that, for IPPM metrics, all such timing will be constrained to be performed by Internet hosts, rather than by specialized hardware devices that might be able to monitor a link at locations where a host cannot. This definition also takes care of the problem of links composed of multiple physical channels: because these channels are not visible at the IP layer, they cannot be individually observed in terms of the above definitions.

It is possible, though one hopes uncommon, that a packet P might make multiple trips over a particular link L, due to a forwarding loop. These trips might even overlap, depending on the link technology. Whenever this occurs, we define a separate wire time associated with each instance of P seen at H's position on the link. This definition is worth making because it serves as a reminder that notions like *the* unique time a packet passes a point in the Internet are inherently slippery.

The term "wire time" has historically been used loosely to denote the time at which a packet appeared on a link, without exactly specifying whether this refers to the first bit, the last bit, or some other consideration. This informal definition is often already quite useful, as it is usually used to distinguish when a packet's delays begin, and cease, to be due to the network rather than the endpoint hosts.

When appropriate, metrics should be defined in terms of wire times rather than host endpoint times, so that the metric's definition highlights the issue of separating delays due to the host from those due to the network.

We note that one potential difficulty when dealing with wire times concerns IP fragments. It may be the case that, due to fragmentation, only a portion of a particular packet passes by H's location. Such fragments are themselves legitimate packets and have well-defined wire times associated with them; but the larger IP packet corresponding to their aggregate may not.
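To illustrate how the two wire times relate in the simplest case, here is a minimal Python sketch. It assumes a single-channel link of constant bit rate whose bits pass the observational position back to back; the rate and packet size below are hypothetical values chosen only for illustration:

   # Minimal sketch: on a single-channel link of constant bit rate,
   # once the first bit of a packet reaches the observational
   # position, the remaining bits follow at the link rate, so the
   # wire exit time trails the wire arrival time by the packet's
   # serialization time. (This does not hold for multi-channel
   # links, as noted in the text above.)

   def wire_exit_time(wire_arrival_time, packet_bytes, link_bps):
       """Wire exit time at the same observational position, assuming
       a constant-rate, single-channel link."""
       return wire_arrival_time + (packet_bytes * 8) / link_bps

   # A 1500-byte packet observed on a 10 Mb/s link: the last bit
   # appears 1.2 ms after the first.
   delta = wire_exit_time(0.0, 1500, 10e6)
   print(f"exit - arrival = {delta * 1000:.3f} ms")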
We also note that these notions have not, to our knowledge, previously been defined in exact terms for Internet traffic. Consequently, we may find with experience that these definitions require some adjustment in the future.

{Comment: It can sometimes be difficult to measure wire times. One technique is to use a packet filter to monitor traffic on a link. The architecture of these filters often attempts to associate with each packet a timestamp as close to the wire time as possible. We note however that one common source of error is to run the packet filter on one of the endpoint hosts. In this case, it has been observed that some packet filters receive, for some packets, timestamps corresponding to when the packet was *scheduled* to be injected into the network, rather than when it actually was *sent* out onto the network (wire time). There can be a substantial difference between these two times. A technique for dealing with this problem is to run the packet filter on a separate host that passively monitors the given link. This can be problematic, however, for some link technologies. See [Pa97] for a discussion of the sorts of errors packet filters can exhibit.

Finally, we note that packet filters will often capture only the first fragment of a fragmented IP packet, due to the use of filtering on fields in the IP and transport protocol headers. As we generally desire our measurement methodologies to avoid the complexity of creating fragmented traffic, one strategy for dealing with fragments detected by a packet filter is to flag that the measured traffic has an unusual form and abandon further analysis of the packet timing.}

11. Singletons, Samples, and Statistics

With experience we have found it useful to introduce a separation between three distinct -- yet related -- notions:

+  By a 'singleton' metric, we refer to metrics that are, in a sense, atomic. For example, a single instance of "bulk throughput capacity" from one host to another might be defined as a singleton metric, even though the instance involves measuring the timing of a number of Internet packets.

+  By a 'sample' metric, we refer to metrics derived from a given singleton metric by taking a number of distinct instances together. For example, we might define a sample metric of one-way delays from one host to another as an hour's worth of measurements, each made at Poisson intervals with a mean spacing of one second.

+  By a 'statistical' metric, we refer to metrics derived from a given sample metric by computing some statistic of the values defined by the singleton metric on the sample. For example, the mean of all the one-way delay values on the sample given above might be defined as a statistical metric.

By applying these notions of singleton, sample, and statistic in a consistent way, we will be able to reuse lessons learned about how to define samples and statistics on various metrics. The orthogonality among these three notions will thus make all our work more effective and more intelligible to the community.

In the remainder of this section, we will cover some topics in sampling and statistics that we believe will be important to a variety of metric definitions and measurement efforts.
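The three notions nest naturally, as the following minimal Python sketch illustrates. (The names and values are hypothetical; measure_one_way_delay() stands in for a real measurement procedure.)

   import random
   import statistics

   def measure_one_way_delay():
       # Singleton: one atomic one-way delay measurement, in seconds.
       # (Here a made-up value; a real implementation would timestamp
       # an actual test packet.)
       return 0.050 + random.expovariate(1 / 0.005)

   # Sample: an hour's worth of singletons taken together. (A real
   # sample would be spaced at Poisson intervals; see Section 11.1.1.)
   sample = [measure_one_way_delay() for _ in range(3600)]

   # Statistic: a single value computed over the sample, here the mean.
   mean_delay = statistics.mean(sample)
   print(f"mean one-way delay: {mean_delay * 1000:.1f} ms")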
11.1. Methods of Collecting Samples

The main reason for collecting samples is to see what sort of variations and consistencies are present in the metric being measured. These variations might be with respect to different points in the Internet, or different measurement times. When assessing variations based on a sample, one generally makes the assumption that the sample is "unbiased", meaning that the process of collecting the measurements did not skew the sample so that it no longer accurately reflects the metric's variations and consistencies.

One common way of collecting samples is to make measurements separated by fixed amounts of time: periodic sampling. Periodic sampling is particularly attractive because of its simplicity, but it suffers from two potential problems:

+  If the metric being measured itself exhibits periodic behavior, then there is a possibility that the sampling will observe only part of the periodic behavior if the periods happen to agree (either directly, or if one is a multiple of the other). Related to this problem is the notion that periodic sampling is easily anticipated. Predictable sampling is susceptible to manipulation if there are mechanisms by which a network component's behavior can be temporarily changed such that the sampling sees only the modified behavior.

+  The act of measurement can perturb what is being measured (for example, injecting measurement traffic into a network alters the congestion level of the network), and repeated periodic perturbations can drive a network into a state of synchronization (cf. [FJ94]), greatly magnifying what might individually be minor effects.

A sounder approach is based on "random additive sampling": samples are separated by independent, randomly generated intervals that have a common statistical distribution G(t) [BM92]. The quality of this sampling depends on the distribution G(t). For example, if G(t) generates a constant value g with probability one, then the sampling reduces to periodic sampling with a period of g.

Random additive sampling gains significant advantages. In general, it avoids synchronization effects and yields an unbiased estimate of the property being sampled. Its only significant drawbacks are:

+  it complicates frequency-domain analysis, because the samples do not occur at fixed intervals as assumed by Fourier-transform techniques; and

+  unless G(t) is the exponential distribution (see below), the sampling still remains somewhat predictable, as discussed for periodic sampling above.

11.1.1. Poisson Sampling
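Random additive sampling with an exponential G(t) is exactly Poisson sampling: successive intervals are independent and exponentially distributed, so new samples cannot be anticipated. As a minimal sketch of how such sample times can be generated (the function name is hypothetical; the one-second mean spacing matches the sample metric example in Section 11):

   import random

   # Minimal sketch: random additive sampling with an exponential
   # G(t), i.e., Poisson sampling. Successive intervals are
   # independent and exponentially distributed.

   def poisson_sample_times(duration, mean_spacing=1.0):
       """Yield sampling times in [0, duration), separated by
       independent exponential intervals with the given mean."""
       t = random.expovariate(1.0 / mean_spacing)
       while t < duration:
           yield t
           t += random.expovariate(1.0 / mean_spacing)

   # Example: schedule an hour's worth of measurements with a mean
   # spacing of one second (roughly 3600 sample times).
   times = list(poisson_sample_times(3600.0))
   print(f"{len(times)} sample times generated")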