      |              |/
      +--------------+ SQK level (70%)
      |              |\
      |              | \ datagrams SQed but forwarded if SQK level
      |              | / exceeded & SQLW or lower not yet reached
      |              |/
      +--------------+ SQLW level (50%)
      |              |\
      |              | \
      |              |  \
      |              |   \ datagrams forwarded
      |              |   /
      |              |  /
      |              | /
      |              |/
      +--------------+

Description of the Test Model

   We needed some way of testing our algorithm and its various
   parameters.  It was important to check the interaction between IP
   with the SQuID algorithm and TCP.  We also wanted to try various
   combinations of retransmission strategy and source quench strategy,
   which required control of the entire test network.  We therefore
   decided to build an Internet model.

Prue & Postel                                                   [Page 5]

RFC 1016      Source Quench Introduced Delay -- SQuID          July 1987

   Using this example configuration for illustration:

    _______     LAN      _______     WAN      _______     LAN      _______
   |   1   |            |   2   |            |   3   |            |   4   |
   |TCP/IP |---10 Mb/s--|  IP   |---56 kb/s--|  IP   |---10 Mb/s--|TCP/IP |
   |_______|            |_______|            |_______|            |_______|

   A program was written in C which created queues and structures to
   put on the queues representing datagrams carrying data,
   acknowledgments and SQs.  The program moved datagrams from one queue
   to the next based upon the rules defined below.

   A client fed the TCP in node 1 data at the rate it would accept.
   The TCP function in node 1 would chop the data up into fixed
   512-byte datagrams for transmission to the IP in node 1.  When a
   datagram was given to IP for transmission, a timestamp was put on it
   and a copy of it was put on a TCP ack-wait queue (data sent but not
   yet acknowledged).  In particular, TCP assumed that once it handed
   data to IP, the data was sent immediately for purposes of
   retransmission timeouts, even though our algorithm has IP add delay
   before transmission.

   Each IP node had one queue in each direction (left and right).  Each
   IP in the model would forward datagrams at the rate of the
   communications line going to the next node.
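   As a rough check on the forwarding rates in this model, the time to
   clock one datagram onto a line can be worked out from the datagram
   size and the line speed.  The sketch below is in Python rather than
   the C of the original program, which is not reproduced here.  The
   name transmit_ms is hypothetical, and the calculation assumes that
   1 kb/s means 1000 bits per second and that only the 512 data bytes
   are counted (header overhead is ignored).

```python
def transmit_ms(size_bytes: int, line_bps: float) -> float:
    """Time to clock one datagram onto a line, in milliseconds.

    Assumption: only the payload bytes are counted; header overhead
    and propagation delay are ignored.
    """
    return size_bytes * 8 / line_bps * 1000.0


DATAGRAM_BYTES = 512  # fixed datagram size used by the model

wan_ms = transmit_ms(DATAGRAM_BYTES, 56_000)       # 56 kb/s WAN line
lan_ms = transmit_ms(DATAGRAM_BYTES, 10_000_000)   # 10 Mb/s LAN line

print(f"56 kb/s line: {wan_ms:.1f} msec per datagram")
print(f"10 Mb/s line: {lan_ms:.2f} msec per datagram")
```

   Under these assumptions the 56 kb/s line takes roughly 73 msec per
   datagram, while an unshared 10 Mb/s LAN takes well under 1 msec, so
   the WAN link is the bottleneck of the path.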
   Thus the fifth datagram on IP 2's queue going right would take
   5 x 73 msec, or 365 msec, before it would appear at the end of
   IP 3's queue.  The time to process each datagram was considered to
   be less than the time it took for the data to be sent over the
   56 kb/s lines, and therefore to be done during those transmission
   times, so it was not included in the model.  For the LAN
   communications this is not the case, but since the LANs were not at
   the bottleneck of the path this processing time was ignored.
   However, because LAN communications typically share bandwidth, the
   LAN bandwidth available to each IP instance was considered to be
   1 Mb/s, a crude approximation.

   When the data arrived at node 4 it was immediately given to the TCP
   receive function, which validated the sequence number.  If the
   datagram was in sequence, it was turned into an ack datagram and
   sent back to the source.  An ack datagram carries no data and moves
   the right edge of the window to one window size past the sequence
   number of the data just acked.  The ack datagram is assumed to be
   1/8 of the length of a data datagram and thus can be transmitted
   from one node to the next 8 times faster.  If the sequence number is
   less than expected (a retransmission due to a missed ack) then the
   datagram is also turned into an ack.  A datagram with a larger
   sequence number is queued indefinitely until the missing datagrams
   are received.

   We also modeled the gateway source quench algorithm.  When a
   datagram was put on an IP queue, the number of datagrams on the
   queue was compared to an SQ keep level (SQK).  If it was greater, an
   SQ was generated and returned to the sender.  If it was larger than
   the SQ toss (SQT) level, the datagram was also discarded.  Once SQs
   were generated they would continue to be sent until the queue level
   went below the SQ Low Water (SQLW) level, which was below the
   original SQK level.  These percentages were modifiable, as were many
   parameters.
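   The keep/toss/low-water rule described above can be sketched as a
   small queue class.  This is a minimal illustration in Python, not
   the original C program: the class and method names are hypothetical,
   the default levels (a 15-datagram queue with SQK 70%, SQT 95% and
   SQLW 50%) are the values the text reports using, and counting the
   arriving datagram as part of the queue depth is an assumption.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class GatewayQueue:
    """One direction of an IP node's queue (hypothetical sketch)."""
    max_q: int = 15      # maximum queue length in datagrams
    sqk: float = 0.70    # SQ keep level: send SQ but still forward
    sqt: float = 0.95    # SQ toss level: send SQ and discard
    sqlw: float = 0.50   # SQ low water: stop sending SQs below this
    queue: deque = field(default_factory=deque)
    quenching: bool = False

    def enqueue(self, dgram):
        """Offer a datagram; return (queued, send_sq)."""
        # Assumption: the depth includes the datagram being offered.
        depth = len(self.queue) + 1
        if depth > self.sqk * self.max_q:
            self.quenching = True       # above SQK: start sending SQs
        elif depth < self.sqlw * self.max_q:
            self.quenching = False      # below SQLW: stop sending SQs
        if depth > self.sqt * self.max_q:
            return (False, True)        # above SQT: toss and send SQ
        self.queue.append(dgram)
        return (True, self.quenching)

    def dequeue(self):
        """Forward the datagram at the head of the queue, if any."""
        return self.queue.popleft() if self.queue else None


gw = GatewayQueue()
results = [gw.enqueue(n) for n in range(15)]
print(results)
```

   Between the SQLW and SQK levels the outcome depends on history: a
   queue that is filling does not send SQs there, while a queue that is
   draining from above SQK keeps sending them until it falls below
   SQLW.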
   An SQ could be lost if it exceeded the maximum queue size (MaxQ),
   but a source quench was never sent about tossing a source quench.

   Upon each transition from one node to the next, a datagram was
   vulnerable to loss due to errors.  The loss rate could be set as M
   losses out of N datagrams sent; thus the model allowed for
   multi-datagram loss bursts or single datagram losses.  We used a
   single datagram loss rate of 1 lost datagram per 300 datagrams sent
   for much of our testing.  While this may seem low for Internet
   simulation, remember that it does not include losses due to
   congestion.

   Some network parameters we used were a maximum queue length of 15
   datagrams per IP direction, left and right.  We started sending SQs
   when the queue was 70% full (the SQK level); tossed data datagrams,
   but not SQ datagrams, when the queue was 95% full (the SQT level);
   and stopped SQing when the 50% SQLW level was reached (see above).
   We ignored additional SQs for 2 seconds after receipt of one SQ.
   This was done because some Internet nodes send only one SQ for every
   20 datagrams they discard, even though our model sent an SQ for
   every datagram discarded.  Other IP nodes may send one SQ per
   discarded packet.  The SQuID algorithm needed a way to handle both
   types of SQ generation.  We therefore treated one SQ or a burst of
   SQs as a single event and incremented our D by a larger amount than
   would be appropriate for responding individually to the multiple SQs
   of the verbose nodes.

   The simulation did not do any fragmenting of datagrams.  Silly
   window syndrome was avoided.  The model did not implement or
   simulate the TTL (time-to-live) function.

   The model allowed for a flexible topology definition with many TCP
   source/destination pairs on host IP nodes or gateway IP nodes, with
   various windows allowed.  An IP node could have any number of TCPs
   assigned to it.  Each line could have an individually set speed.
   Any TCP could send to any other TCP.  The routing from one location
   to another was fixed.
   Therefore datagrams did not arrive out of sequence.  However,
   datagrams regularly arrived in ascending but not consecutive order
   because of datagram losses.  Datagrams going "left" through a node
   did not affect the queue size, or SQ chances, of data going "right"
   through the node.

   The TCP retransmission timer algorithm used an Alpha of 0.15 and a
   Beta of 1.5.  The test was run without the benefit of the more
   sophisticated retransmission timer algorithm proposed by Van
   Jacobson [5].

   The program would either display the queue sizes of the various IP
   nodes and the TCP under test as time passed, or do a crude plot of
   various parameters of interest including SRTT, perceived round trip
   time, throughput, and the critical queue size.

   As we observed the effects of various algorithms for responding to
   SQ, we adapted our model to better react to SQ.  Initial tests
   showed that if we incremented slowly and decremented quickly we
   observed oscillations around the correct value, but more of the time
   was spent overdriving the network, thus losing datagrams, than at a
   value which helped the congestion situation.

   A significant problem is the delay between the time when some
   intermediate node starts dropping datagrams and sending source
   quenches and the time when the source quenches arrive at the source
   host and can begin to affect the behavior at the data source.
   Because of this, and the possibility that an IP might send only one
   SQ for each 20 datagrams lost, we decided that the increase in D per
   source quench should be substantial (for example, D should increase
   by 20 msec for every source quench), and the decrease with time
   should be very slow (for example, D should decrease 1 msec every
   second).  Note that this is the opposite of the behavior suggested
   in an early draft by one of the authors.
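   The adjustment of D described above can be sketched as follows.
   This is a minimal illustration in Python under stated assumptions,
   not the original C program: the class and method names are
   hypothetical, the 20 msec increment and 1 msec per second decrement
   are the example values given in the text, the 2-second window for
   ignoring further SQs is the value mentioned earlier, and modeling
   the decrease as a continuous decay rather than discrete 1 msec steps
   is a simplification.

```python
from dataclasses import dataclass


@dataclass
class DelayController:
    """Tracks the introduced inter-datagram delay D (hypothetical sketch)."""
    d_ms: float = 0.0            # current introduced delay D
    inc_ms: float = 20.0         # increase in D per accepted SQ
    dec_ms_per_s: float = 1.0    # slow decrease of D over time
    holdoff_s: float = 2.0       # ignore further SQs for this long
    last_update_s: float = 0.0
    last_sq_s: float = float("-inf")

    def _decay(self, now_s: float) -> None:
        # Assumption: continuous decay, never letting D go negative.
        elapsed = now_s - self.last_update_s
        self.d_ms = max(0.0, self.d_ms - self.dec_ms_per_s * elapsed)
        self.last_update_s = now_s

    def on_source_quench(self, now_s: float) -> float:
        """Handle an SQ received at time now_s (seconds); return D."""
        self._decay(now_s)
        if now_s - self.last_sq_s >= self.holdoff_s:
            self.d_ms += self.inc_ms    # one SQ or a burst counts once
            self.last_sq_s = now_s
        return self.d_ms

    def current(self, now_s: float) -> float:
        """Return D at time now_s after applying the slow decrease."""
        self._decay(now_s)
        return self.d_ms


ctl = DelayController()
first = ctl.on_source_quench(10.0)    # accepted: D rises by 20 msec
burst = ctl.on_source_quench(10.5)    # inside the holdoff: ignored
later = ctl.on_source_quench(12.5)    # holdoff expired: accepted
print(first, burst, later, ctl.current(22.5))
```

   With these values a single accepted SQ takes about 20 seconds of
   quiet to decay away, which reflects the substantial increase and
   very slow decrease chosen in the text.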
   However, when many source quenches are received in a short time
   period (for example, when a source quench is received for every
   datagram dropped), the D value is increased excessively.  To prevent
   D from growing too large, we decided to ignore subsequent source
   quenches for a time (for example, 2 seconds) once we had increased
   D.

   Tests were run with only one TCP sending data to learn as much as
   possible about how an unperturbed session might run.  Other test
   runs would introduce and eliminate competing traffic dynamically
   between other TCP instances on the various nodes to see how the
   algorithms reacted to changes in network load.

   A potential flaw in the model is that the defined TCPs with open
   windows always tried to forward data.  The clients feeding them data
   never paused to think about what they were going to type, never got
   swapped out in favor of other applications, and never turned the
   session around logically to listen to the other end for more user
   commands.  In other words, all of the simulated TCP sessions were
   doing file transfers.

   The model was defined to allow many mixes of competing algorithms
   for responding to SQ.  It allowed comparing effective throughput
   between TCPs with small windows and large windows, and between those
   whose IP would introduce inter-datagram delays and those that
   totally ignored SQ.  It also allowed comparisons with various
   inter-datagram increment amounts and decrement amounts.  Because of
   the number of possible configurations and parameter combinations,
   only a few combinations of parameters were tested.  It is hoped they
   were the most appropriate ones upon which to concentrate.

Observed Results

   All of our algorithms oscillate, some worse than others.  If we put
   in just the right amount of introduced delay we seem to get the best
   throughput.  But finding the right amount is not easy.

   Throughput is heavily and adversely affected by a single lost
   datagram, at least in the short term.
   Examine what happens when a window is 35 datagrams wide with an
   average round trip delay of 2500 msec using 512-byte datagrams when
   a single datagram is lost at the beginning.  Thirty-five datagrams
   are given by TCP to IP and a timer is started on the first datagram.
   Since the first datagram is missing, the receiving TCP will not send
   an acknowledgment but will buffer all 34 of the out-of-sequence
   datagrams.  After 1.5 x 2500 msec, or 3750 msec, have elapsed the
   datagram times out and is resent.  It arrives and is acked, along
   with the other 34, 2500 msec later.  Before the lost datagram we
   might have been sending at the average rate a 56 kb/s line could
   accept, about one every 75 msec.  After loss of the datagram we send
   at the rate of one in 6250 msec, over 83 times slower.

   If the lost datagram in the above example is other than the first
   datagram, the situation becomes the same once all of the datagrams
   before the lost datagram are acknowledged.  The example therefore
   holds true for any single lost datagram in the window.

   When SQ doesn't always cause datagram loss, the sender continues to
   send too fast (queue size oscillates a lot).  It is important for
   the SQ to cause feedback into the sending system as soon as
   possible.  Therefore, when the source host IP receives an SQ it must
   adjust the send rate for the datagrams still on the send queue, not
   just for datagrams IP is requested to send after the SQ.

   Through-network delay goes up as the network queue lengths go up.

   Window size affects the chance of getting SQed.  Look at our model
   above using a queue level of 15 for node 2 before SQs are generated