<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN">
<html lang="en">
<head>
<title>Winsock Programmer's FAQ: Articles</title>
<link rel="Stylesheet" type="text/css" href="../faq.css">
</head>
<body bgcolor="#ffffee" text="#000000" link="#491e00" vlink="#7d2e01" alink="#da7417">
<!--  ---- Header Bar ----  -->
<table border="0" width="95%" bgcolor="#006000" cellpadding="5" cellspacing="3" align="center">
	<tr>
		<td align="left" bgcolor="#e0e0c0">
			<font size="2" face=Verdana,Arial,Helvetica>
				<b><a href="../articles/io-strategies.html">&lt;&lt;</a></b>
			</font>
		</td>
		<td align="center">
			<font face=Verdana,Arial,Helvetica color="#ffffee">
				<p align=center class=bigger3><b>
				Winsock Programmer's FAQ<br>
				Section 7: Articles<br>
				</b></p>
			</font>
		</td>
		<td align="right" bgcolor="#e0e0c0">
			<font size="2" face=Verdana,Arial,Helvetica>
				<b><a href="../articles/lame-list.html">&gt;&gt;</a></b>
			</font>
		</td>
	</tr>
</table>
<!--  ---- Body Table ----  -->
<table width="95%" border="0" cellpadding="10">
	<tr valign="top">
		<td>
<H3>How to Use TCP Effectively</H3>
<p><I>by Warren Young</I></p>
<p>Newcomers to network programming almost always run into problems
early on where it looks like the network or the TCP/IP stack is munging
your data. This usually comes as quite a shock, because the newcomer
is usually told just before this that TCP is a reliable data transport
protocol. In fact, TCP and Winsock are quite reliable if you use them
properly.
This tutorial will discuss the most common problems people
come across when learning to use TCP.</p>
<h4>Problem 1: Packets Are Illusions</h4>
<p>This problem comes up in various guises:</p>
<ul>
<li>"My client program sent 100 bytes, but the server program only
got 50."
<li>"My client program sent several small packets, but the server program
received one large packet."
<li>"How can I find out how many bytes are waiting on a given socket,
so I can set up a receive buffer for the size of the packet?"
</ul>
<p>I think that understanding this issue is one of TCP/IP's rites of
passage.</p>
<p>The core concept that you must grasp is that TCP is a <i>stream</i>
protocol. This means that if you send 100 bytes, the receiving end could
receive all 100 bytes at once, or 100 separate single bytes, or four
25-byte chunks. Or, the receiver could receive that 100-byte block plus
some data from the previous send and some from the succeeding send.</p>
<p>So, you ask, how can you make a program receive whole packets only? The
easiest method, in my experience, is to prefix each packet with a length
value. For example, you could prefix every packet with a 2-byte unsigned
integer that tells how long the packet is. (See Problem 2 for advice on
properly sending integers across a network.) Length prefixes are most
effective when the data in each protocol packet has no particular
structure, such as raw binary data.
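</p>
<p>The sending side of a length-prefixed scheme can be sketched as
follows. This is a minimal illustration rather than part of Winsock:
<code>frame_packet()</code> is a hypothetical helper name, and the sketch
assumes the 2-byte prefix goes out most significant byte first, which is
the order the receiving pseudocode in this section reassembles.</p>
<pre>
    #include &lt;cstdint&gt;
    #include &lt;stdexcept&gt;
    #include &lt;vector&gt;

    // Hypothetical helper: returns the payload preceded by a 2-byte
    // length prefix, most significant byte first.
    std::vector&lt;unsigned char&gt; frame_packet(
            const std::vector&lt;unsigned char&gt;&amp; payload)
    {
        // A 2-byte unsigned prefix cannot describe more than 65535 bytes
        if (payload.size() &gt; 0xFFFF) {
            throw std::length_error("payload too long for 2-byte prefix");
        }
        const std::uint16_t length =
                static_cast&lt;std::uint16_t&gt;(payload.size());

        std::vector&lt;unsigned char&gt; frame;
        frame.reserve(payload.size() + 2);
        frame.push_back(static_cast&lt;unsigned char&gt;(length &gt;&gt; 8));   // high byte
        frame.push_back(static_cast&lt;unsigned char&gt;(length &amp; 0xFF)); // low byte
        frame.insert(frame.end(), payload.begin(), payload.end());
        return frame;
    }
</pre>
<p>The returned buffer would then be handed to <code>send()</code>. Its
return value deserves the same attention as that of <code>recv()</code>,
because it can report fewer bytes than you asked it to send.</p>
<p>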
Here is some C++-like pseudocode
showing a common algorithm for receiving length-prefixed packets, where
<code>sd</code> is a connected socket:</p>
<pre>
    int new_bytes_read = 0;

    // Read in the prefix, most significant byte first
    int packet_size = 0, prefix_bytes_read = 0;
    const int prefix_size = 2;
    do {
        char temp_buffer[prefix_size];
        new_bytes_read = recv(sd, temp_buffer,
                prefix_size - prefix_bytes_read, 0);
        if (new_bytes_read &gt;= 1) {
            for (int i = 0; i &lt; new_bytes_read; ++i) {
                packet_size &lt;&lt;= 8;
                // Cast so that bytes above 0x7F are not sign-extended
                packet_size |= (unsigned char)temp_buffer[i];
            }
            prefix_bytes_read += new_bytes_read;
        }
        else {
            // 0 means the peer closed the connection, SOCKET_ERROR
            // means failure: handle it and stop reading...
        }
    } while (prefix_bytes_read &lt; prefix_size);

    // Allocate the buffer
    char* packet_buffer = new char[packet_size];

    // Read in the packet; a zero-length packet skips the loop
    int packet_bytes_read = 0;
    while (packet_bytes_read &lt; packet_size) {
        new_bytes_read = recv(sd, packet_buffer + packet_bytes_read,
                packet_size - packet_bytes_read, 0);
        if (new_bytes_read &gt;= 1) {
            packet_bytes_read += new_bytes_read;
        }
        else {
            // Handle the closed connection or error as above...
        }
    }
</pre>
<p>Notice how we loop on <code>recv()</code> both for the length prefix
and for the remainder of the packet.</p>
<p>Another method for setting up packets on top of a stream protocol is
called "delimiting". Each packet you send in such a scheme is followed
by a unique delimiter. The trick is to think of a good delimiter;
it must be a character or string of characters that will <i>never</i>
occur inside a packet. Some good examples of delimited protocols are
NNTP, POP3, SMTP and HTTP, all of which use a carriage-return/line-feed
("CRLF") pair as their delimiter. Delimiting generally only works well
with text-based protocols, because by design they limit themselves to
a subset of all the legal characters; that leaves plenty of possible
delimiters to choose from.
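</p>
<p>Delimiting can be sketched along the same lines. The helper
<code>extract_lines()</code> is a hypothetical name used only for
illustration, not a Winsock call. The idea is that the caller appends
whatever each <code>recv()</code> returns to a pending buffer, and the
helper hands back every complete CRLF-terminated packet while leaving any
partial one in the buffer for the next call.</p>
<pre>
    #include &lt;string&gt;
    #include &lt;vector&gt;

    // Hypothetical helper: moves every complete CRLF-terminated line out
    // of `pending` and returns the lines without their delimiters. Bytes
    // after the last CRLF stay in `pending` until more data arrives.
    std::vector&lt;std::string&gt; extract_lines(std::string&amp; pending)
    {
        static const std::string delimiter = "\r\n";
        std::vector&lt;std::string&gt; lines;
        std::string::size_type start = 0;
        std::string::size_type end;
        while ((end = pending.find(delimiter, start)) != std::string::npos) {
            lines.push_back(pending.substr(start, end - start));
            start = end + delimiter.size();
        }
        pending.erase(0, start);    // drop only what was consumed
        return lines;
    }
</pre>
<p>Because the delimiter itself can be split across two
<code>recv()</code> calls, the search always runs over the accumulated
buffer rather than over the newest chunk alone.</p>
<p>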
Note that several of the above-mentioned
protocols also have aspects of length-prefixing: HTTP, for example,
sends a "Content-Length:" header in its replies.</p>
<p>Of these two methods, I prefer length-prefixing, because delimiting
requires your program to blindly read until it finds the end of the
packet, whereas length-prefixing lets the program start dealing with the
packet just as soon as the length prefix comes in. On the other hand,
delimiting schemes lend themselves to flexibility if you design the
protocol like a computer language, though this implies that your
protocol's parsers will be complex.</p>
<p>There are a couple of other concerns for properly handling packets
atop TCP. First, always check the return value of <code>recv()</code>,
which indicates how many bytes it placed in your buffer
<img src="../bitmaps/waist-dot.gif" alt="--" width=14 height=6 hspace=2>
it may well return fewer bytes than you expect. Second, don't try
to <a href="lame-list.html#item12">peek</a> into the Winsock stack's
buffers to see if a complete packet has arrived. For various reasons,
peeking causes problems. Instead, read all the data directly into your
application's buffers and process it there.</p>
<h4>Problem 2: Byte Ordering</h4>
<p>You have undoubtedly noticed all the <code>ntohs()</code>
and <code>htonl()</code> calls required in Winsock programming,
but you might not know <i>why</i> they are required. The reason is
that there are two major ways of storing integers on a computer:
<a href="http://www.netmeg.net/jargon/terms/b/big-endian.html">big-endian</a>
and
<a href="http://www.netmeg.net/jargon/terms/l/little-endian.html">little-endian</a>.
Big-endian numbers are stored with the most significant byte in the lowest
memory location ("big-end first"), whereas little-endian systems reverse
this. (There are even bizarre "middle-endian" systems!)
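</p>
<p>To make the two layouts concrete, here is a small sketch that writes
a 32-bit integer into a byte buffer big-end first and reads it back,
whatever order the host CPU uses internally. The helper names are
hypothetical, and choosing big-end first as the order on the wire is an
assumption made for illustration. For 16- and 32-bit integers, the
<code>htons()</code>/<code>htonl()</code> and
<code>ntohs()</code>/<code>ntohl()</code> calls mentioned above do the
conversion for you.</p>
<pre>
    #include &lt;cstdint&gt;

    // Hypothetical helper: store a 32-bit value big-end first. Shifting
    // works on the value, not on its memory layout, so the result does
    // not depend on the byte order of the host.
    void put_u32_big_endian(std::uint32_t value, unsigned char out[4])
    {
        out[0] = static_cast&lt;unsigned char&gt;(value &gt;&gt; 24);
        out[1] = static_cast&lt;unsigned char&gt;(value &gt;&gt; 16);
        out[2] = static_cast&lt;unsigned char&gt;(value &gt;&gt; 8);
        out[3] = static_cast&lt;unsigned char&gt;(value);
    }

    // Hypothetical helper: rebuild the value from four bytes that were
    // stored big-end first.
    std::uint32_t get_u32_big_endian(const unsigned char in[4])
    {
        return (static_cast&lt;std::uint32_t&gt;(in[0]) &lt;&lt; 24) |
               (static_cast&lt;std::uint32_t&gt;(in[1]) &lt;&lt; 16) |
               (static_cast&lt;std::uint32_t&gt;(in[2]) &lt;&lt; 8) |
                static_cast&lt;std::uint32_t&gt;(in[3]);
    }
</pre>
<p>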
Obviously two
computers must agree on a common number format if they are to communicate,
so the TCP/IP specification defines a "network byte order" that the
headers (and thus Winsock) all use.</p>
<p>The end result is, if you are sending bare integers as part of your
network protocol, and the receiving end is on a platform that uses a
different integer representation, it will perceive the data as garbled. To
fix this, follow the lead of TCP and use network byte order,
always.</p>
<p>The same principles apply to other platform-specific data
formats, such as floating-point values. Winsock does not
define functions to create platform-neutral representations of
data other than integers, but there is a protocol called the
<a href="http://www/Public/programming/rfcs/useful.html#rfc1832">External
Data Representation</a> (XDR) which does handle this. XDR formalizes
a platform-independent way for two computers to send each other
various types of data. XDR is simple enough that you can probably
implement it yourself; alternatively, you might take a look at the
<a href="../resources/libraries.html">Libraries</a> page to find libraries
that implement the XDR protocol.</p>
<p>For what it's worth, network byte order is big-endian, though you
should never take advantage of this fact. Some programmers working on
big-endian machines ignore byte ordering issues, but this is bad style,
if for no other reason than that it creates bad habits that can
bite you later. Other interesting trivia: the most common little-endian
machines are the Intel x86 and the Digital Alpha. Almost everything else,
including the Motorola 680x0, the Sun SPARC and the MIPS Rx000, is
big-endian. Oddly enough, there are a few "bi-endian" devices that can
operate in either mode, like the PowerPC and the HP PA-RISC 8000.
Most
PowerPCs always run in big-endian mode, however, and I suspect that the
same is true of the PA-RISC.</p>
<h4>Problem 3: Structure Padding</h4>
<p>To illustrate the structure padding problem, consider this C
declaration:</p>
<pre>
    struct foo {
        char a;
        int b;
        char c;
    } foo_instance;
</pre>
<p>Assuming 32-bit <code>int</code>s, you might guess that the structure
occupies 6 bytes. The problem is, many compilers "pad" structures so
that every data member is aligned on a 4-byte boundary. Compilers do this
because modern CPUs can fetch data from properly aligned memory locations
more quickly than from nonaligned memory. With 4-byte padding on the above
structure, it would actually take up 12 bytes. This issue rears its head
when you try to send a structure over Winsock whole, like this:</p>
<pre>
    send(sd, (char*)&amp;foo_instance, sizeof(foo_instance), 0);
</pre>
<p>Unless the receiving program was compiled on the same machine
architecture with the same compiler and the same compiler options,
you have no guarantee that the other machine will interpret the data
correctly.</p>
<p>The solution is to always send structures "packed" by sending the
data members one at a time. Or, you can force your compiler to pack
the structures for you. Visual C++ can do this with the <code>/Zp</code>
command line option or the <code>#pragma pack</code> directive, and Borland
C++ can do this with the <code>-a</code> command line option. Keep the byte
ordering problem in mind, however: if you send a packed structure in
place, be sure to convert each multi-byte member to network byte order
before you send it.</p>
<h4>Conclusion</h4>
<p>The moral of the story is, trust Winsock to send your data correctly,
but don't trust that it works the way you think that it ought to!</p>
<p><font size=-1>Copyright &copy; 1998, 1999 by Warren Young.
All rights
reserved.</font></p>
		</td>
	</tr>
</table>
<!--  ---- Document Footer ----  -->
<hr noshade size=1 color=#404040>
<table cellpadding=5 cellspacing=0 border=0 width=95% align=center>
	<tr>
		<td align=left>
		    <a href="../articles/io-strategies.html">&lt;&lt; Which I/O Strategy Should I Use?</a>
		</td>
		<td align=right>
		    <a href="../articles/lame-list.html">The Lame List &gt;&gt;</a>
		</td>
	</tr>
	<tr>
		<td align=left>
			<i>Last modified on 29 April 2000 at 15:52 UTC-7</i>
		</td>
		<td align=right>
			<font size=-1>Please send corrections to <a href="mailto:tangent@cyberport.com">tangent@cyberport.com</a>.</font>
		</td>
	</tr>
</table>
<table cellpadding=5 cellspacing=0 border=0 width=95% align=center>
	<tr>
		<td align=left width=33%>
			<font size=-1>
				<a href="../index.html"><b>&lt;</b> Go to the main FAQ page</a>
			</font>
		</td>
		<td width=33%>
			<font size=-1>
			<center>
				<a href="http://www.cyberport.com/~tangent/programming"><b>&lt;&lt;</b> Go to my Programming pages</a>
			</center>
			</font>
		</td>
		<td align=right width=33%>
			<font size=-1>
				<a href="http://www.cyberport.com/~tangent/"><b>&lt;&lt;&lt;</b> Go to my Home Page</a>
			</font>
		</td>
	</tr>
</table>
</body>
</html>
