<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN">
<html lang="en"><head><title>Winsock Programmer's FAQ: Articles</title><link rel="Stylesheet" type="text/css" href="../faq.css"></head>
<body bgcolor="#ffffee" text="#000000" link="#491e00" vlink="#7d2e01" alink="#da7417">
<!-- ---- Header Bar ---- -->
<table border="0" width="95%" bgcolor="#006000" cellpadding="5" cellspacing="3" align="center"> <tr> <td align="left" bgcolor="#e0e0c0"> <font size="2" face=Verdana,Arial,Helvetica> <b><a href="../articles/lame-list.html">&lt;&lt;</a></b> </font> </td> <td align="center"> <font face=Verdana,Arial,Helvetica color="#ffffee"> <p align=center class=bigger3><b> Winsock Programmer's FAQ<br> Section 7: Articles<br> </b></p> </font> </td> <td align="right" bgcolor="#e0e0c0"> <font size="2" face=Verdana,Arial,Helvetica> <b><a href="../articles/bsd-compatibility.html">&gt;&gt;</a></b> </font> </td> </tr></table>
<!-- ---- Body Table ---- -->
<table width="95%" border="0" cellpadding="10"> <tr valign="top"> <td>
<H3>Debugging TCP/IP</H3>
<p><I>by Warren Young</I></p>
<p>(This article was written with the Winsock programmer in mind, but the information in it can be used by Unix programmers, as well as administrators and technicians.)</p>
<p>TCP is a simple protocol in a certain sense: you send data, it delivers it. Because it was engineered for reliability in networks of uncertain quality, it works around a lot of problems without bothering the end user. But partially because of this reliability, TCP exhibits behaviors that surprise those who don't truly understand the protocol. This tutorial will introduce you to the most important of these issues, but it's really the tip of the iceberg. For the submerged part, see <a href="../reviews/tcpip-ill-v1.html">TCP/IP Illustrated</a>. Incidentally, the state/transition diagram below comes from volume 2 of that series. It happens to be printed in volume 1 of the series as well, and in Stevens' <a href="../reviews/unp-v1.html">Unix Network Programming</a>, volume 1.
You can also get that diagram in PostScript format off the web; see the <a href="../resources/misc.html">Miscellaneous Resources</a> section of the FAQ for a pointer.</p>
<p>In this tutorial, we use the term "packet" to mean "frame" rather than "datagram." That is, a packet for our purposes is a collection of data wrapped in a TCP frame. The nebulous thing called "the network" is allowed to split the data in a TCP frame over multiple frames, or coalesce data from multiple frames into a single frame, etc., but the frame itself will remain functionally intact. This is as opposed to the "datagram" meaning of "packet," for an inviolable block of data that is unchanged from sender to receiver.</p>
<h4>TCP Control Bits</h4>
<p>When a TCP implementation decides to send a packet of data to the remote peer, it first prefixes the data with 20 or more bytes called the "header". Headers are an essential part of network protocols, because they enable the participants in the network to make decisions regarding the data flowing over it. Every protocol adds headers (and sometimes trailers) to your data. We won't discuss the TCP and IP headers in detail here, as that's better left to books like W. Richard Stevens'.</p>
<p>Within the header is a field that I will call the "control bits," for lack of a better term. The bits that interest us here are called SYN, ACK, FIN and RST, for "synchronize," "acknowledge," "finish," and "reset," respectively. These bits are set in TCP packets for the sole benefit of the remote peer's network stack <img src="../bitmaps/waist-dot.gif" alt="--" width=14 height=6 hspace=2> that is, they are the machinery under the hood that most people never have occasion to examine.</p>
<h4>The State/Transition Diagram</h4>
<p>Below is the state/transition diagram for the TCP protocol. The states are in round-ended boxes, and the transitions are the labelled arrows. The transitions show how your program can make TCP move from one state to another.
The diagram also shows how the remote peer can make your stack change TCP states, and how you can recognize these changes at the application level. Note that transition labels come from the names of BSD sockets functions; although there are differences in the Winsock API, the effects are the same at this level. (I apologize for the so-so readability of the text in this image, but it's already too big at 20K, so I'm unwilling to make it any bigger <img src="../bitmaps/waist-dot.gif" alt="--" width=14 height=6 hspace=2> if you want a pretty, readable diagram, get the PostScript file and print your own copy.)</p>
<p class=lmargin align=center><img src="bitmaps/state-diagram-small.gif" alt="TCP/IP state-transition diagram" width=420 height=502> </p>
<p>Understanding this diagram is really one of the keys to understanding TCP, so let's go through a few exercises. But first, you need to know about the <code>netstat</code> tool. This tool comes with all Microsoft TCP/IP stacks, and probably others as well. It is modelled after a Unix tool of the same name, with virtually the same output. (The differences between versions of this tool are slight enough that once you learn to use one, the rest are trivial to pick up.)</p>
<p>The <code>netstat</code> tool is usually run from the command line, often with the <code>-n</code> flag to make it faster. (<code>-n</code> suppresses the DNS name lookups, using the raw address and port numbers instead.) Another useful flag is the <code>-a</code> flag, which shows "all" entries, including listeners. (The <code>-a</code> feature is somewhat broken under Windows 95/98, but works better under Windows NT.) It is also very helpful to use this tool in combination with a "grep" tool <img src="../bitmaps/waist-dot.gif" alt="--" width=14 height=6 hspace=2> I recommend the <a href="http://sourceware.cygnus.com/cygwin/">Cygwin</a> port of GNU grep.
The package also includes GNU's "less" pager, which blows the doors off of "more", especially Microsoft's emasculated versions.</p>
<p>Microsoft's versions of <code>netstat</code> output four columns: the protocol (e.g. TCP or UDP), the local address/port combination, the remote address/port combination, and the current state of that connection. The first three columns are self-explanatory; together they supply the protocol, local address, local port, remote address and remote port, often collectively called the "connection 5-tuple," which uniquely describes a given TCP or UDP connection. The last column corresponds directly to the states in the diagram above.</p>
<h4>A Micro-FAQ</h4>
<p>Now for those exercises I promised:</p>
<ol>
<li><b>Problem:</b> From the default CLOSED state, how does a client program normally get to the ESTABLISHED state?<img src="../bitmaps/dot-clear.gif" alt="" width=1 height=30 align=top> <br clear=all><b>Solution:</b> The client calls the <code>connect()</code> function (or similar), which causes TCP to send an empty packet with the SYN control bit set and puts the client's end of the connection in the SYN_SENT state. The remote peer's stack sees this "synchronize" request, and sends back an empty packet with the SYN and ACK bits set (i.e. "I acknowledge your synchronize request"). When the client's stack receives the SYN/ACK packet, it sends back an ACK packet, enters the ESTABLISHED state, and reports a successful connection to the client program.<img src="../bitmaps/dot-clear.gif" alt="" width=1 height=30 align=top> <br clear=all>
<li><b>Problem:</b> What is the normal TCP shutdown sequence?<img src="../bitmaps/dot-clear.gif" alt="" width=1 height=30 align=top> <br clear=all><b>Solution:</b> The important thing to understand is that TCP is a truly bidirectional protocol. So, the connection is shut down in two identical stages, one for each "direction".
One peer sends a packet with the FIN bit set, which the other end ACKnowledges; when the other end is also finished sending data, it sends out a FIN packet of its own, which the first peer ACKs, closing the connection.<img src="../bitmaps/dot-clear.gif" alt="" width=1 height=30 align=top> <br clear=all>
<li><b>Problem:</b> What is the significance of the RST bit?<img src="../bitmaps/dot-clear.gif" alt="" width=1 height=30 align=top> <br clear=all><b>Solution:</b> The RST bit signals an abnormal close, also called "slamming the connection shut." It happens under several circumstances, but none of the common ones are documented in the Stevens diagram. Two of these cases you can cause from Winsock: the first method is to enable <code>SO_LINGER</code> with a timeout of 0 using <code>setsockopt()</code> and then call <code>closesocket()</code>. The second method is to call <code>shutdown()</code> with <code>how</code> equal to 2 (<code>SD_BOTH</code>), optionally followed by a <code>closesocket()</code> call.<img src="../bitmaps/dot-clear.gif" alt="" width=1 height=30 align=top> <br clear=all>From the Winsock client level, the two other common RST occurrences are "connection refused" and "remote peer terminated connection." The first happens when you try to connect to a port that isn't open on a remote machine. The second happens as a result of the remote peer using one of the two RST-forcing methods above; alternatively, the remote application could have crashed, and the peer's stack sent out a RST for its connection. Another way this can happen is that the remote machine crashed catastrophically; after it came back up, your program sent it a packet which its stack rightfully had no way of delivering, so it replied with a RST packet, because the connection's 5-tuple is no longer valid there.<img src="../bitmaps/dot-clear.gif" alt="" width=1 height=30 align=top> <br clear=all>Generally speaking, RST signals a problem of some kind: either something bad happened to the connection, or there's a bug somewhere.
For example, some firewalls improperly use the RST bit to signal a closed connection. The solution, of course, is to replace the firewall product. B-)<img src="../bitmaps/dot-clear.gif" alt="" width=1 height=30 align=top> <br clear=all>
<li><b>Problem:</b> My system keeps getting into a TIME_WAIT or FIN_WAIT_<i>x</i> state, so my call to <code>bind()</code> for the same port fails. What's wrong?<img src="../bitmaps/dot-clear.gif" alt="" width=1 height=30 align=top> <br clear=all><b>Solution:</b> The remote end is probably not closing the connection properly. The best thing to do is to redesign your program so that it doesn't need to keep re-binding. For example, a server program generally keeps its listener socket alive so that it doesn't have to keep re-binding it to the port; if you <code>closesocket()</code> the listener for some reason after each successful connection (e.g. your program can only handle one connection at a time), you typically won't be able to rebind for somewhere between 30 and 120 seconds. Similarly, calling <code>bind()</code> on the client side is a <a href="../advanced.html#clientbind">bad idea</a>.<img src="../bitmaps/dot-clear.gif" alt="" width=1 height=30 align=top> <br clear=all>
</ol>
<h4>Tools for TCPers</h4>
<p>Below is a small batch file I find helpful in dealing with TCP state issues, which I call "showwait." Basically, it shows you the current WAIT states every second until you hit Ctrl-C. I have a similar script on my Unix machines as well.</p>
<pre>
    @echo off
    rem Show every connection in a WAIT state, once a second, until Ctrl-C.
    :loop
    netstat -na |grep WAIT
    rem "delay" is a 4DOS built-in; use a "sleep" command in other shells.
    delay 1
    goto loop</pre>
<p>This script depends on a 4DOS feature called "delay". If you don't use this shell, get an implementation of the "sleep" command, which does the same thing. The Cygwin toolset, mentioned above, includes one. (Are you maybe getting the impression that I'm a closet Unixhead? Oh, noooo.... B-> )</p>
<p>There's one problem with this tool: it only catches states with "WAIT" in their names.
Less common states like LAST_ACK and SYN_RCVD won't be seen by this script. SYN_RCVD in particular signals serious problems if it stays around for a prolonged amount of time, because it indicates that a remote machine sent your machine a SYN packet, your machine answered with a SYN/ACK, and the remote machine has failed to ACK it. Since this exchange typically only takes from a few tens to a few hundreds of milliseconds, a persistent SYN_RCVD indicates a badly-written network stack, or a very "crashy" computer. If you see many of these states at once, it may mean you're under a "SYN attack", one of several "Denial of Service" attacks that are going around these days. At that point, it's time to break out the network sniffer and start some detective work.</p>
<h4>Conclusion</h4>
<p>The techniques and information in this article reflect the basic mental tools that your organization needs to develop, even if that just means appointing a single "networking guru" who will master this material and become a resource for the other developers in the company. This knowledge is very widely useful; for example, it can make reading <a href="../resources/debugging.html">sniffer</a> dumps less painful and more productive. Also, these techniques can reasonably be applied by technicians working with knowledgeable users over the phone to gather information about failures in your program that otherwise would get logged as random failures.</p>
<p>I hope you have learned something about TCP/IP debugging from this article. If you can think of anything else that would fit within the scope of this article, propose an extension and I'll seriously consider adding it.</p>
<p>Happy hacking!</p>
<p><font size=-1>Copyright © 1998 by Warren Young.
All rights reserved.</font></p> </td> </tr></table>
<!-- ---- Document Footer ---- -->
<hr noshade size=1 color=#404040>
<table cellpadding=5 cellspacing=0 border=0 width=95% align=center> <tr> <td align=left> <a href="../articles/lame-list.html">&lt;&lt; The Lame List</a> </td> <td align=right> <a href="../articles/bsd-compatibility.html">BSD Sockets Compatibility &gt;&gt;</a> </td> </tr> <tr> <td align=left> <i>Last modified on 29 April 2000 at 15:52 UTC-7</i> </td> <td align=right> <font size=-1>Please send corrections to <a href="mailto:tangent@cyberport.com">tangent@cyberport.com</a>.</font> </td> </tr> </table>
<table cellpadding=5 cellspacing=0 border=0 width=95% align=center> <tr> <td align=left width=33%> <font size=-1> <a href="../index.html"><b>&lt;</b> Go to the main FAQ page</a> </font> </td> <td width=33%> <font size=-1> <center> <a href="http://www.cyberport.com/~tangent/programming"><b>&lt;&lt;</b> Go to my Programming pages</a> </center> </font> </td> <td align=right width=33%> <font size=-1> <a href="http://www.cyberport.com/~tangent/"><b>&lt;&lt;&lt;</b> Go to my Home Page</a> </font> </td> </tr> </table>
</body></html>