<h2>How to read the sources</h2>
<ol>
<li>Read this whole Web page, including the <a href="#coding">coding standards</a>.
<li>Read the pages this page links to, including the documentation included with the sources (listed above).
<li>If you don't know sockets, read one of the tutorials linked to below.
<li>If poll() or "non-blocking I/O" is still mysterious to you, read <a href="http://www.amazon.com/exec/obidos/ASIN/013490012X/">Unix Network Programming</a> or <a href="http://www.amazon.com/exec/obidos/ASIN/0201563177/">Advanced Programming in the UNIX Environment</a> until it makes sense :-)
<li>To start reading the sources, arrange the modules in order from lowest level (i.e. doesn't use any of the other modules) to highest level (i.e. uses but isn't used by any of the other modules). For this project, the order is nbbio, fdmap, ftp_client_proto, ftp_client_pipe, robouser, bench.
<li>Pick the first module in the list, e.g. nbbio. Review its .h file (<a href="nbbio.h">nbbio.h</a>). Note any confusing parts, and email the mailing list with any questions. List members may respond by improving the comments in the file, by simply answering your questions, or even by fixing a bug you find.
<li>After the .h file makes sense to you, review the same module's .cc file (e.g. <a href="nbbio.cc">nbbio.cc</a>) and do the same thing.
<li>When you understand that module's .h and .cc, move on to the next module in the list that uses modules you've already reviewed. When you come across references to modules you've already reviewed, you'll have a good understanding of them, and they won't stump you.
</ol>
<h2>Throttling</h2>
ftp_client_pipe_t keeps track of the number of bytes read via the network (in either data or control channels). When this exceeds a threshold, no more reads are executed for the appropriate amount of time. ftp_client_pipe_t sleeps for enough clock ticks to hide the granularity of the clock.
In particular, ftp_client_pipe_t sleeps as soon as Tw = (bytes_sent / desired_bandwidth - elapsed_time) is greater than eight clock ticks, with the byte count expressed in bits and the bandwidth in bits per clock tick. For example, if eclock_hertz() is 100, the desired bandwidth is 28000 bits/sec, it's been one clock tick since it last woke up, and it has received 1500 bytes since it last woke up, then desired_bandwidth = 28000 / eclock_hertz() = 280 bits/tick, and Tw = 1500 * 8 / 280 - 1, which is about 42, so it would wait 42 clock ticks before accepting any more reads. On the other hand, if it takes 60 clock ticks to receive 1500 bytes, Tw is negative and it won't sleep at all. (Compare with SPECweb99's "Rated Receive" logic, which only sleeps at the end of each file fetch.)
<p>But see <a href="rick.html">Rick Jones' post on comp.benchmarks</a> for a report of some trouble with this kind of throttling technique.
<h2>To Thread or not to Thread</h2>
I've chosen an event-driven approach to the problem. ("Event driven" is also known as "non-threaded", "polling I/O", "non-blocking I/O", or "multiplexed I/O".) Many programmers today are familiar only with the threaded model of writing servers, where the server creates a new thread or process for each client. This lets you write code in a stream-of-consciousness way, but has several drawbacks: it can be very hard to debug, and it can have high overhead.
<a href="http://www.scriptics.com/people/john.ousterhout/">John Ousterhout's</a> talk on "Why Threads are a Bad Idea (for most purposes)" explains some of the reasons programmers familiar with threads should also learn about the alternatives to threads:
<blockquote><i>The talk compares the threads style of programming to an alternative approach, events, that use only a single thread of control. Although each approach has its weaknesses, events result in simpler, more manageable code than threads, with efficiency that is generally as good as or better than threads.
Most of the applications for which threading is currently recommended (including nearly all user-interface applications) would be better off with an event-based implementation.</i></blockquote>
In an event-driven server, a single thread handles many clients at the same time. This is done by dividing up the work into small pieces, and explicitly handling a single stream of all the pieces of work from all the clients; each client gets a moment of attention just when it needs it.
<p>I've chosen this approach because it will use much less memory to support tens of thousands of clients than would a thread-per-client approach. I may still introduce threads at some point to allow the program to make use of multiple CPUs, but I will do so sparingly.
<h2>Support for alternatives to poll()</h2>
dkftpbench supports both poll() and <a href="http://www.kegel.com/c10k.html">alternative readiness notification methods</a>. This was done by adding a Poller class which abstracts the poll() system call; concrete subclasses of this have been written for poll(), select(), F_SETSIG, kqueue(), and /dev/poll. dkftpbench has been tested with most of these (not kqueue or /dev/poll yet), and you can choose which one to use from the dkftpbench command line.
<h2>Notes</h2>
<ul>
<li><a href="nonblocking.html">nonblocking.html</a> - introduction to non-blocking I/O
<li><a href="callbackDemo.html">callbackDemo.html</a> - explanation of callback functions in C++
<li><a href="classes.html">classes.html</a> - class relationships in this project (outdated)
<li><a href="theory.html">theory.html</a> - explanation of how the code works (outdated)
</ul>
<h2>Other Benchmarks</h2>
<ul>
<li><a href="ftp://ftp.cup.hp.com/dist/networking/benchmarks/netperf/experimental/netperf3.tar.gz">netperf3</a> (experimental) - includes an FTP benchmark, and shows the kind of remote control invocation of benchmark clients I'd like to do.
It doesn't try to include a general-purpose FTP client library, and is based on threads instead of nonblocking I/O, so it's not going to be as useful or fast as ours, but it's well worth looking at and running.
<li><a href="ftp://ftp.cup.hp.com/dist/networking/briefs/ftp_server_results.txt">Results</a> from netperf3 for an HP computer. Very interesting reading!
<li><a href="http://www.spec.org/osg/web99/docs/whitepaper.html">http://www.spec.org/osg/web99/docs/whitepaper.html</a>
<li>See <a href="http://www.acme.com/software/http_load/">http://www.acme.com/software/http_load/</a> for a multiplexing HTTP benchmark (lacks throttling)
<li>See <a href="ftp://ftp.lysator.liu.se/pub/unix/ptester/ptester-1.2.tar.gz">ftp://ftp.lysator.liu.se/pub/unix/ptester/ptester-1.2.tar.gz</a> for a simple multithreaded HTTP load generator (lacks throttling, precise result reporting)
<li>See <a href="http://www.kegel.com/nt-linux-benchmarks.html">http://www.kegel.com/nt-linux-benchmarks.html</a> for where I eventually hope to publish FTP benchmark results.
</ul>
<h2>Interesting Server Programs</h2>
<ul>
<li><a href="http://www.IN-Berlin.DE/User/kraxel/webfs.html">webfs</a> is a very simple, single-threaded, multiplexing HTTP server. The event processing is very simple and clear; it's a good server to look at to understand multiplexing.
It uses select(), but the idea is the same for poll().
<li><a href="http://mathop.diva.nl">mathopd</a> is another server that uses poll() or select(), and has a very clear main loop.
<li><a href="http://freshmeat.net/search.php3?query=ftpd">Search freshmeat.net for ftpd</a> -- there are a lot of FTP server programs out there. For instance, <a href="http://freshmeat.net/appindex/1999/02/17/919251275.html">Betaftpd</a> is a single-threaded FTP server.
<li><a href="http://linuxmafia.com/pub/linux/security/ftp-daemons">Linuxmafia.com's list of Linux FTP daemons</a>
<li><a href="http://www.mycgiserver.com/~ranab/ftp/index.html">Rana Bhattacharyya's FTP Server</a> - multithreaded; resumable; lots of features; now part of Apache Avalon project - 17KLOC. Took about 5MB of RAM per connection in my tests.
</ul>
<h2>Other FTP libraries</h2>
<ul>
<li><a href="http://www.cnj.digex.net/~pfau/ftplib/">Thomas Pfau's ftplib</a> -- existing library that implements the client side of the FTP protocol. Doesn't let you multiplex lots of connections, though.
<li><a href="http://oss.software.ibm.com/developerworks/opensource/ftp/">IBM's FTP beans</a> -- open source Java FTP protocol and UI beans
</ul>
<h2>Standards</h2>
<ul>
<li><a href="ftp://ftp.isi.edu/in-notes/rfc959.txt">RFC 959: the File Transfer Protocol</a>
<li><a href="ftp://ftp.ietf.org/internet-drafts/draft-ietf-ftpext-mlst-13.txt">draft-ietf-ftpext-mlst-13.txt</a> -- defines MDTM, SIZE, and RESTart extensions to the FTP protocol.
Commonly used by web browsers.
<li><a href="ftp://ftp.isi.edu/in-notes/rfc1945.txt">RFC 1945: the Hypertext Transfer Protocol -- HTTP/1.0</a>
<li><a href="ftp://ftp.isi.edu/in-notes/rfc2616.txt">RFC 2616: the Hypertext Transfer Protocol -- HTTP/1.1</a>
<li><a href="http://cr.yp.to/ftp.html">D.J.Bernstein's formal description of FTP as observed in the wild</a>
</ul>
<h2>Resources for learning about network programming</h2>
<ul>
<li> <a href="http://www.ecst.csuchico.edu/~beej/guide/net/">Beej's Guide to Network Programming Using Internet Sockets</a>
<li> <a href="http://ccnga.uwaterloo.ca/~mvlioy/stuff/ipc_adv_tut.txt">An Advanced 4.3BSD Interprocess Communication Tutorial</a> (old, but still relevant)
<li> For those who already know the basics of sockets: <a href="http://www.amazon.com/exec/obidos/ASIN/013490012X/">Unix Network Programming : Networking Apis: Sockets and Xti (Volume 1)</a> by the late W. Richard Stevens describes many of the I/O strategies and pitfalls related to writing high-performance networking code. (His examples are at <a href="http://www.kohala.com/start/unpv12e.html">http://www.kohala.com/start/unpv12e.html</a>. They should take about one minute to download, unpack, and compile according to the README.)
<li>See <a href="http://www.kegel.com/c10k.html">http://www.kegel.com/c10k.html</a> for info on how to write efficient network code that handles lots of open connections (like a server, or like a benchmarking client).
<li><a href="undump.pl">undump.pl</a> -- perl script to turn output of <tt>tcpdump -x -s 1024 tcp</tt> into human-readable form, so you can snoop on what FTP commands a browser is using
</ul>
<h2><a name="coding">Coding Standards</a></h2>
Since this code is licensed under the GPL, you're free to do as you like with it. If you want to contribute to the project, though, please follow these guidelines:
<ul>
<li> In general, documentation is written first, then the module self-test, then implementation.
Documentation should be extremely brief, consist mostly of interface comments embedded in the .h files, and describe things just well enough that you could implement or use the code with it as a guide. The self-test is written before the implementation so it can be used to help initial debugging of the module, and later as a regression test.
<li>Anyone who adds code should first review the existing code to look for places where the .h files are confusing or incomplete, and give feedback to Dan so he can fix this. This will help everyone understand the code, and will ensure the documentation is up to snuff for future contributors.
<li> Comments at the beginning of modules or functions are called 'interface comments'. Interface comments:
<ul>
<li>document the interface, rationale, and intent of the module or function
<li>avoid talking about implementation details that don't affect the interface
<li>start with a /*------------- line
<li>end with a ------------*/ line
<li>enclose text whose left margin is indented to line up with the *
<li>don't use stars at the left margin of the text.
<li>are repeated verbatim in both the .h and .cc files.
<li>explain what the module or function is for well enough that you don't have to look at the innards to use it or understand what it's for.
<li>are kept up to date; when the interface of a function changes, the interface comment should change to reflect it.
</ul>
<li> Each module has a simple self-test program at the bottom, surrounded with #ifdef modulename_MAIN ... #endif. The self-test should, if possible, provide a simple unsupervised go/no-go indication.
<li> 'make test' will compile and run all module self-tests.
<li> tab stops 4 and indent width 4 are used throughout. Try to follow the style of the existing code (i.e. spaces after keywords, curly braces on same line as keywords, etc.)
<li> A minimal subset of C++ is used; essentially, it's C with classes.
<li> No C++ - style I/O (no cout, cin, etc.). All I/O is native Unix I/O (e.g.
read, write) or C - style I/O (printf, etc.). (That doesn't mean cin, cout, etc. are bad; they're just not to be used in this project.)
<li> No inheritance without good reason; check with Dan before using any. (Inheriting from Poller::Client is ok.)
<li> No templates in the main code. They're ok in unit tests, though...
<li> No <a href="http://www.stlport.org">STL</a>. (That doesn't mean STL is evil; it's just not appropriate for this project.)
</ul>
<hr><i>Last Change: 18 Mar 2002<br>Most files Copyright 1999-2002 Dan Kegel<br>nbbio.{cc,h} are Copyright 1999 Disappearing, Inc.<br>See AUTHORS in the tarball for more details</i></body></html>