any method
of handling network connections other than one thread per client. <A
href="http://www.volano.com/report/">Volanomark</A> is a good microbenchmark
which measures throughput in messages per second at various numbers of
simultaneous connections. As of May 2003, JDK 1.3 implementations from various
vendors are in fact able to handle ten thousand simultaneous connections --
albeit with significant performance degradation. See <A
href="http://www.volano.com/report/#nettable">Table 4</A> for an idea of which
JVMs can handle 10000 connections, and how performance suffers as the number of
connections increases.
<H4><A name=1:1>Note: 1:1 threading vs. M:N threading</A></H4>There is a choice
when implementing a threading library: you can either put all the threading
support in the kernel (this is called the 1:1 threading model), or you can move
a fair bit of it into userspace (this is called the M:N threading model). At one
point, M:N was thought to be higher performance, but it's so complex that it's
hard to get right, and most people are moving away from it.
<UL>
<LI><A
href="http://marc.theaimsgroup.com/?l=linux-kernel&m=103284879216107&w=2">Why
Ingo Molnar prefers 1:1 over M:N</A>
<LI><A href="http://java.sun.com/docs/hotspot/threads/threads.html">Sun is
moving to 1:1 threads</A>
<LI><A href="http://www-124.ibm.com/pthreads/">NGPT</A> is an M:N threading
library for Linux.
<LI>Although <A
href="http://people.redhat.com/drepper/glibcthreads.html">Ulrich Drepper
planned to use M:N threads in the new glibc threading library</A>, he has
since <A href="http://people.redhat.com/drepper/nptl-design.pdf">switched to
the 1:1 threading model.</A>
<LI><A
href="http://developer.apple.com/technotes/tn/tn2028.html#MacOSXThreading">MacOSX
appears to use 1:1 threading.</A>
<LI><A href="http://people.freebsd.org/~julian/">FreeBSD</A> and <A
href="http://web.mit.edu/nathanw/www/usenix/freenix-sa/">NetBSD</A> appear to
still believe in M:N threading... The lone holdouts? It looks like FreeBSD 7.0
might switch to 1:1 threading (see above), so perhaps M:N threading's
believers have finally been proven wrong everywhere. </LI></UL>
<H3><A name=kio>5. Build the server code into the kernel</A></H3>
<P>Novell and Microsoft are both said to have done this at various times, at
least one NFS implementation does this, <A
href="http://www.fenrus.demon.nl/">khttpd</A> does this for Linux and static web
pages, and <A
href="http://slashdot.org/comments.pl?sid=00/07/05/0211257&cid=218">"TUX"
(Threaded linUX webserver)</A> is a blindingly fast and flexible kernel-space
HTTP server by Ingo Molnar for Linux. Ingo's <A
href="http://marc.theaimsgroup.com/?l=linux-kernel&m=98098648011183&w=2">September
1, 2000 announcement</A> says an alpha version of TUX can be downloaded from <A
href="ftp://ftp.redhat.com/pub/redhat/tux">ftp://ftp.redhat.com/pub/redhat/tux</A>,
and explains how to join a mailing list for more info. <BR>The linux-kernel list
has been discussing the pros and cons of this approach, and the consensus seems
to be that instead of moving web servers into the kernel, the kernel should have
the smallest possible hooks added to improve web server performance. That way,
other kinds of servers can benefit. See e.g. <A
href="http://www.linuxhq.com/lnxlists/linux-kernel/lk_9906_03/msg01041.html">Zach
Brown's remarks</A> about userland vs. kernel http servers. It appears that the
2.4 Linux kernel provides sufficient power to user programs, as the <A
href="http://www.kegel.com/c10k.html#x15">X15</A> server runs about as fast as
Tux, but doesn't use any kernel modifications.
<P>
<H2><A name=comments>Comments</A></H2>
<P>Richard Gooch has written <A
href="http://www.atnf.csiro.au/~rgooch/linux/docs/io-events.html">a paper
discussing I/O options</A>.
<P>In 2001, Tim Brecht and Michal Ostrowski <A
href="http://www.hpl.hp.com/techreports/2001/HPL-2001-314.html">measured various
strategies</A> for simple select-based servers. Their data is worth a look.
<P>In 2003, Tim Brecht posted <A
href="http://www.hpl.hp.com/research/linux/userver/">source code for
userver</A>, a small web server put together from several servers written by
Abhishek Chandra, David Mosberger, David Pariag, and Michal Ostrowski. It can
use select(), poll(), epoll(), or sigio.
<P>Back in March 1999, <A
href="http://marc.theaimsgroup.com/?l=apache-httpd-dev&m=92100977123493&w=2">Dean
Gaudet posted</A>:
<BLOCKQUOTE><I>I keep getting asked "why don't you guys use a select/event
based model like Zeus? It's clearly the fastest." ... </I></BLOCKQUOTE>His
reasons boiled down to "it's really hard, and the payoff isn't clear". Within a
few months, though, it became clear that people were willing to work on it.
<P>Mark Russinovich wrote <A href="http://linuxtoday.com/stories/5499.html">an
editorial</A> and <A
href="http://www.winntmag.com/Articles/Index.cfm?ArticleID=5048">an article</A>
discussing I/O strategy issues in the 2.2 Linux kernel. Worth reading, even
though he seems misinformed on some points. In particular, he seems to think
that Linux 2.2's asynchronous I/O (see F_SETSIG above) doesn't notify the user
process when data is ready, only when new connections arrive. This seems like a
bizarre misunderstanding. See also <A
href="http://www.dejanews.com/getdoc.xp?AN=431444525">comments on an earlier
draft</A>, <A href="http://www.dejanews.com/getdoc.xp?AN=472893693">Ingo
Molnar's rebuttal of 30 April 1999</A>, <A
href="http://www.linuxhq.com/lnxlists/linux-kernel/lk_9905_01/msg00089.html">Russinovich's
comments of 2 May 1999</A>, <A
href="http://www.linuxhq.com/lnxlists/linux-kernel/lk_9905_01/msg00263.html">a
rebuttal</A> from Alan Cox, and various <A
href="http://www.dejanews.com/dnquery.xp?ST=PS&QRY=threads&DBS=1&format=threaded&showsort=score&maxhits=100&LNG=ALL&groups=fa.linux.kernel+&fromdate=jun+1+1998">posts
to linux-kernel</A>. I suspect he was trying to say that Linux doesn't support
asynchronous disk I/O, which used to be true, but now that SGI has implemented
<A href="http://www.kegel.com/c10k.html#aio">KAIO</A>, it's not so true anymore.
<P>See these pages at <A
href="http://www.sysinternals.com/ntw2k/info/comport.shtml">sysinternals.com</A>
and <A
href="http://msdn.microsoft.com/library/techart/msdn_scalabil.htm">MSDN</A> for
information on "completion ports", which he said were unique to NT; in a
nutshell, win32's "overlapped I/O" turned out to be too low level to be
convenient, and a "completion port" is a wrapper that provides a queue of
completion events, plus scheduling magic that tries to keep the number of
running threads constant by allowing more threads to pick up completion events
if other threads that had picked up completion events from this port are
sleeping (perhaps doing blocking I/O).
<P>See also <A href="http://www.as400.ibm.com/developer/v4r5/api.html">OS/400's
support for I/O completion ports</A>.
<P><A name=15k>There</A> was an interesting discussion on linux-kernel in
September 1999 titled "<A
href="http://www.cs.helsinki.fi/linux/linux-kernel/Year-1999/1999-36/0160.html">&gt;
15,000 Simultaneous Connections</A>" (and the <A
href="http://www.cs.helsinki.fi/linux/linux-kernel/Year-1999/1999-37/0612.html">second
week</A> of the thread). Highlights:
<UL>
<LI>Ed Hall <A
href="http://www.linuxhq.com/lnxlists/linux-kernel/lk_9909_01/msg00807.html">posted</A>
a few notes on his experiences; he's achieved &gt;1000 connects/second on a UP
P2/333 running Solaris. His code used a small pool of threads (1 or 2 per CPU)
each managing a large number of clients using "an event-based model".
<LI>Mike Jagdis <A
href="http://www.linuxhq.com/lnxlists/linux-kernel/lk_9909_01/msg00831.html">posted
an analysis of poll/select overhead</A>, and said "The current select/poll
implementation can be improved significantly, especially in the blocking case,
but the overhead will still increase with the number of descriptors because
select/poll does not, and cannot, remember what descriptors are interesting.
This would be easy to fix with a new API. Suggestions are welcome..."
<LI>Mike <A
href="http://www.linuxhq.com/lnxlists/linux-kernel/lk_9909_01/msg00964.html">posted</A>
about his <A href="http://www.purplet.demon.co.uk/linux/select/">work on
improving select() and poll()</A>.
<LI>Mike <A
href="http://www.linuxhq.com/lnxlists/linux-kernel/lk_9909_01/msg00971.html">posted
a bit about a possible API to replace poll()/select()</A>: "How about a
'device like' API where you write 'pollfd like' structs, the 'device' listens
for events and delivers 'pollfd like' structs representing them when you read
it? ... "
<LI>Rogier Wolff <A
href="http://www.linuxhq.com/lnxlists/linux-kernel/lk_9909_01/msg00979.html">suggested</A>
using "the API that the digital guys suggested", <A
href="http://www.cs.rice.edu/~gaurav/papers/usenix99.ps">http://www.cs.rice.edu/~gaurav/papers/usenix99.ps</A>
<LI>Joerg Pommnitz <A
href="http://www.linuxhq.com/lnxlists/linux-kernel/lk_9909_02/msg00001.html">pointed
out</A> that any new API along these lines should be able to wait for not just
file descriptor events, but also signals and maybe SYSV-IPC. Our
synchronization primitives should certainly be able to do what Win32's
WaitForMultipleObjects can, at least.
<LI>Stephen Tweedie <A
href="http://www.linuxhq.com/lnxlists/linux-kernel/lk_9909_02/msg01198.html">asserted</A>
that the combination of F_SETSIG, queued realtime signals, and sigwaitinfo()
was a superset of the API proposed in
http://www.cs.rice.edu/~gaurav/papers/usenix99.ps. He also mentions that you
keep the signal blocked at all times if you're interested in performance;
instead of the signal being delivered asynchronously, the process grabs the
next one from the queue with sigwaitinfo().
<LI>Jayson Nordwick <A
href="http://www.linuxhq.com/lnxlists/linux-kernel/lk_9909_03/msg00002.html">compared</A>
completion ports with the F_SETSIG synchronous event model, and concluded
they're pretty similar.
<LI>Alan Cox <A
href="http://www.linuxhq.com/lnxlists/linux-kernel/lk_9909_03/msg00043.html">noted</A>
that an older rev of SCT's SIGIO patch is included in 2.3.18ac.
<LI>Jordan Mendelson <A
href="http://www.linuxhq.com/lnxlists/linux-kernel/lk_9909_03/msg00093.html">posted</A>
some example code showing how to use F_SETSIG.
<LI>Stephen C. Tweedie <A
href="http://www.linuxhq.com/lnxlists/linux-kernel/lk_9909_03/msg00095.html">continued</A>
the comparison of completion ports and F_SETSIG, and noted: "With a signal
dequeuing mechanism, your application is going to get signals destined for
various library components if libraries are using the same mechanism," but the
library can set up its own signal handler, so this shouldn't affect the
program (much).
<LI><A
href="http://www.linuxhq.com/lnxlists/linux-kernel/lk_9909_04/msg00900.html">Doug
Royer</A> noted that he'd gotten 100,000 connections on Solaris 2.6 while he
was working on the Sun calendar server. Others chimed in with estimates of how
much RAM that would require on Linux, and what bottlenecks would be hit.
</LI></UL>
<P>Interesting reading!
<P>
<H2><A name=limits.filehandles>Limits on open filehandles</A></H2>
<UL>
<LI>Any Unix: the limits set by ulimit or setrlimit.
<LI>Solaris: see <A
href="http://www.wins.uva.nl/pub/solaris/solaris2/Q3.46.html">the Solaris FAQ,
question 3.46</A> (or thereabouts; they renumber the questions periodically).
<LI>FreeBSD:<BR><BR>Edit /boot/loader.conf, add the line <PRE>set
kern.maxfiles=XXXX</PRE>where XXXX is the desired system limit on
file descriptors, and reboot. Thanks to an anonymous reader, who wrote in to
say he'd achieved far more than 10000 connections on FreeBSD 4.3, and says
<BLOCKQUOTE>"FWIW: You can't actually tune the maximum number of connections
in FreeBSD