
<P>
A chroot setup like this is more secure than not doing chroot at all,
but obviously less secure than the bare minimum static-binaries-only
chroot jail.
A poorly-written CGI shell script might allow an attacker to run
arbitrary shell commands.
Without chroot, this attacker would have access to the entire machine;
with it, he or she is restricted to the chroot tree.

<P>
Also: it is actually possible to break out of a chroot jail.
A process running as root, either via a setuid program or some security
hole, can change its own chroot tree to the next higher directory,
repeating as necessary to get to the top of the filesystem.
So, a chroot tree must be considered merely one aspect of a
multi-layered defense-in-depth.
If your chroot tree has enough tools in it for a cracker to
gain root access, then it's no good; so you want to keep the contents
to the minimum necessary.
In particular, don't include any setuid-root executables!
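<P>
For illustration, here is a sketch of the classic escape technique.
The function name is mine, and it only works for a process that is
already running as root:

<BLOCKQUOTE><CODE><PRE>
#include &lt;unistd.h&gt;
#include &lt;sys/stat.h&gt;

/* Classic chroot escape - requires root privileges. */
int
escape_chroot( void )
    {
    int i;

    /* Make a scratch directory and chroot into it.  Our current
    ** directory is now outside the new root.
    */
    (void) mkdir( "x", 0700 );
    if ( chroot( "x" ) &lt; 0 )
        return -1;
    /* Since "." is outside the root, repeated ".." walks all the
    ** way up to the real top of the filesystem.
    */
    for ( i = 0; i &lt; 100; ++i )
        (void) chdir( ".." );
    /* Re-root at the real top - the jail is escaped. */
    return chroot( "." );
    }
</PRE></CODE></BLOCKQUOTE>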

<P>
One idea I haven't tried, which might give improved security while
still allowing CGI access, is to use the "noexec" filesystem mount
option.
This is a flag you can set on a disk partition that tells the system
to not allow any programs to be run from that area.
The idea is, you would create two partitions for your chroot tree,
one with the noexec option and one without it.
The noexec partition becomes the main chroot tree; the execs-allowed
one gets mounted inside the other one, and is the only place that
CGI programs are allowed.
Then, after you have yourself all set up, you make the execs-allowed
partition read-only.

<HR>

<H3><A NAME="throttle">Throttling is good:</A></H3>

<P>
[not written yet]

<HR>

<H3><A NAME="select">select() is good:</A></H3>

<P>
select() is a Unix system call used to multiplex between a bunch
of file descriptors.
To understand why it's important we have to go back through the
history of web servers.

<P>
The basic operation of a web server is to accept a request and send
back a response.
The first web servers were probably written to do exactly that.
Their users no doubt noticed very quickly that while the server
was sending a response to someone else, they couldn't get their
own requests serviced.
There would have been long annoying pauses.

<P>
The second generation of web servers addressed this problem by
forking off a child process for each request.
This is very straightforward to do under Unix, only a few extra lines of code.
CERN and NCSA 1.3 are examples of this type of server.
Unfortunately, forking a process is a fairly expensive operation,
so performance of this type of server is still pretty poor.
The long random pauses are gone, but instead every request has a short
constant pause at startup.
Because of this, the server can't handle a high rate of connections.
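<P>
The core of such a forking server really is just a few lines.
Here's a sketch, with error handling and child reaping omitted;
handle_request() is a made-up name for whatever does the real work:

<BLOCKQUOTE><CODE><PRE>
#include &lt;sys/socket.h&gt;
#include &lt;stdlib.h&gt;
#include &lt;unistd.h&gt;

extern void handle_request( int fd );  /* hypothetical */

/* Second-generation server core: one fork() per connection.
** listen_fd must already be bound and listening.
*/
void
serve_forking( int listen_fd )
    {
    for (;;)
        {
        int conn_fd = accept( listen_fd, (struct sockaddr*) 0, (int*) 0 );
        if ( conn_fd &lt; 0 )
            continue;
        if ( fork() == 0 )
            {
            /* Child: handle one request and exit. */
            handle_request( conn_fd );
            exit( 0 );
            }
        /* Parent: close its copy and accept the next connection. */
        (void) close( conn_fd );
        }
    }
</PRE></CODE></BLOCKQUOTE>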

<P>
A slight variant of this type of server uses "lightweight processes" or
"threads" instead of full-blown Unix processes.
This is better, but there is no standard LWP/threads interface so
this approach is inherently non-portable.
Examples of these servers: MDMA and phttpd, both of which run
only under Solaris 2.x.

<P>
The third generation of servers is called "pre-forking".
Instead of starting a new subprocess for each request, they have a
pool of subprocesses that they keep around and re-use.
NCSA 1.4, Apache, and Netscape Netsite are examples of this type.
Performance of these servers is excellent,
they can handle from two to ten times as many
connections per second as the forking servers.
One problem, however, is that implementing this simple-to-state idea turns out
to be fairly complicated and non-portable.
The method used by NCSA involves transferring a file descriptor from the
parent process to an already-existing child process;
you can hardly use the same code on any two different OS's, and
some OS's (e.g. Linux) don't support it at all.
Apache uses a different method, with all the child processes
doing their own round-robin work queue via lock files,
which brings in issues of portability/speed/deadlock.
Besides, you still have multiple processes hanging around using up
memory and context-switch CPU cycles.
Which brings us to...

<P>
The fourth generation.
One process only.
No non-portable threads/LWPs.
Sends multiple files concurrently using select() to tell which
ones are ready for more data.
Speed is excellent.
Memory use is excellent.
Portability is excellent.
Examples of this generation: Spinner, Open Market, and thttpd.
Perhaps NCSA and/or Netsite will switch to this method at some point.
I really can't understand why they went with that complicated pre-forking
stuff.
Using select() is just not that hard.
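<P>
To make that concrete, here is roughly the shape of such a main loop.
This is a sketch, not thttpd's actual code; send_some_data() and the
conn_fd[] bookkeeping are made-up names:

<BLOCKQUOTE><CODE><PRE>
#include &lt;sys/types.h&gt;
#include &lt;sys/time.h&gt;
#include &lt;sys/socket.h&gt;
#include &lt;unistd.h&gt;

#define MAX_CONNS 64

/* Hypothetical: writes what the client is ready for, returns 0 when done. */
extern int send_some_data( int fd );

void
serve_select( int listen_fd )
    {
    int conn_fd[MAX_CONNS], nconns = 0, i;
    fd_set rfds, wfds;

    for (;;)
        {
        /* Watch the listen socket for new connections and every
        ** in-progress response for writability.
        */
        FD_ZERO( &amp;rfds );
        FD_ZERO( &amp;wfds );
        FD_SET( listen_fd, &amp;rfds );
        for ( i = 0; i &lt; nconns; ++i )
            FD_SET( conn_fd[i], &amp;wfds );

        if ( select( FD_SETSIZE, &amp;rfds, &amp;wfds, (fd_set*) 0,
                     (struct timeval*) 0 ) &lt; 0 )
            continue;

        if ( FD_ISSET( listen_fd, &amp;rfds ) &amp;&amp; nconns &lt; MAX_CONNS )
            {
            int fd = accept( listen_fd, (struct sockaddr*) 0, (int*) 0 );
            if ( fd &gt;= 0 )
                conn_fd[nconns++] = fd;
            }

        for ( i = 0; i &lt; nconns; ++i )
            if ( FD_ISSET( conn_fd[i], &amp;wfds ) &amp;&amp;
                 send_some_data( conn_fd[i] ) == 0 )
                {
                /* Done with this one - close it and drop it. */
                (void) close( conn_fd[i] );
                conn_fd[i--] = conn_fd[--nconns];
                }
        }
    }
</PRE></CODE></BLOCKQUOTE>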

<HR>

<H3><A NAME="listen">On the listen queue length:</A></H3>

<P>
Many web admins think there are two main types of performance
bottlenecks for a web server: the raw data rate of the network
connection, and the CPU usage on the server machine.
In fact there is
a third common bottleneck that's still fairly obscure.
If you run into
this limit you may find that your web server isn't using much CPU, your
network link isn't particularly full, and yet there are consistent
complaints of timeouts and "connection refused" errors.
It can be a very frustrating situation.

<P>
Here's the deal: most versions of Unix have very short pending-connection
queues.
This queue is for connections waiting to be accept()ed, and
typically it's of length 5.
This puts a severe limit on how many
connections/second the server can handle - if one comes in while the
queue is full, it gets dropped on the floor and the client gets
"connection refused".
With only 5 slots in the queue, you'll start to
see this behavior at around 3 connections/second.
thttpd tries to
minimize the effect of this limit by accepting new connections as fast
as possible, and saving them in its own internal higher-capacity queue
for later processing.
Even so, for best performance you really want to
make the system's queue longer, more like 32, which will handle maybe
10 to 20 connections/second.
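<P>
There are two application-side pieces to this.
One is the backlog argument to listen() - ask for a deep queue with
listen( fd, 32 ), though the kernel silently truncates the request to
its own maximum, which is why the tuning commands below matter.
The other is draining the kernel's queue completely whenever the
listen socket is ready, along these lines (a sketch, not thttpd's
actual code; queue_connection() is a made-up name):

<BLOCKQUOTE><CODE><PRE>
#include &lt;sys/socket.h&gt;

extern void queue_connection( int fd );  /* hypothetical internal queue */

/* Move every pending connection from the kernel's short queue into
** our own larger one.  Assumes listen_fd has been set non-blocking,
** e.g. via fcntl( listen_fd, F_SETFL, O_NDELAY ).
*/
void
drain_accepts( int listen_fd )
    {
    for (;;)
        {
        int fd = accept( listen_fd, (struct sockaddr*) 0, (int*) 0 );
        if ( fd &lt; 0 )
            break;              /* kernel queue is empty */
        queue_connection( fd );
        }
    }
</PRE></CODE></BLOCKQUOTE>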

<P>
On Solaris systems you can increase the queue length with this command:

<BLOCKQUOTE><CODE><PRE>
/usr/sbin/ndd -set /dev/tcp tcp_conn_req_max 32
</PRE></CODE></BLOCKQUOTE>

You have to run this as root, of course.
This should go in the system startup script "/etc/rc2.d/S69inet".
You can raise it higher than 32
if you like - if you're running Solaris 2.5 you can increase it to
1024, otherwise the limit is 512.

<P>
On BSD/OS you use:

<BLOCKQUOTE><CODE><PRE>
/usr/bin/bpatch -l -r somaxconn 32
</PRE></CODE></BLOCKQUOTE>

Not sure what the maximum is here.

<P>
HP-UX 10.0 sets the default limit to 20, which is not too bad.

<P>
Many other systems also have tiny queue limits - if I find out specifics
on how to raise those limits, I'll put the info here.

<HR>

<H3><A NAME="aliasing">On IP aliasing:</A></H3>

<P>
If you want to do multi-homing but your OS's ifconfig program doesn't
have the alias command, you may still be able to get it to work.

<P>
If you're running Solaris 2.3 or later, it's just a matter of a
different user-interface on the ifconfig program.
The Solaris equivalent of the example in the thttpd man page would be:

<BLOCKQUOTE><CODE><PRE>
ifconfig le0 www.acme.com up
ifconfig le0:1 www.joe.acme.com up
ifconfig le0:2 www.jane.acme.com up
</PRE></CODE></BLOCKQUOTE>

Not so hard.
Still, it would be nice if Sun got with the program and
supported the alias command.
Maybe some day.

<P>
If you're running IRIX, you can use the PPP driver to add IP aliases.
This is complicated but does not require kernel hacking.
First you start up PPP commands for the aliased addresses.
Sticking once again with the example in the man page:

<BLOCKQUOTE><CODE><PRE>
/usr/etc/ppp -r 192.100.66.200 &
/usr/etc/ppp -r 192.100.66.201 &
</PRE></CODE></BLOCKQUOTE>

These commands will complain that they can't find the address - that's
ok, you just need them to start.
In fact if you like, you can kill them after they complain.
Next you point the aliased addresses at the real one, using ifconfig:

<BLOCKQUOTE><CODE><PRE>
ifconfig ppp0 192.100.66.200 192.100.66.10
ifconfig ppp0 192.100.66.201 192.100.66.10
</PRE></CODE></BLOCKQUOTE>

Next you have to tell ARP that all the IP addresses go to the same ethernet
address.
You will need the ethernet address for your system, which you can
get from the "netstat -ia" command - it's the bunch of hex digits separated
by colons.

<BLOCKQUOTE><CODE><PRE>
arp -s 192.100.66.10 08:00:20:09:0e:86 pub
arp -s 192.100.66.200 08:00:20:09:0e:86 pub
arp -s 192.100.66.201 08:00:20:09:0e:86 pub
</PRE></CODE></BLOCKQUOTE>

Finally, you have to add routes from the new PPP interfaces to localhost:

<BLOCKQUOTE><CODE><PRE>
route add 192.100.66.200 localhost 1
route add 192.100.66.201 localhost 1
</PRE></CODE></BLOCKQUOTE>

<P>
If you're running SunOS 4.1.x, fetch this tar file:
<A HREF="tppmsgs/msgs0.htm#26" tppabs="ftp://ftp.cerf.net/pub/vendor/peggy/vif.tar.gz">ftp://ftp.cerf.net/pub/vendor/peggy/vif.tar.gz</A>
It contains some netnews articles with instructions and source code
for adding a "virtual interface" device to the kernel.
Installing this stuff is not trivial.
There's also supposedly a way to use
a PPP driver under SunOS, as with IRIX above, but I haven't found
details on this yet.

<P>
If you're running Linux, here's a pointer to some kernel patches to
add IP aliasing:
<A HREF="ftp://ftp.mindspring.com/users/rsanders/ipalias/">ftp://ftp.mindspring.com/users/rsanders/ipalias/</A>
I'm not sure what version of Linux this is for.
Recent/future versions
of Linux may come with aliasing already installed, so check your ifconfig
man page before you start hacking.

<HR>

<H3><A NAME="developers">For HTTP developers:</A></H3>

<P>
The package includes a simple library that you could use for embedding
an HTTP server in your own application.
The interface is somewhat more
complicated than most applications would need, to enable the
multi-connection stuff in the main program, but it should still be
quite useful.

<P>
The package also contains a nice little timer library that could be used
for all sorts of stuff.
If you borrow this you will probably want to
make it do its own gettimeofday() calls - the only reason I'm passing
the time as a parameter is as an optimization, since the main program
already has the current time for other reasons.
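<P>
For instance, the change might look something like this hypothetical
wrapper (I'm making up the entry-point name):

<BLOCKQUOTE><CODE><PRE>
#include &lt;sys/time.h&gt;

extern void tmr_run( struct timeval* nowP );  /* stand-in name */

/* Fetch the time internally instead of making the caller pass it in. */
void
tmr_run_now( void )
    {
    struct timeval now;

    (void) gettimeofday( &amp;now, (struct timezone*) 0 );
    tmr_run( &amp;now );
    }
</PRE></CODE></BLOCKQUOTE>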

<P>
Plus there's a cute little filename matcher routine, and a general
symbolic-link-expander routine.

<HR>

<H3><A NAME="syslog">On syslog performance and security:</A></H3>

<P>
Syslog is the standard Unix logging mechanism.
It's very flexible, and lets you do things like log different programs to
different files and get real-time notification of critical events.
However, many programmers avoid using it, possibly because they
worry it adds too much overhead.
Well, I did some simple benchmarks comparing syslog logging against
the stdio logging that other web servers use.
Under conditions approximating an extremely busy web server, I found
that syslog was slower by only a few percent.
Under less strenuous conditions there was no measurable difference.

<P>
Another concern about syslog is security against external attacks.
It's written somewhat casually, using a fixed-size internal buffer
without overflow checking.
That makes it vulnerable to a buffer-overrun attack such as the one used
by the Morris Worm against fingerd.
However, it's easy to call syslog in such a way that this attack
becomes impossible - just put a maximum size on all the string
formatting codes you use.  For instance, use %.80s instead of %s.
Thttpd does this.
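<P>
In code, the safe call looks something like this (the variable names
are just for illustration):

<BLOCKQUOTE><CODE><PRE>
#include &lt;syslog.h&gt;

/* The precision in %.80s bounds each string, so a hostile client
** can't overrun syslog's fixed-size internal buffer.
*/
void
log_request( char* remote_host, char* request_line )
    {
    syslog( LOG_INFO, "%.80s \"%.80s\"", remote_host, request_line );
    }
</PRE></CODE></BLOCKQUOTE>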

<HR>
Back to the <A HREF="thttpd.html">thttpd page</A>.
<P>
<ADDRESS><A HREF="mailto:webmaster&#64;acme.com">ACME Labs Webmaster &lt;webmaster&#64;acme.com&gt;</A></ADDRESS>
</BODY>
</HTML>
