programs cannot be run at all. If you want to disable CGI as a security measure,
that's how you do it: just comment out the patterns in the config file and don't
run with the -c flag.
<P>Note: the current working directory when a CGI program gets run is the
directory that the CGI program lives in. This isn't in the CGI 1.1 spec, but
it's what most other HTTP servers do.
<P>Relevant config.h options: <A
href="http://www.acme.com/software/thttpd/options.html#CGI_PATTERN">CGI_PATTERN</A>,
<A
href="http://www.acme.com/software/thttpd/options.html#CGI_TIMELIMIT">CGI_TIMELIMIT</A>,
<A
href="http://www.acme.com/software/thttpd/options.html#CGI_NICE">CGI_NICE</A>,
<A
href="http://www.acme.com/software/thttpd/options.html#CGI_PATH">CGI_PATH</A>,
<A
href="http://www.acme.com/software/thttpd/options.html#CGI_LD_LIBRARY_PATH">CGI_LD_LIBRARY_PATH</A>,
<A
href="http://www.acme.com/software/thttpd/options.html#CGIBINDIR">CGIBINDIR</A>.
<H4><A name=BASIC_AUTHENTICATION>BASIC AUTHENTICATION</A></H4>
<P>Basic Authentication is available as an option at compile time. If enabled,
it uses a password file in the directory to be protected, called .htpasswd by
default. The file uses the familiar format of one record per line, each a
colon-separated username and encrypted password. The protection
does not carry over to subdirectories. The utility program <A
href="http://www.acme.com/software/thttpd/htpasswd_man.html">htpasswd(1)</A> is
included to help create and modify .htpasswd files.
<P>Relevant config.h option: <A
href="http://www.acme.com/software/thttpd/options.html#AUTH_FILE">AUTH_FILE</A>.
<H4><A name=THROTTLING>THROTTLING</A></H4>
<P>The throttle file lets you set maximum byte rates on URLs or URL groups. You
can optionally set a minimum rate too. The format of the throttle file is very
simple. A # starts a comment, and the rest of the line is ignored. Blank lines
are ignored. The rest of the lines should consist of a pattern, whitespace, and
a number. The pattern is a simple shell-style filename pattern, using ?, *, and
**, or multiple such patterns separated by |.
<P>The numbers in the file are byte rates, specified in units of bytes per
second. For comparison, a V.90 modem gives about 5000 B/s depending on
compression, a double-B-channel ISDN line about 12800 B/s, and a T1 line
about 150000 B/s. If you want to set a minimum rate as well, use the form
minimum-maximum.
<P>Example:
<BLOCKQUOTE><CODE><PRE># throttle file for www.acme.com
** 2000-100000 # limit total web usage to 2/3 of our T1,
# but never go below 2000 B/s
**.jpg|**.gif 50000 # limit images to 1/3 of our T1
**.mpg 20000 # and movies to even less
jef/** 20000 # jef's pages are too popular
</PRE></CODE></BLOCKQUOTE>
<P>Throttling is implemented by checking each incoming URL filename against all
of the patterns in the throttle file. The server accumulates statistics on how
much bandwidth each pattern has accounted for recently (via a rolling average).
If a URL matches a pattern that has been exceeding its specified limit, then the
data returned is actually slowed down, with pauses between each block. If that's
not possible (e.g. for CGI programs) or if the bandwidth has gotten way larger
than the limit, then the server returns a special code saying 'try again later'.
<P>The minimum rates are implemented similarly. If too many people are trying to
fetch something at the same time, throttling may slow down each connection so
much that it's not really usable. Furthermore, all those slow connections clog
up the server, using up file handles and connection slots. Setting a minimum
rate says that past a certain point you should not even bother - the server
returns the 'try again later' code and the connection isn't even started.
<P>There is no provision for setting a maximum connections/second throttle,
because throttling a request uses as much CPU as handling it, so there would be
no point. There is also no provision for throttling the number of simultaneous
connections on a per-URL basis. However, you can control the overall number of
connections for the whole server very simply, by setting the operating system's
per-process file descriptor limit before starting thttpd. Be sure to set the
hard limit, not the soft limit.
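<P>As a rough sketch of that last point, the limit can be lowered in a subshell
just before the server starts. The helper name run_with_fd_limit and the value
256 are illustrative assumptions, not part of thttpd. In common shells, running
ulimit -n with neither -H nor -S sets the hard and soft limits together.

```shell
# Hypothetical helper: run a command with a lowered file-descriptor limit.
# The subshell keeps the change from affecting the calling shell.
run_with_fd_limit() {
    limit=$1
    shift
    (
        # With neither -H nor -S, both the hard and soft limits are set.
        ulimit -n "$limit" || exit 1
        exec "$@"
    )
}

# Demonstration: show the hard limit that a child process inherits.
run_with_fd_limit 256 sh -c 'echo "hard limit: $(ulimit -H -n)"'

# Assumed invocation for a real server (the path is only an example):
# run_with_fd_limit 256 thttpd -d /usr/www
```

<P>The server also needs a few descriptors for its own use, such as the
listening socket and any log file, so the effective connection cap is somewhat
lower than the number given.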
<H4><A name=MULTIHOMING>MULTIHOMING</A></H4>
<P>Multihoming means using one machine to serve multiple hostnames. For
instance, if you're an internet provider and you want to let all of your
customers have customized web addresses, you might have www.joe.acme.com,
www.jane.acme.com, and your own www.acme.com, all running on the same physical
hardware. This feature is also known as "virtual hosts". There are three steps
to setting this up.
<P>One, make DNS entries for all of the hostnames. The current way to do this,
allowed by HTTP/1.1, is to use CNAME aliases, like so:
<BLOCKQUOTE><CODE><PRE>www.acme.com IN A 192.100.66.1
www.joe.acme.com IN CNAME www.acme.com
www.jane.acme.com IN CNAME www.acme.com
</PRE></CODE></BLOCKQUOTE>However, this is incompatible with older HTTP/1.0
browsers. If you want to stay compatible, there's a different way - use A
records instead, each with a different IP address, like so:
<BLOCKQUOTE><CODE><PRE>www.acme.com IN A 192.100.66.1
www.joe.acme.com IN A 192.100.66.200
www.jane.acme.com IN A 192.100.66.201
</PRE></CODE></BLOCKQUOTE>This is bad because it uses extra IP addresses, a
somewhat scarce resource. But if you want people with older browsers to be able
to visit your sites, you still have to do it this way.
<P>Step two. If you're using the modern CNAME method of multihoming, then you
can skip this step. Otherwise, using the older multiple-IP-address method, you
must set up IP aliases or multiple interfaces for the extra addresses. You can
use ifconfig(8)'s alias command to tell the machine to answer to all of the
different IP addresses. Example:
<BLOCKQUOTE><CODE><PRE>ifconfig le0 www.acme.com
ifconfig le0 www.joe.acme.com alias
ifconfig le0 www.jane.acme.com alias
</PRE></CODE></BLOCKQUOTE>If your OS's version of ifconfig doesn't have an alias
command, you're probably out of luck (but see the <A
href="http://www.acme.com/software/thttpd/notes.html#aliasing">notes</A>).
<P>Third and last, you must set up thttpd to handle the multiple hosts. The
easiest way is with the -v flag, or the ALWAYS_VHOST config.h option. This works
with either CNAME multihosting or multiple-IP multihosting. What it does is send
each incoming request to a subdirectory based on the hostname it's intended for.
All you have to do in order to set things up is to create those subdirectories
in the directory where thttpd will run. With the example above, you'd do
this:
<BLOCKQUOTE><CODE><PRE>mkdir www.acme.com www.joe.acme.com www.jane.acme.com
</PRE></CODE></BLOCKQUOTE>If you're using old-style multiple-IP multihosting,
you should also create symbolic links from the numeric addresses to the names,
like so:
<BLOCKQUOTE><CODE><PRE>ln -s www.acme.com 192.100.66.1
ln -s www.joe.acme.com 192.100.66.200
ln -s www.jane.acme.com 192.100.66.201
</PRE></CODE></BLOCKQUOTE>This lets the older HTTP/1.0 browsers find the right
subdirectory.
<P>There's an optional alternate step three if you're using multiple-IP
multihosting: run a separate thttpd process for each hostname, using the -h flag
to specify which one is which. This gives you more flexibility, since you can
run each of these processes in separate directories, with different throttle
files, etc. Example:
<BLOCKQUOTE><CODE><PRE>thttpd -r -d /usr/www -h www.acme.com
thttpd -r -d /usr/www/joe -u joe -h www.joe.acme.com
thttpd -r -d /usr/www/jane -u jane -h www.jane.acme.com
</PRE></CODE></BLOCKQUOTE>But remember, this multiple-process method does not
work with CNAME multihosting - for that, you must use a single thttpd process
with the -v flag.
<H4><A name=CUSTOM_ERRORS>CUSTOM ERRORS</A></H4>
<P>thttpd lets you define your own custom error pages for the various HTTP
errors. There's a separate file for each error number, all stored in one special
directory. The directory name is "errors", at the top of the web directory tree.
The error files should be named "errNNN.html", where NNN is the error number. So
for example, to make a custom error page for the authentication failure error,
which is number 401, you would put your HTML into the file "errors/err401.html".
If no custom error file is found for a given error number, then the usual
built-in error page is generated.
<P>If you're using the virtual hosts option, you can also have different custom
error pages for each different virtual host. In this case you put another
"errors" directory in the top of that virtual host's web tree. thttpd will look
first in the virtual host errors directory, and then in the server-wide errors
directory, and if neither of those has an appropriate error file then it will
generate the built-in error.
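<P>The layout and lookup order described above can be sketched as follows. This
is only an illustration run against a scratch directory. The helper
find_error_page is hypothetical and merely mirrors the documented order; it is
not thttpd's own code, and real error files would contain HTML rather than the
one-line text used here.

```shell
# Scratch directory standing in for the top of the web tree (assumption).
WEBROOT=$(mktemp -d)

# A server-wide page for 404, plus a 401 page specific to one virtual host.
mkdir -p "$WEBROOT/errors" "$WEBROOT/www.joe.acme.com/errors"
echo 'Sorry, that page does not exist.' > "$WEBROOT/errors/err404.html"
echo 'Please log in to see this page.' > "$WEBROOT/www.joe.acme.com/errors/err401.html"

# Hypothetical helper mirroring the documented lookup order:
# the virtual host's errors directory first, then the server-wide one.
find_error_page() {
    root=$1
    vhost=$2
    code=$3
    for dir in "$root/$vhost/errors" "$root/errors"; do
        if [ -f "$dir/err$code.html" ]; then
            echo "$dir/err$code.html"
            return 0
        fi
    done
    return 1    # nothing found: thttpd would generate its built-in page
}

find_error_page "$WEBROOT" www.joe.acme.com 401   # finds the per-host file
find_error_page "$WEBROOT" www.joe.acme.com 404   # falls back to the server-wide file
```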
<H4><A name=NON-LOCAL_REFERERS>NON-LOCAL REFERERS</A></H4>
<P>Sometimes another site on the net will embed your image files in their HTML
files, which basically means they're stealing your bandwidth. You can prevent
them from doing this by using non-local referer filtering. With this option,
certain files can only be fetched via a local referer. The files have to be
referenced by a local web page. If a web page on some other site references the
files, that fetch will be blocked. There are three config-file variables for
this feature:
<DL>
<DT>urlpat
<DD>A wildcard pattern for the URLs that should require a local referer. This
is typically just image files, sound files, and so on. For example:
<BLOCKQUOTE><CODE>urlpat=**.jpg|**.gif|**.au|**.wav </CODE></BLOCKQUOTE>For
most sites, that one setting is all you need to enable referer filtering.
<DT>noemptyreferers
<DD>By default, requests with no referer at all, or a null referer, or a
referer with no apparent hostname, are allowed. With this variable set, such
requests are disallowed.
<DT>localpat
<DD>A wildcard pattern that specifies the local host or hosts. This is used to
determine if the host in the referer is local or not. If not specified it
defaults to the actual local hostname. </DD></DL>
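<P>Put together, a config file using all three variables might look like the
one written by this sketch. The hostname pattern and the scratch file location
are illustrative assumptions. The commented-out invocation assumes thttpd's -C
flag for naming a config file.

```shell
# Write an example config file to a scratch location.
CONF=$(mktemp)
printf '%s\n' \
    'urlpat=**.jpg|**.gif|**.au|**.wav' \
    'noemptyreferers' \
    'localpat=*.acme.com' > "$CONF"
cat "$CONF"

# Assumed invocation, pointing thttpd at the config file:
# thttpd -C "$CONF"
```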
<H4><A name=SYMLINKS>SYMLINKS</A></H4>
<P>thttpd is very picky about symbolic links. Before delivering any file, it
first checks each element in the path to see if it's a symbolic link, and
expands them all out to get the final actual filename. Along the way it checks
for things like links with ".." that go above the server's directory, and
absolute symlinks (ones that start with a /). These are prohibited as security
holes, so the server returns an error page for them. This means you can't set up
your web directory with a bunch of symlinks pointing to individual users' home
web directories. Instead you do it the other way around - the user web
directories are real subdirs of the main web directory, and in each user's home
dir there's a symlink pointing to their actual web dir.
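<P>A minimal sketch of that arrangement, built under a scratch directory so it
is safe to run. The names users, joe, and public_html are assumptions chosen
for illustration.

```shell
# Scratch stand-in for the filesystem root.
BASE=$(mktemp -d)
WEBROOT=$BASE/usr/www          # example web tree
JOE_HOME=$BASE/home/joe        # example user home directory

# The real directory lives inside the web tree...
mkdir -p "$WEBROOT/users/joe" "$JOE_HOME"
# ...and the convenience symlink lives in the user's home, outside the tree,
# so thttpd never has to follow it.
ln -s "$WEBROOT/users/joe" "$JOE_HOME/public_html"

ls -ld "$WEBROOT/users/joe" "$JOE_HOME/public_html"
```

<P>The user edits files through the link in the home directory, while the
server only ever sees a real subdirectory.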
<P>The CGI pattern is also affected - it gets matched against the fully-expanded
filename. So, if you have a single CGI directory but then put a symbolic link in
it pointing somewhere else, that won't work. The CGI program will be treated as
a regular file and returned to the client, instead of getting run. This could be
confusing.
<H4><A name=PERMISSIONS>PERMISSIONS</A></H4>
<P>thttpd is also picky about file permissions. It wants data files (HTML,
images) to be world readable. Readable by the group that the thttpd process runs
as is not enough - thttpd checks explicitly for the world-readable bit. This is
so that no one ever gets surprised by a file that's not set world-readable and
yet somehow is readable by the HTTP server and therefore the *whole* world.
<P>The same logic applies to directories. As with the standard Unix "ls"
program, thttpd will only let you look at the contents of a directory if its
read bit is on; but as with data files, this must be the world-read bit, not
just the group-read bit.
<P>thttpd also wants the execute bit to be *off* for data files. A file that is
marked executable but doesn't match the CGI pattern might be a script or program
that got accidentally left in the wrong directory. Allowing people to fetch the
contents of the file might be a security breach, so this is prohibited. Of
course if an executable file *does* match the CGI pattern, then it just gets run
as a CGI.
<P>In summary, data files should be mode 644 (<CODE>rw-r--r--</CODE>),
directories should be 755 (<CODE>rwxr-xr-x</CODE>) if you want to allow indexing
and 711 (<CODE>rwx--x--x</CODE>) to disallow it, and CGI programs should be mode
755 (<CODE>rwxr-xr-x</CODE>) or 711 (<CODE>rwx--x--x</CODE>).
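<P>One way to apply the modes summarized above to a whole tree is sketched
below, using a scratch directory. It assumes that CGI programs live in a
directory named cgi-bin; what actually counts as CGI depends on your CGI
pattern.

```shell
# Scratch web tree with a data file, an image, and a CGI program.
WEBROOT=$(mktemp -d)
mkdir -p "$WEBROOT/images" "$WEBROOT/cgi-bin"
: > "$WEBROOT/index.html"
: > "$WEBROOT/images/logo.gif"
printf '%s\n' '#!/bin/sh' 'echo "Content-type: text/plain"' 'echo' 'echo hello' \
    > "$WEBROOT/cgi-bin/hello"

# Directories: 755 allows indexing (use 711 instead to disallow it).
find "$WEBROOT" -type d -exec chmod 755 {} +
# Data files: world-readable, execute bit off.
find "$WEBROOT" -type f -exec chmod 644 {} +
# CGI programs (assumed location): executable.
find "$WEBROOT/cgi-bin" -type f -exec chmod 755 {} +

ls -lR "$WEBROOT"
```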
<H4><A name=LOGS>LOGS</A></H4>
<P>thttpd does all of its logging via syslog(3). The facility it uses is
configurable. Aside from error messages, there are only a few log entry types of
interest, all fairly similar to CERN Common Log Format:
<BLOCKQUOTE><CODE><PRE>Aug 6 15:40:34 acme thttpd[583]: 165.113.207.103 - - "GET /file" 200 357
Aug 6 15:40:43 acme thttpd[583]: 165.113.207.103 - - "HEAD /file" 200 0
Aug 6 15:41:16 acme thttpd[583]: referer http://www.acme.com/ -> /dir
Aug 6 15:41:16 acme thttpd[583]: user-agent Mozilla/1.1N
</PRE></CODE></BLOCKQUOTE>The package includes a script for translating these
log entries into CERN-compatible files. Note that thttpd does not translate
numeric IP addresses into domain names. This is both to save time and as a minor
security measure (the numeric address is harder to spoof).
<P>Relevant config.h option: <A
href="http://www.acme.com/software/thttpd/options.html#LOG_FACILITY">LOG_FACILITY</A>.
<P>If you'd rather log directly to a file, you can use the -l command-line flag.
But note that error messages still go to syslog.
<H4><A name=SIGNALS>SIGNALS</A></H4>
<P>thttpd handles several signals, which you can send via the standard Unix
kill(1) command:
<DL>
<DT>INT,TERM
<DD>These signals tell thttpd to shut down immediately. Any requests in
progress get aborted.
<DT>USR1
<DD>This signal tells thttpd to shut down as soon as it's done servicing all
current requests. In addition, the network socket it uses to accept new
connections gets closed immediately, which means a fresh thttpd can be started
up immediately.
<DT>USR2
<DD>This signal tells thttpd to generate the statistics syslog messages
immediately, instead of waiting for the regular hourly update.
<DT>HUP
<DD>This signal tells thttpd to close and re-open its (non-syslog) log file,
for instance if you rotated the logs and want it to start using the new one.
Setting this up correctly is a little tricky (for instance, if you are using
chroot() then the log file must be within the chroot tree), but it's definitely
doable. </DD></DL>
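<P>The signals above are usually sent to the process ID recorded in a pid file.
The following is a small sketch of that, not part of the thttpd package. The
helper name signal_thttpd and the file paths are assumptions, and the commented
usage assumes the server was started with the -i flag so that it wrote a pid
file.

```shell
# Hypothetical helper: send a signal to the process named in a pid file.
signal_thttpd() {
    sig=$1
    pidfile=$2
    pid=$(cat "$pidfile") || return 1
    kill -s "$sig" "$pid"
}

# Assumed usage, if thttpd was started with -i /var/run/thttpd.pid:
# signal_thttpd USR1 /var/run/thttpd.pid     # shut down after current requests
# signal_thttpd USR2 /var/run/thttpd.pid     # log statistics now
# mv /var/log/thttpd.log /var/log/thttpd.log.0
# signal_thttpd HUP /var/run/thttpd.pid      # re-open the log after rotation
```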
<H4><A name=SEE_ALSO>SEE ALSO</A></H4>
<P><A
href="http://www.acme.com/software/thttpd/redirect_man.html">redirect(8)</A>, <A
href="http://www.acme.com/software/thttpd/ssi_man.html">ssi(8)</A>, <A
href="http://www.acme.com/software/thttpd/makeweb_man.html">makeweb(1)</A>, <A
href="http://www.acme.com/software/thttpd/htpasswd_man.html">htpasswd(1)</A>, <A
href="http://www.acme.com/software/thttpd/syslogtocern_man.html">syslogtocern(8)</A>,
<A href="http://www.acme.com/software/weblog_parse/">weblog_parse(1)</A>, <A
href="http://www.acme.com/software/http_get/">http_get(1)</A>
<H4><A name=THANKS>THANKS</A></H4>
<P>Many thanks to reviewers and testers: John LoVerso, Jordan Hayes, Chris
Torek, Jim Thompson, Barton Schaffer, Geoff Adams, Dan Kegel, John Hascall,
Bennett Todd, KIKUCHI Takahiro, Catalin Ionescu. Special thanks to Craig Leres
for substantial debugging and development, and for not complaining about my
coding style very much.
<H4><A name=AUTHOR>AUTHOR</A></H4>
<P>Copyright