http:^^www.cs.washington.edu^homes^dbj^scripts.html

Date: Tue, 10 Dec 1996 22:42:47 GMT
Server: NCSA/1.4.2
Content-type: text/html
Last-modified: Tue, 04 Oct 1994 03:39:12 GMT
Content-length: 7901

<html>
<head>
<title>Guidelines For Writing HTTP Server Scripts</title>
</head>
<body>
<h1>Guidelines For Writing HTTP Server Scripts</h1>
<hr>
This document is intended to be an evolving series of ideas, pointers
and other information about writing programs that can be executed by
following a WWW link, with particular emphasis on security issues.
<p>
The rest of this document assumes that you already know how to write
programs. It also doesn't attempt to cover the same ground as
<a href="http://south.ncsa.uiuc.edu/forms.html">NCSA's introduction to
forms</a>, which should be considered the starting point for
explorations of forms. You should also be sure to read the
<a href="http://hoohoo.ncsa.uiuc.edu/cgi/">Common Gateway Interface
documentation</a>, which describes the interface that the HTTP
server defines between HTML messages and server-side executables.
<h2>Security</h2>
The potential problems with security cannot be overemphasized. Unlike
existing network protocols, which generally allow either:
<ul>
<li> specified users to execute arbitrary code
<li> arbitrary users to execute specified code
</ul>
the existence of server-side scripts effectively permits
<ul>
<li> arbitrary users to execute arbitrary code
</ul>
although clearly, the scope of "arbitrary" in this case is at least
somewhat reduced (specifically: arbitrary users can execute any
program installed in a ScriptAlias location prior to the last HTTP
server restart).
<p>
There are some basic features of server-side scripts that, if used
correctly, will minimize the potential for security problems:
<ul>
<li><em>NEVER, EVER</em> treat data received from a remote source as
    instructions to be executed.
<li><em>NEVER, EVER</em> assume that necessary arguments are present,
    and exit gracefully if they are missing.
    <em>ALWAYS</em> check that such arguments are in fact present.
<li><em>ALWAYS</em> have the program exit gracefully if it receives
    arguments it does not understand.
<li><em>NEVER</em> assume the size of the arguments or the data received
    by the program. Always check the size of these objects against what
    you expect, or know that the interpreter you are using (awk or sh,
    for example) will make sane, secure decisions if an overflow occurs.
<li><em>NEVER</em> trust any claimed remote identity (HTTP does not
    currently support anything more secure than passwords, which are
    not very secure at all).
</ul>
<h2>The Basic Idea</h2>
When you follow a link using a URL of the form:
<p>
<pre>
	http://foo.bar.baz/a/b/c
</pre>
the HTTP server at foo.bar.baz will check each successively longer
prefix of <code>/a/b/c</code> (i.e. <code>/a</code>,
<code>/a/b</code>, etc.) against the list of "ScriptAliases" defined
in the server's configuration files. A ScriptAlias looks like this:
<pre>
ScriptAlias /a /some/other/place/in/the/filesystem/a
</pre>
which the server interprets to mean: if anyone ever references
<code>/a/something</code>, then execute
<code>/some/other/place/in/the/filesystem/a</code> and return its
output. Note that this implies two things about the executed program:
it must send a MIME Content-Type header as its first line of output,
to tell the client (Mosaic) what the output actually is (HTML? GIF?
JPEG? etc.), and then it should send some "useful" output, even if it's
only an "OK, message received" line. See the <code>mail-request</code>
program mentioned below for an example of how to do this.
<p>
<em>Only</em> programs located in places referenced by a ScriptAlias
will ever be executed by the server. In addition, the server caches a
directory listing of all the programs in each location referenced in a
ScriptAlias whenever it is started (or
<a href="http://hoohoo.ncsa.uiuc.edu/docs/setup/admin/KillIt.html">restarted</a>),
and uses it to check possible server-side programs before executing
them.
This prevents random programs placed in the right place from
being accessed without a server restart (which only a privileged user
can do).
<p>
<h2>What about arguments? What about input?</h2>
Once the HTTP server has discovered that <code>/a/b</code> is actually
an executable program in a ScriptAlias location, it executes the
program, passing it data in two ways.
<p>
First of all, any text left over from the URL that has not been "used"
to find the script will be used to set the value of an environment
variable named PATH_INFO. In the example above, this would be relatively
simple: PATH_INFO would just be <code>/c</code>. However, nearly
arbitrary text can be used here:
<p>
<pre>
http://foo.bar.baz/a/b/long=4748.39?//limit:=$!!:h+aposto:*&%&^$$#{fhfh}
</pre>
This will result in PATH_INFO being set to:
<p>
<pre>
/long=4748.39?//limit:=$!!:h+aposto:*&%&^$$#{fhfh}
</pre>
(note the initial `/'). The main restriction is that spaces are not
allowed, or rather, will terminate the component of the URL used to
set PATH_INFO.
<p>
In addition, if you are using a forms interface, the values of all the
<code>&lt;input&gt;</code> and <code>&lt;select&gt;</code> tags in the
form will be made available as the standard input of the
program.
<p>
<h3>Encoding</h3>
This form data will be
<a href="http://info.cern.ch/hypertext/WWW/Addressing/URL/4_Recommentations.html#z1">encoded</a>
to guarantee safe transmission. This encoding is an
important issue, because to make reasonable use of the data sent to
your program, you need to decode it. Fortunately, you don't <em>have</em> to
do this yourself. For the time being, a filter called
<code>urldecode</code> can be used by your own programs (easily if
they are shell, awk or perl scripts) to do the decoding.
Invoke it as:
<pre>
/cse/www/htbin-post/urldecode
</pre>
and it will convert any encoded data read from its standard input into
its original form on its standard output.
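<p>
To give a feel for what the decoding step involves, here is a minimal
sketch in Python. It is not the <code>urldecode</code> filter itself:
it relies on Python's standard <code>urllib.parse</code> module, and the
names <code>decode_form</code> and <code>MAX_FORM_BYTES</code> (and the
4096 limit) are hypothetical, chosen only for illustration. It also
refuses oversized input, in line with the size-checking advice in the
Security section.
<pre>
```python
from urllib.parse import parse_qsl

# Hypothetical upper bound on accepted form data; pick one that suits your form.
MAX_FORM_BYTES = 4096


def decode_form(raw, max_len=MAX_FORM_BYTES):
    """Decode URL-encoded form data into a dict mapping field name to value.

    Assumption: `raw` is the encoded text the script received, as a str.
    If a field name is repeated, the last value wins.
    """
    # Never assume the size of the input: reject anything unexpectedly large.
    if len(raw) > max_len:
        raise ValueError("form data too large")

    fields = {}
    # parse_qsl splits the name=value pairs, turns '+' into a space and
    # expands %XX escapes; keep_blank_values retains fields left empty.
    for name, value in parse_qsl(raw, keep_blank_values=True):
        fields[name] = value
    return fields
```
</pre>
For example, <code>decode_form("name=Jane+Doe")</code> yields a
dictionary mapping <code>name</code> to <code>Jane Doe</code>. A script
would still need to check that every field it requires is actually
present before using the result.
<p>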
At some point, I'll add an
object module you can link with to do this from a compiled language
like C (although you may get there before me, since the encoding is so
simple).
<p>
More details about writing server scripts are available in the
<a href="http://hoohoo.ncsa.uiuc.edu/cgi/">Common Gateway Interface
documentation</a>, which describes a number of other environment
variables that are available to the program.
<p>
<h2>How to do this locally</h2>
Currently, only two areas have been named using ScriptAliases. One of
them is not currently publicly accessible (the filesystem it
resides on is not exported to the rest of the department's machines).
The other is intended for use by participants in the 590i seminar
taking place this quarter:
<pre>
	/projects/ai/590i/post-bin
</pre>
An example program, called <code>mail-request</code>, is already there
(it's a shell/awk script). This is the program I use for the interface
to my <a href=/homes/pauld/music>music collection</a>, so take a look
at the HTML source for that stuff to see how this is used. I intended
it to be usable by anyone else, and for a variety of
purposes. Suggestions are welcome.
<p>
The HTTP daemon will be restarted about 4 times a day, and on the next
restart after you have placed a program there, you will be able to have a
link to it result in its execution. After that, you can keep changing
the program in any way you see fit, and the daemon won't care: it
merely notes presence or absence.
<p>
<em>I want to reiterate that this is potentially a big security issue.
Please take care in how you handle arguments, how you handle input and
what your program does or might do</em>.
<p>
For the time being, all instances of these programs will run as the
uid "nobody". Also, access to both areas (the private one and the 590i
area) is currently limited to machines in the .cs.washington.edu
domain.
This restriction is inconvenient, and is intended to create a
temporary breathing space so that we can get more experience with
potential security issues.
<p>
</body>
<address>webmaster@cs.washington.edu</address>
</html>
