can be handled by a normal user. If you intend to use port 210, you don't have to
specify the number in the command line, although the <TT>-p</TT> option still must
be used.</P>
<P>If you want to let <TT>inetd</TT> handle the <TT>waisserver</TT> startup, you
need to ensure that the file <TT>/etc/services</TT> has an entry for WAIS. The line
in the <TT>/etc/services</TT> file looks like</P>
<PRE><FONT COLOR="#0066FF">z3950 210/tcp #WAIS
</FONT></PRE>
<P>where <TT>210</TT> is the port number WAIS uses, and <TT>tcp</TT> is the protocol.
After modifying or verifying the entry in <TT>/etc/services</TT>, you need to add
a WAIS entry to the <TT>inetd.conf</TT> file to start up <TT>waisserver</TT> whenever
a request is received on port 210 (or whatever other port you are using). The entry
looks like</P>
<PRE><FONT COLOR="#0066FF">z3950 stream tcp nowait root /usr/local/bin/waisserver waisserver.d -u username -d /usr/wais/wais_index
</FONT></PRE>
<P>where the options are the same as for the command-line startup described previously.
The daemon name <TT>waisserver.d</TT> is used instead of <TT>waisserver</TT> when starting
up in <TT>inetd</TT> mode. Again, you can use the <TT>-e</TT> option to log activity
to a file.</P>
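<P>As a rough illustration of how the two entries fit together, the following Python sketch splits an <TT>/etc/services</TT> line and an <TT>inetd.conf</TT> line into their fields. The function names are made up for this example and are not part of freeWAIS. The sketch assumes the <TT>inetd.conf</TT> entry is written on one line with whitespace between every field, including between the user (<TT>root</TT>) and the program path.</P>

```python
def parse_services_line(line):
    """Split an /etc/services line into (name, port, protocol)."""
    entry = line.split("#", 1)[0].split()   # drop the trailing comment
    if len(entry) < 2:
        return None
    port, protocol = entry[1].split("/")
    return entry[0], int(port), protocol


def parse_inetd_line(line):
    """Split an inetd.conf line into its six fixed fields plus arguments."""
    fields = line.split()
    if len(fields) < 6:
        raise ValueError("expected at least 6 fields, got %d" % len(fields))
    keys = ("service", "socket_type", "protocol", "wait", "user", "program")
    entry = dict(zip(keys, fields[:6]))
    entry["args"] = fields[6:]   # args[0] is the name the program runs under
    return entry


services = parse_services_line("z3950 210/tcp #WAIS")
inetd = parse_inetd_line(
    "z3950 stream tcp nowait root /usr/local/bin/waisserver "
    "waisserver.d -u username -d /usr/wais/wais_index"
)
print(services)
print(inetd["program"], inetd["args"])
```

<P>The service name in the first field of the <TT>inetd.conf</TT> entry must match the name in <TT>/etc/services</TT>. The first element of <TT>args</TT> is the name the program is started under, which is where <TT>waisserver.d</TT> appears.</P>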
<CENTER>
<H3><A NAME="Heading10"></A><FONT COLOR="#000077">Building Your WAIS Indexes</FONT></H3>
</CENTER>
<P>Once you have the freeWAIS server ready to run and everything seems to be working,
it's time to provide some content for your WAIS system. Usually, documents are the
primary source of information for WAIS, although you can index any type of file.
The key step to providing WAIS service is to build the WAIS index using the <TT>waisindex</TT>
command. The <TT>waisindex</TT> command can be a bit obscure at times, but a little
practice and some trial and error will help you master its somewhat awkward
behavior.</P>
<P>The <TT>waisindex</TT> program works by examining all the data in the files for
which you want to create an index. From its examination, <TT>waisindex</TT> usually
generates seven different index files (depending on the content and your commands).
Between them, these files record the unique words found in the documents and where
they occur. The index files together form one large database, often called the "source" (or
"WAIS source"). Whenever a WAIS client submits a search, the search
strings are compared to the source, and the results are displayed with a relevance ranking
(the match score).</P>
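<P>The idea of extracting unique words ahead of time and scoring matches against them can be sketched in a few lines of Python. This is a toy model only: the function names are invented for the example, the score is a plain occurrence count, and nothing here reflects the actual file formats or ranking formula used by <TT>waisindex</TT>.</P>

```python
import re
from collections import defaultdict


def build_index(documents):
    """Map each unique lowercase word to {document name: occurrence count}."""
    index = defaultdict(dict)
    for name, text in documents.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word][name] = index[word].get(name, 0) + 1
    return index


def search(index, query):
    """Score each document by how often the query words occur in it."""
    scores = defaultdict(int)
    for word in re.findall(r"[a-z0-9]+", query.lower()):
        for name, count in index.get(word, {}).items():
            scores[name] += count
    # Highest score first; ties are broken by name for stable output.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


documents = {
    "intro.txt": "Linux is a free operating system kernel",
    "wais.txt": "WAIS indexes documents so a search is fast on Linux and Linux clones",
}
index = build_index(documents)
print(search(index, "linux search"))
```

<P>Because the word list is built once, a search only has to look up each query word instead of rereading every document.</P>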
<DL>
<DD>
<HR>
<A NAME="Heading11"></A><FONT COLOR="#000077"><B>NOTE: </B></FONT><FONT COLOR="#000000">T</FONT>he
use of <TT>waisindex</TT> enables a client search to proceed much more quickly because
the keywords in the data files have already been extracted. However, the mass of
data in the index files can be sizable, so allow plenty of disk space for a WAIS
server to work with. (For a typical WAIS site, assume at least double the amount
of room needed for the source files.)
<HR>
</DL>
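<P>The rule of thumb in the note can be turned into a quick estimate. The Python sketch below adds up the size of the files under a directory and multiplies by two. It assumes that "double" means the source files plus an equal amount of space for the index files; the function names and the default factor are choices made for this example, not values taken from freeWAIS.</P>

```python
import os


def source_bytes(root):
    """Total size in bytes of all regular files under root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            if os.path.isfile(path):
                total += os.path.getsize(path)
    return total


def space_budget(source_size, factor=2):
    """Bytes to reserve for the source files and their indexes together."""
    return source_size * factor


# Estimate for the current directory; substitute your own document tree.
print(space_budget(source_bytes(".")))
```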
<CENTER>
<H4><A NAME="Heading12"></A><FONT COLOR="#000077">WAIS Index Files</FONT></H4>
</CENTER>
<P>Most of the freeWAIS index files are not human-readable (although one
or two files can be read with some success). Usually, <TT>waisindex</TT> creates
seven index files, although the number may vary depending on requirements. Each index
file has a specific file extension to show its purpose, based on a root name (specified
on the <TT>waisindex</TT> command line, or defaulting to <TT>index</TT>). The index
files and their purposes are listed here:</P>
<P><TT>index.doc</TT> A document file that
contains a table with the filename, a headline (title) from the file, the location
of the first and last characters of an entry, the length of the document, the number
of lines in the document, and the time and date the document was created.</P>
<P><TT>index.dct</TT> A dictionary file that contains a list of every unique word
in the files cross-indexed to the inverted file.</P>
<P><TT>index.fn</TT> A filename file that contains a table with a list of the filenames,
the date they were created in the index, and the type of file.</P>
<P><TT>index.hl</TT> A headline file that contains a table of all headlines (titles).
The headline is displayed in the search output when a match occurs.</P>
<P><TT>index.inv</TT> Inverted files that contain a table associating every unique
word in all the files with a pointer to the files themselves and the word's importance
(determined by how close the word is to the start of the file, the number of times
the word occurs in the document, and the percentage of times the word appears in
the document).</P>
<P><TT>index.src</TT> A source description file that contains descriptions of the
information indexed, including the host name and IP address, the port watched by
WAIS, the source filename, any cost information for the service, the headline of
the service, a description of the source, and the e-mail address of the administrator.
The source description file can be edited with any ASCII text editor. We will look at this file
in a little more detail shortly.</P>
<P><TT>index.status</TT> A status file containing user-defined information.</P>
<P>The source description file is a standard ASCII file that is read by <TT>waisindex</TT>
at intervals to see if information has changed. If the changes are significant, <TT>waisindex</TT>
updates its internal information. A sample source file looks like this:</P>
<PRE><FONT COLOR="#0066FF">(:source
:version 2
:ip-address "147.120.0.10"
:ip-name "wizard.tpci.com"
:tcp-port 210
:database-name "Linux stuff"
:cost 0.00
:cost-unit :free
:maintainer "wais_help@tpci.com"
:subjects "Everything you need to know about Linux"
:description "If you need to know something about Linux, it's here."
)
</FONT></PRE>
<P>You should edit this file when you set up freeWAIS because the default descriptions
are sparse and of little use to anyone browsing for sources.</P>
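<P>If you want to check an edited source description before publishing it, a short script can pull out the keyword and value pairs. The following Python sketch is one way to do that. It is not part of freeWAIS, the function name is invented for the example, and it only handles values that fit on a single line (a long <TT>:description</TT> that spans several lines would need more work). It also tolerates a stray colon after the keyword.</P>

```python
import re

SAMPLE = '''(:source
:version 2
:ip-address "147.120.0.10"
:ip-name "wizard.tpci.com"
:tcp-port 210
:database-name "Linux stuff"
:cost 0.00
:cost-unit :free
:maintainer "wais_help@tpci.com"
)'''


def parse_source_description(text):
    """Return the :keyword value pairs of a source description as a dict."""
    fields = {}
    # A keyword, an optional trailing colon, then a quoted string or bare token.
    pattern = re.compile(r'^\s*:([\w-]+):?\s+(?:"([^"]*)"|(\S+))\s*$')
    for line in text.splitlines():
        match = pattern.match(line)
        if match:
            key, quoted, bare = match.groups()
            fields[key] = quoted if quoted is not None else bare
    return fields


fields = parse_source_description(SAMPLE)
for key in ("ip-name", "tcp-port", "maintainer"):
    print(key, "=", fields.get(key, "(missing)"))
```

<P>All values come back as strings, so a field such as <TT>tcp-port</TT> needs converting before you compare it with the port number in <TT>/etc/services</TT>.</P>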
<CENTER>
<H4><A NAME="Heading13"></A><FONT COLOR="#000077">The waisindex Command</FONT></H4>
</CENTER>
<P>The <TT>waisindex</TT> command accepts a number of options, some of which you have
seen earlier in this chapter. The following list contains the primary <TT>waisindex</TT>
options of interest to most users:<BR>
<TABLE BORDER="0" WIDTH="100%">
<TR>
<TD WIDTH="9%" VALIGN="TOP">
<UL>
<LI> 
</UL>
</TD>
<TD WIDTH="15%" VALIGN="TOP"><TT>-a</TT></TD>
<TD WIDTH="76%" VALIGN="TOP">Appends data to an existing index file (used to update index files instead of regenerating
them each time a new document is added).</TD>
</TR>
<TR>
<TD WIDTH="9%" VALIGN="TOP">
<UL>
<LI> 
</UL>
</TD>
<TD WIDTH="15%" VALIGN="TOP"><TT>-contents</TT></TD>
<TD WIDTH="76%" VALIGN="TOP">Indexes the file contents (default action).</TD>
</TR>
<TR>
<TD WIDTH="9%" HEIGHT="27" VALIGN="TOP">
<UL>
<LI> 
</UL>
</TD>
<TD WIDTH="15%" HEIGHT="27" VALIGN="TOP"><TT>-d</TT></TD>
<TD WIDTH="76%" HEIGHT="27" VALIGN="TOP">Gives the filename root for index files (for example, <TT>-d /usr/wais/foo</TT> names
all index files <TT>/usr/wais/foo.</TT><I>xxx</I>, where <I>xxx</I> is the file extension).</TD>
</TR>
<TR>
<TD WIDTH="9%" VALIGN="TOP">
<UL>
<LI> 
</UL>
</TD>
<TD WIDTH="15%" VALIGN="TOP"><TT>-e</TT></TD>
<TD WIDTH="76%" VALIGN="TOP">Gives the name of the log file for error information (the default is <TT>stderr</TT>, usually
the console, although you can specify <TT>-s</TT> for <TT>/dev/null</TT>).</TD>
</TR>
<TR>
<TD WIDTH="9%" VALIGN="TOP">
<UL>
<LI> 
</UL>
</TD>
<TD WIDTH="15%" VALIGN="TOP"><TT>-export</TT></TD>
<TD WIDTH="76%" VALIGN="TOP">Adds the host name and TCP port to descriptions for easier Internet access.</TD>
</TR>
<TR>
<TD WIDTH="9%" VALIGN="TOP">
<UL>
<LI> 
</UL>
</TD>
<TD WIDTH="15%" VALIGN="TOP"><TT>-l</TT></TD>
<TD WIDTH="76%" VALIGN="TOP">Gives the level of log messages. Valid values are <TT>0</TT>--no log, <TT>1</TT>--log
only high priority errors and warnings, <TT>5</TT>--log medium priority errors and
warnings, as well as index filename information, and <TT>10</TT>--log every event.</TD>
</TR>
<TR>
<TD WIDTH="9%" VALIGN="TOP">
<UL>
<LI> 
</UL>
</TD>
<TD WIDTH="15%" VALIGN="TOP"><TT>-M</TT></TD>
<TD WIDTH="76%" VALIGN="TOP">Links multiple types of files.</TD>
</TR>
<TR>
<TD WIDTH="9%" VALIGN="TOP">
<UL>
<LI> 
</UL>
</TD>
<TD WIDTH="15%" VALIGN="TOP"><TT>-mem</TT></TD>
<TD WIDTH="76%" VALIGN="TOP">Limits memory usage during indexing (the higher the number specified, the faster
the indexing process and the more memory used).</TD>
</TR>
<TR>
<TD WIDTH="9%" VALIGN="TOP">
<UL>
<LI> 
</UL>
</TD>
<TD WIDTH="15%" VALIGN="TOP"><TT>-nocontents</TT></TD>
<TD WIDTH="76%" VALIGN="TOP">Prevents the file contents from being indexed (indexes only the document header and filename).</TD>
</TR>
<TR>
<TD WIDTH="9%" VALIGN="TOP">
<UL>
<LI> 
</UL>
</TD>
<TD WIDTH="15%" VALIGN="TOP"><TT>-nopairs</TT></TD>
<TD WIDTH="76%" VALIGN="TOP">Instructs <TT>waisindex</TT> not to index adjacent capitalized words
together.</TD>
</TR>
<TR>
<TD WIDTH="9%" VALIGN="TOP">
<UL>
<LI> 
</UL>
</TD>
<TD WIDTH="15%" VALIGN="TOP"><TT>-nopos</TT></TD>
<TD WIDTH="76%" VALIGN="TOP">Ignores the location of keywords in a document when determining scores.</TD>
</TR>
<TR>
<TD WIDTH="9%" VALIGN="TOP">
<UL>
<LI> 
</UL>
</TD>
<TD WIDTH="15%" VALIGN="TOP"><TT>-pairs</TT></TD>
<TD WIDTH="76%" VALIGN="TOP">Indexes adjacent capitalized words as a single entry.</TD>
</TR>
<TR>
<TD WIDTH="9%" VALIGN="TOP">
<UL>
<LI> 
</UL>
</TD>
<TD WIDTH="15%" VALIGN="TOP"><TT>-pos</TT></TD>
<TD WIDTH="76%" VALIGN="TOP">Determines scores based on locations of keywords (proximity of keywords increases
scores).</TD>
</TR>
<TR>
<TD WIDTH="9%" VALIGN="TOP">
<UL>
<LI> 
</UL>
</TD>
<TD WIDTH="15%" VALIGN="TOP"><TT>-r</TT></TD>
<TD WIDTH="76%" VALIGN="TOP">Indexes subdirectories recursively.</TD>
</TR>
<TR>
<TD WIDTH="9%" VALIGN="TOP">
<UL>
<LI> 
</UL>
</TD>
<TD WIDTH="15%" VALIGN="TOP"><TT>-register</TT></TD>
<TD WIDTH="76%" VALIGN="TOP">Registers your indexes with the WAIS Directory of Services.</TD>
</TR>
<TR>
<TD WIDTH="9%" VALIGN="TOP">
<UL>
<LI> 
</UL>
</TD>
<TD WIDTH="15%" VALIGN="TOP"><TT>-stdin</TT></TD>
<TD WIDTH="76%" VALIGN="TOP">Reads filenames from standard input (usually the keyboard) instead of from the command line.</TD>
</TR>
<TR>
<TD WIDTH="9%" VALIGN="TOP">
<UL>