<LI> 
</UL>
</TD>
<TD WIDTH="15%" VALIGN="TOP"><TT>-stop</TT></TD>
<TD WIDTH="76%" VALIGN="TOP">Names a file containing stopwords (words too common to be indexed). The default list is usually defined
in <TT>src/ir/stoplist.c</TT>.</TD>
</TR>
<TR>
<TD WIDTH="9%" VALIGN="TOP">
<UL>
<LI> 
</UL>
</TD>
<TD WIDTH="15%" VALIGN="TOP"><TT>-t</TT></TD>
<TD WIDTH="76%" VALIGN="TOP">Data file type indicator.</TD>
</TR>
<TR>
<TD WIDTH="9%" VALIGN="TOP">
<UL>
<LI> 
</UL>
</TD>
<TD WIDTH="15%" VALIGN="TOP"><TT>-T</TT></TD>
<TD WIDTH="76%" VALIGN="TOP">Sets the type of data to whatever follows.</TD>
</TR>
</TABLE>
<BR>
The <TT>waisindex</TT> program must be told the type of information in a file;
otherwise, it may not be able to generate an index properly. freeWAIS currently
defines many file types, and you can display them by entering the command</P>
<PRE><FONT COLOR="#0066FF">waisindex
</FONT></PRE>
<P>with no arguments. Although freeWAIS supports many different types, only
a few are in common use. The most common file types are the following:<BR>
<TABLE BORDER="0" WIDTH="100%">
<TR>
<TD WIDTH="9%" VALIGN="TOP">
<UL>
<LI> 
</UL>
</TD>
<TD WIDTH="15%" VALIGN="TOP"><TT>filename</TT></TD>
<TD WIDTH="76%" VALIGN="TOP">Same as text, except the filename is used as the headline.</TD>
</TR>
<TR>
<TD WIDTH="9%" VALIGN="TOP">
<UL>
<LI> 
</UL>
</TD>
<TD WIDTH="15%" VALIGN="TOP"><TT>first_line</TT></TD>
<TD WIDTH="76%" VALIGN="TOP">Same as text, except the first line in the file is used as the headline.</TD>
</TR>
<TR>
<TD WIDTH="9%" HEIGHT="27" VALIGN="TOP">
<UL>
<LI> 
</UL>
</TD>
<TD WIDTH="15%" HEIGHT="27" VALIGN="TOP"><TT>ftp</TT></TD>
<TD WIDTH="76%" HEIGHT="27" VALIGN="TOP">Contains FTP code that users can use to retrieve information from another machine.</TD>
</TR>
<TR>
<TD WIDTH="9%" VALIGN="TOP">
<UL>
<LI> 
</UL>
</TD>
<TD WIDTH="15%" VALIGN="TOP"><TT>GIF</TT></TD>
<TD WIDTH="76%" VALIGN="TOP">GIF images, one image per file. The filename is used as the headline.</TD>
</TR>
<TR>
<TD WIDTH="9%" VALIGN="TOP">
<UL>
<LI> 
</UL>
</TD>
<TD WIDTH="15%" VALIGN="TOP"><TT>mail_or_rmail</TT></TD>
<TD WIDTH="76%" VALIGN="TOP">Indexes the <TT>mbox</TT> mailbox contents as individual items.</TD>
</TR>
<TR>
<TD WIDTH="9%" VALIGN="TOP">
<UL>
<LI> 
</UL>
</TD>
<TD WIDTH="15%" VALIGN="TOP"><TT>mail_digest</TT></TD>
<TD WIDTH="76%" VALIGN="TOP">Standard e-mail, indexed as individual messages. The subject field is the headline.</TD>
</TR>
<TR>
<TD WIDTH="9%" VALIGN="TOP">
<UL>
<LI> 
</UL>
</TD>
<TD WIDTH="15%" VALIGN="TOP"><TT>netnews</TT></TD>
<TD WIDTH="76%" VALIGN="TOP">Standard Usenet news, each article a separate item. The subject field is the headline.</TD>
</TR>
<TR>
<TD WIDTH="9%" VALIGN="TOP">
<UL>
<LI> 
</UL>
</TD>
<TD WIDTH="15%" VALIGN="TOP"><TT>one_line</TT></TD>
<TD WIDTH="76%" VALIGN="TOP">Indexes each line in a document separately.</TD>
</TR>
<TR>
<TD WIDTH="9%" VALIGN="TOP">
<UL>
<LI> 
</UL>
</TD>
<TD WIDTH="15%" VALIGN="TOP"><TT>PICT</TT></TD>
<TD WIDTH="76%" VALIGN="TOP">PICT image, one image per file. The filename is used as the headline.</TD>
</TR>
<TR>
<TD WIDTH="9%" VALIGN="TOP">
<UL>
<LI> 
</UL>
</TD>
<TD WIDTH="15%" VALIGN="TOP"><TT>ps</TT></TD>
<TD WIDTH="76%" VALIGN="TOP">A PostScript file with one document per file.</TD>
</TR>
<TR>
<TD WIDTH="9%" VALIGN="TOP">
<UL>
<LI> 
</UL>
</TD>
<TD WIDTH="15%" VALIGN="TOP"><TT>text</TT></TD>
<TD WIDTH="76%" VALIGN="TOP">Indexes the file as one document, using the pathname as the headline.</TD>
</TR>
<TR>
<TD WIDTH="9%" VALIGN="TOP">
<UL>
<LI> 
</UL>
</TD>
<TD WIDTH="15%" VALIGN="TOP"><TT>TIFF</TT></TD>
<TD WIDTH="76%" VALIGN="TOP">TIFF image, one image per file. The filename is used as the headline.</TD>
</TR>
</TABLE>
<BR>
To tell <TT>waisindex</TT> the type of file to be examined, use the <TT>-t</TT> option
followed by the proper type. For example, to index standard ASCII text, you could
use the following command:</P>
<PRE><FONT COLOR="#0066FF">waisindex -t text -r /usr/waisdata/*
</FONT></PRE>
<P>This command indexes all the files in <TT>/usr/waisdata</TT> recursively, assuming
they are all ASCII files.
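<P>If a data directory mixes file types, you may want to choose the <TT>-t</TT> value
file by file. The following is a minimal sketch, not part of freeWAIS: the function
names <TT>wais_type_for</TT> and <TT>wais_index_cmd</TT> and the extension-to-type
mapping are assumptions made for illustration, and the script only prints the command
instead of running it.</P>

```shell
#!/bin/sh
# Sketch only: pick a waisindex -t type from a file's extension.
# The mapping below is an assumption for illustration; the type names
# themselves (ps, GIF, TIFF, text) come from the table in this chapter.
wais_type_for() {
    case "$1" in
        *.ps)                    echo ps ;;
        *.gif|*.GIF)             echo GIF ;;
        *.tif|*.tiff|*.TIFF)     echo TIFF ;;
        *)                       echo text ;;
    esac
}

# Print (do not run) a waisindex command for one file.
wais_index_cmd() {
    echo "waisindex -t $(wais_type_for "$1") $1"
}
```

<P>For example, <TT>wais_index_cmd report.ps</TT> prints a command that uses the
<TT>ps</TT> type, and any unrecognized extension falls back to <TT>text</TT>.</P>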
<DL>
<DT></DT>
</DL>
<DL>
<DD>
<HR>
<A NAME="Heading14"></A><FONT COLOR="#000077"><B>NOTE:</B></FONT> When a document
has been indexed, any changes in the document will not be reflected in the WAIS index
unless a complete reindex is performed. Using the <TT>-a</TT> option does not update
existing index entries. Instead, start the index process again. You should do this
at periodic intervals as a matter of course.
<HR>
</DL>
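<P>One way to decide when a complete reindex is due is to compare the data files
against a timestamp file that you touch after each rebuild. The sketch below assumes
such a stamp file, which is a hypothetical convention and not something freeWAIS
creates; the function names are also made up for this example, and the rebuild command
is only printed, assembled from the options shown in this chapter.</P>

```shell
#!/bin/sh
# Sketch only: the stamp file is a hypothetical convention, not a freeWAIS file.

# Succeeds (exit status 0) when a full reindex looks necessary: the stamp
# file is missing, or some file under the data directory is newer than it.
needs_reindex() {
    data_dir=$1
    stamp=$2
    [ -f "$stamp" ] || return 0
    [ -n "$(find "$data_dir" -type f -newer "$stamp" | head -n 1)" ]
}

# Print (do not run) a complete rebuild command for database $1 over
# the ASCII files in directory $2.
full_reindex_cmd() {
    echo "waisindex -d $1 -t text -r $2/*"
}
```

<P>A script run at periodic intervals could call <TT>needs_reindex</TT>, run the
printed command when it succeeds, and then touch the stamp file.</P>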
<CENTER>
<H4><A NAME="Heading15"></A><FONT COLOR="#000077">Getting Fancy</FONT></H4>
</CENTER>
<P>You can provide some extra features for users of your freeWAIS service in a number
of ways. Although this section is by no means exhaustive, it shows you two
easily implemented features that make a WAIS site more attractive.</P>
<P>To begin, suppose you want to make video, graphics, or audio available on a particular
subject. Suppose, for example, your site deals with musical instruments, and you
have several documents on violins. You may want to provide an audio clip of a violin
being played, a video of the making of a violin body, or a graphic image of a Stradivarius
violin. To make these extra files available, give all the files the
same root filename but different extensions. For example, if your primary document on
violins is called <TT>violins.TEXT</TT>, you may have the following files in the WAIS
directories:<BR>
<TABLE BORDER="0" WIDTH="100%">
<TR>
<TD WIDTH="11%">
<UL>
<LI> 
</UL>
</TD>
<TD WIDTH="20%"><TT>violins.TEXT</TT></TD>
<TD WIDTH="69%">Document describing violins</TD>
</TR>
<TR>
<TD WIDTH="11%">
<UL>
<LI> 
</UL>
</TD>
<TD WIDTH="20%"><TT>violins.TIFF</TT></TD>
<TD WIDTH="69%">Image of a Stradivarius</TD>
</TR>
<TR>
<TD WIDTH="11%">
<UL>
<LI> 
</UL>
</TD>
<TD WIDTH="20%"><TT>violins.MPEG</TT></TD>
<TD WIDTH="69%">Video of the making of a violin body</TD>
</TR>
<TR>
<TD WIDTH="11%">
<UL>
<LI> 
</UL>
</TD>
<TD WIDTH="20%"><TT>violins.MIDI</TT></TD>
<TD WIDTH="69%">MIDI file of a violin being played</TD>
</TR>
</TABLE>
<BR>
All these files should have the same root name (<TT>violins</TT>) but different types
(recognized by <TT>waisindex</TT>). Then, you have to associate the multimedia files
with the document file. You can do this with the following command:</P>
<PRE><FONT COLOR="#0066FF">waisindex -d violin -M TEXT,TIFF,MPEG,MIDI -export /usr/waisdata/violin/*
</FONT></PRE>
<P>This tells <TT>waisindex</TT> that all four types of files are to be handled.
When a user searches for the keyword "violin," all four types of files
will be matched, and options on the browser may let them play, view, or hear the
non-text components.</P>
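<P>Before running the command, it can help to confirm which extensions actually exist
for a given root name. The following sketch builds the comma-separated list passed to
<TT>-M</TT> from the files on disk. The function names <TT>multi_types</TT> and
<TT>multi_index_cmd</TT> are assumptions for this example, the list comes out in
shell glob order (the sketch does not check whether order matters to <TT>waisindex</TT>),
and the command is printed, not executed.</P>

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of freeWAIS.

# Print the comma-separated extensions of the files named ROOT.* in DIR.
# Usage: multi_types DIR ROOT
multi_types() {
    dir=$1
    root=$2
    list=
    for f in "$dir/$root".*; do
        [ -f "$f" ] || continue          # skip the unmatched glob pattern
        ext=${f##*.}                     # text after the last dot
        list=${list:+$list,}$ext
    done
    echo "$list"
}

# Print (do not run) the multimedia indexing command from this chapter.
# Usage: multi_index_cmd DATABASE DIR ROOT
multi_index_cmd() {
    echo "waisindex -d $1 -M $(multi_types "$2" "$3") -export $2/*"
}
```

<P>An empty result from <TT>multi_types</TT> means no file with that root name was
found, which is worth checking before you index.</P>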
<P>Another common feature is the use of synonyms to account for different ways
of specifying a subject. For example, a scientist may use the keyword "feline"
where a non-scientist would use "cat." You want both words to match
the same thing. This is done through a file called <TT>SOURCE.syn</TT>,
which the search engine reads automatically when it runs. The <TT>SOURCE.syn</TT>
file has the format</P>
<PRE><FONT COLOR="#0066FF">word synonym [synonym ...]
</FONT></PRE>
<P>where <TT>word</TT> is the word to be used to search the databases, and each <TT>synonym</TT>
is a word that should match it. For example, if you are dealing with domestic pets
in your WAIS site, you may have the following entries in the <TT>SOURCE.syn</TT>
file:</P>
<PRE><FONT COLOR="#0066FF">cat feline
dog canine hound pooch
bird parrot budgie
</FONT></PRE>
<P>The synonym file can be very useful when people use different terms to refer to
the same thing. An easy way to check whether you need synonyms is to set the logging
option for <TT>waisindex</TT> to <TT>10</TT> for a while, and see what words people
are using on your site. Don't leave it on too long, though, because the logfiles can
become enormous even with a little traffic.
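<P>To check a synonym file by hand, you can look up which search word a given term
maps to. The sketch below reads a file in the <TT>word synonym [synonym ...]</TT>
format described above. It is an illustration of the file format only, under the
assumption that a synonym maps back to the first word on its line; the function name
<TT>canonical_word</TT> is made up, and this is not how the freeWAIS search engine
itself is implemented.</P>

```shell
#!/bin/sh
# Sketch only: canonical_word is a hypothetical helper, not part of freeWAIS.

# Print the search word that TERM maps to in a synonym file whose lines
# have the form "word synonym [synonym ...]". If TERM is not listed,
# print it unchanged.
# Usage: canonical_word SYNFILE TERM
canonical_word() {
    awk -v term="$2" '
        {
            for (i = 1; i <= NF; i++)
                if ($i == term) { print $1; found = 1; exit }
        }
        END { if (!found) print term }
    ' "$1"
}
```

<P>With the pet entries shown earlier, looking up <TT>hound</TT> gives <TT>dog</TT>,
and a term that appears nowhere in the file comes back as is.</P>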
<CENTER>
<H3><A NAME="Heading16"></A><FONT COLOR="#000077">Summary</FONT></H3>
</CENTER>
<P>Now that WAIS is up and running on your server, you can go about the process of
building your index files and letting others access your server. WAIS is quite easy
to manage, and offers a good way of letting other users access your system's documents.
The alternative approach, for text-based systems, is Gopher, which we examine in
the next chapter.
</td>
</tr>
</table>
<!-- begin footer information -->
</body></html>