
				<LI>&#160;



			</UL>



		</TD>



		<TD WIDTH="15%" VALIGN="TOP"><TT>-stop</TT></TD>



		<TD WIDTH="76%" VALIGN="TOP">Indicates a file containing stopwords (words too common to be indexed), usually defined



			in <TT>src/ir/stoplist.c</TT>.</TD>



	</TR>



	<TR>



		<TD WIDTH="9%" VALIGN="TOP">



			<UL>



				<LI>&#160;



			</UL>



		</TD>



		<TD WIDTH="15%" VALIGN="TOP"><TT>-t</TT></TD>



		<TD WIDTH="76%" VALIGN="TOP">Data file type indicator.</TD>



	</TR>



	<TR>



		<TD WIDTH="9%" VALIGN="TOP">



			<UL>



				<LI>&#160;



			</UL>



		</TD>



		<TD WIDTH="15%" VALIGN="TOP"><TT>-T</TT></TD>



		<TD WIDTH="76%" VALIGN="TOP">Sets the type of data to whatever follows.</TD>



	</TR>



</TABLE>



<BR>



The <TT>waisindex</TT> program has to be told the type of information in a file,



otherwise it may not be able to generate an index properly. Many file types are currently



defined with freeWAIS, and you can display them by entering the command</P>



<PRE><FONT COLOR="#0066FF">waisindex



</FONT></PRE>



<P>with no argument. Although many different types are supported by freeWAIS, only



a few are really in common use. The most common file types supported by freeWAIS



are the following:<BR>







<TABLE BORDER="0" WIDTH="100%">



	<TR>



		<TD WIDTH="9%" VALIGN="TOP">



			<UL>



				<LI>&#160;



			</UL>



		</TD>



		<TD WIDTH="15%" VALIGN="TOP"><TT>filename</TT></TD>



		<TD WIDTH="76%" VALIGN="TOP">Same as text, except the filename is used as the headline.</TD>



	</TR>



	<TR>



		<TD WIDTH="9%" VALIGN="TOP">



			<UL>



				<LI>&#160;



			</UL>



		</TD>



		<TD WIDTH="15%" VALIGN="TOP"><TT>first_line</TT></TD>



		<TD WIDTH="76%" VALIGN="TOP">Same as text, except the first line in the file is used as the headline.</TD>



	</TR>



	<TR>



		<TD WIDTH="9%" HEIGHT="27" VALIGN="TOP">



			<UL>



				<LI>&#160;



			</UL>



		</TD>



		<TD WIDTH="15%" HEIGHT="27" VALIGN="TOP"><TT>ftp</TT></TD>



		<TD WIDTH="76%" HEIGHT="27" VALIGN="TOP">Contains FTP code that users can use to retrieve information from another machine.</TD>



	</TR>



	<TR>



		<TD WIDTH="9%" VALIGN="TOP">



			<UL>



				<LI>&#160;



			</UL>



		</TD>



		<TD WIDTH="15%" VALIGN="TOP"><TT>GIF</TT></TD>



		<TD WIDTH="76%" VALIGN="TOP">GIF images, one image per file. The filename is used as the headline.</TD>



	</TR>



	<TR>



		<TD WIDTH="9%" VALIGN="TOP">



			<UL>



				<LI>&#160;



			</UL>



		</TD>



		<TD WIDTH="15%" VALIGN="TOP"><TT>mail_or_rmail</TT></TD>



		<TD WIDTH="76%" VALIGN="TOP">Indexes the <TT>mbox</TT> mailbox contents as individual items.</TD>



	</TR>



	<TR>



		<TD WIDTH="9%" VALIGN="TOP">



			<UL>



				<LI>&#160;



			</UL>



		</TD>



		<TD WIDTH="15%" VALIGN="TOP"><TT>mail_digest</TT></TD>



		<TD WIDTH="76%" VALIGN="TOP">Standard e-mail, indexed as individual messages. The subject field is the headline.</TD>



	</TR>



	<TR>



		<TD WIDTH="9%" VALIGN="TOP">



			<UL>



				<LI>&#160;



			</UL>



		</TD>



		<TD WIDTH="15%" VALIGN="TOP"><TT>netnews</TT></TD>



		<TD WIDTH="76%" VALIGN="TOP">Standard Usenet news, each article a separate item. The subject field is the headline.</TD>



	</TR>



	<TR>



		<TD WIDTH="9%" VALIGN="TOP">



			<UL>



				<LI>&#160;



			</UL>



		</TD>



		<TD WIDTH="15%" VALIGN="TOP"><TT>one_line</TT></TD>



		<TD WIDTH="76%" VALIGN="TOP">Indexes each line in a document separately.</TD>



	</TR>



	<TR>



		<TD WIDTH="9%" VALIGN="TOP">



			<UL>



				<LI>&#160;



			</UL>



		</TD>



		<TD WIDTH="15%" VALIGN="TOP"><TT>PICT</TT></TD>



		<TD WIDTH="76%" VALIGN="TOP">PICT image, one image per file. The filename is used as the headline.</TD>



	</TR>



	<TR>



		<TD WIDTH="9%" VALIGN="TOP">



			<UL>



				<LI>&#160;



			</UL>



		</TD>



		<TD WIDTH="15%" VALIGN="TOP"><TT>ps</TT></TD>



		<TD WIDTH="76%" VALIGN="TOP">A PostScript file with one document per file.</TD>



	</TR>



	<TR>



		<TD WIDTH="9%" VALIGN="TOP">



			<UL>



				<LI>&#160;



			</UL>



		</TD>



		<TD WIDTH="15%" VALIGN="TOP"><TT>text</TT></TD>



		<TD WIDTH="76%" VALIGN="TOP">Indexes the file as one document, with the pathname as the headline.</TD>



	</TR>



	<TR>



		<TD WIDTH="9%" VALIGN="TOP">



			<UL>



				<LI>&#160;



			</UL>



		</TD>



		<TD WIDTH="15%" VALIGN="TOP"><TT>TIFF</TT></TD>



		<TD WIDTH="76%" VALIGN="TOP">TIFF image, one image per file. The filename is used as the headline.</TD>



	</TR>



</TABLE>



<BR>



To tell <TT>waisindex</TT> the type of file to be examined, use the <TT>-t</TT> option



followed by the proper type. For example, to index standard ASCII text, you could



use the following command:</P>



<PRE><FONT COLOR="#0066FF">waisindex -t text -r /usr/waisdata/*



</FONT></PRE>



<P>This command indexes all the files in <TT>/usr/waisdata</TT> recursively, assuming



they are all ASCII files.</P>
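<P>Because each run of <TT>waisindex</TT> is given a single type with <TT>-t</TT>, a directory
that mixes file types may need to be sorted out first. The following Python sketch groups
files by a guessed type and prints one command line per type. The type names come from the
table above; the extension mapping and the function names are assumptions made for this
illustration, and the commands are only printed, never executed.</P>

```python
import shlex
from pathlib import PurePosixPath

# Type names are taken from the table above; the extension mapping is an
# assumption made for this sketch, not something waisindex defines.
EXTENSION_TYPES = {
    ".txt": "text",
    ".ps": "ps",
    ".gif": "GIF",
    ".tiff": "TIFF",
    ".pict": "PICT",
}


def guess_type(filename, default="text"):
    """Return the type name to pass to -t, based on the file extension."""
    suffix = PurePosixPath(filename).suffix.lower()
    return EXTENSION_TYPES.get(suffix, default)


def commands_by_type(filenames):
    """Group files by guessed type and build one command line per type."""
    groups = {}
    for name in filenames:
        groups.setdefault(guess_type(name), []).append(name)
    commands = []
    for type_name in sorted(groups):
        args = ["waisindex", "-t", type_name] + groups[type_name]
        commands.append(" ".join(shlex.quote(arg) for arg in args))
    return commands


files = [
    "/usr/waisdata/intro.txt",
    "/usr/waisdata/manual.ps",
    "/usr/waisdata/notes.txt",
]
for command in commands_by_type(files):
    print(command)  # printed only; nothing is executed
```

<P>How the resulting runs are combined into a single index is left out of this sketch.</P>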







<DL>



	<DT></DT>



</DL>











<DL>



	<DD>



<HR>



<A NAME="Heading14"></A><FONT COLOR="#000077"><B>NOTE:</B></FONT> When a document



	has been indexed, any changes in the document will not be reflected in the WAIS index



	unless a complete reindex is performed. Using the <TT>-a</TT> option does not update



	existing index entries. Instead, start the index process again. You should do this



	at periodic intervals as a matter of course. 



<HR>







</DL>
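<P>One way to decide when a reindex is due is to compare the modification times of your
documents against the time of the last index run. The following Python sketch does this
under stated assumptions: <TT>index_file</TT> is a hypothetical path to any file that is
rewritten each time you rebuild the index, and the function names are invented for this
example. It only reports whether something has changed; it does not run <TT>waisindex</TT>.</P>

```python
import os


def changed_since(paths, index_time):
    """Return the paths whose modification time is later than index_time."""
    return [path for path in paths if os.path.getmtime(path) > index_time]


def needs_reindex(data_dir, index_file):
    """Return True if any file under data_dir is newer than index_file.

    index_file is an assumption of this sketch: any file that is rewritten
    whenever the index is rebuilt will do as a timestamp.
    """
    index_time = os.path.getmtime(index_file)
    documents = []
    for root, _dirs, names in os.walk(data_dir):
        documents.extend(os.path.join(root, name) for name in names)
    return bool(changed_since(documents, index_time))
```

<P>A check like this could be run from a scheduled job, with a full reindex started
whenever it returns true.</P>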







<CENTER>



<H4><A NAME="Heading15"></A><FONT COLOR="#000077">Getting Fancy</FONT></H4>



</CENTER>



<P>You can provide some extra features for users of your freeWAIS service in a number



of ways. Although this section is not exhaustive by any means, it shows you two of



the easily implementable features that make a WAIS site more attractive.</P>



<P>To begin, suppose you want to make video, graphics, or audio available on a particular



subject. Suppose, for example, your site deals with musical instruments, and you



have several documents on violins. You may want to provide an audio clip of a violin



being played, a video of the making of a violin body, or a graphic image of a Stradivarius



violin. To make these extra files available, you should have all the files with the



same filename but different extensions. For example, if your primary document on



violins is called <TT>violins.TEXT</TT>, you may have the following files in the WAIS



directories:<BR>







<TABLE BORDER="0" WIDTH="100%">



	<TR>



		<TD WIDTH="11%">



			<UL>



				<LI>&#160;



			</UL>



		</TD>



		<TD WIDTH="20%"><TT>violins.TEXT</TT></TD>



		<TD WIDTH="69%">Document describing violins</TD>



	</TR>



	<TR>



		<TD WIDTH="11%">



			<UL>



				<LI>&#160;



			</UL>



		</TD>



		<TD WIDTH="20%"><TT>violins.TIFF</TT></TD>



		<TD WIDTH="69%">Image of a Stradivarius</TD>



	</TR>



	<TR>



		<TD WIDTH="11%">



			<UL>



				<LI>&#160;



			</UL>



		</TD>



		<TD WIDTH="20%"><TT>violins.MPEG</TT></TD>



		<TD WIDTH="69%">Video of the making of a violin body</TD>



	</TR>



	<TR>



		<TD WIDTH="11%">



			<UL>



				<LI>&#160;



			</UL>



		</TD>



		<TD WIDTH="20%"><TT>violins.MIDI</TT></TD>



		<TD WIDTH="69%">MIDI file of a violin being played</TD>



	</TR>



</TABLE>



<BR>



All these files should have the same root name (<TT>violins</TT>) but different types



(recognized by <TT>waisindex</TT>). Then, you have to associate the multimedia files



with the document file. You can do this with the following command:</P>



<PRE><FONT COLOR="#0066FF">waisindex -d violin -M TEXT,TIFF,MPEG,MIDI -export /usr/waisdata/violin/*



</FONT></PRE>



<P>This tells <TT>waisindex</TT> that all four types of files are to be handled.



When a user searches for the keyword &quot;violin,&quot; all four types of files



will be matched, and options on the browser may let them play, view, or hear the



non-text components.</P>
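<P>Before running the command, it can help to check which files actually share a root
name. The following Python sketch groups filenames by root and builds the comma-separated
list that is passed to <TT>-M</TT>. The function names are hypothetical, and the sketch
assumes the extension of each file is exactly the type name you want to list.</P>

```python
from pathlib import PurePosixPath


def group_by_root(filenames):
    """Map each root name to its extensions, in first-seen order."""
    groups = {}
    for name in filenames:
        path = PurePosixPath(name)
        extensions = groups.setdefault(path.stem, [])
        extension = path.suffix.lstrip(".")
        if extension not in extensions:
            extensions.append(extension)
    return groups


def multitype_argument(filenames, root):
    """Build the comma-separated type list for one root name."""
    return ",".join(group_by_root(filenames)[root])


files = [
    "violins.TEXT",
    "violins.TIFF",
    "violins.MPEG",
    "violins.MIDI",
    "cellos.TEXT",
]
print(multitype_argument(files, "violins"))  # TEXT,TIFF,MPEG,MIDI
```

<P>A root name that maps to only one extension, such as <TT>cellos</TT> here, has no
multimedia files associated with it.</P>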



<P>Another common feature is the use of synonyms to account for different methods



of specifying a subject. For example, a scientist may use the keyword &quot;feline&quot;



when a non-scientist may use &quot;cat.&quot; You want to be able to match these



two words to the same thing. This is done through a file called <TT>SOURCE.syn</TT>,



which is automatically read by the search engine when it is working. The <TT>SOURCE.syn</TT>



file has the format</P>



<PRE><FONT COLOR="#0066FF">word   synonym [synonym ...]



</FONT></PRE>



<P>where <TT>word</TT> is the word to be used to search the databases, and <TT>synonym</TT> is the



word(s) that should match it. For example, if you are dealing with domestic pets



in your WAIS site, you may have the following entries in the <TT>SOURCE.syn</TT>



file:</P>



<PRE><FONT COLOR="#0066FF">cat    feline 



dog    canine hound pooch



bird   parrot budgie 



</FONT></PRE>
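<P>To make the file format concrete, the following Python sketch parses entries like the
ones above and maps each synonym back to its search word. It illustrates the format only;
how the freeWAIS search engine applies the file internally is not shown here, and the
function names are invented for this example.</P>

```python
def parse_synonyms(text):
    """Parse 'word synonym [synonym ...]' lines into a synonym-to-word table."""
    table = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank lines and words listed without synonyms
        word, synonyms = fields[0], fields[1:]
        for synonym in synonyms:
            table[synonym] = word
    return table


def normalize_query(query, table):
    """Replace each synonym in a query with the word it stands for."""
    return " ".join(table.get(term, term) for term in query.lower().split())


SOURCE_SYN = """\
cat    feline
dog    canine hound pooch
bird   parrot budgie
"""

table = parse_synonyms(SOURCE_SYN)
print(normalize_query("feline diseases", table))  # cat diseases
```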



<P>The synonym file can be very useful when people use different terms to refer to



the same thing. An easy way to check for the need for synonyms is to set the logging



option for <TT>waisindex</TT> to <TT>10</TT> for a while, and see what words people



are using on your site. Don't keep it on too long, though, because the logfiles can



become enormous with a little traffic.</P>
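<P>Once you have collected search words from the logfile, a short script can point out
candidates for new synonym entries. The following Python sketch assumes the words have
already been extracted into a list (the logfile format is not described here, so that step
is left out) and reports repeated words that your synonym file does not yet cover. The
function name and the sample data are hypothetical.</P>

```python
from collections import Counter


def frequent_unknown_terms(terms, known_words, minimum=2):
    """Return (term, count) pairs for repeated terms missing from known_words."""
    counts = Counter(term.lower() for term in terms)
    unknown = [
        (term, count)
        for term, count in counts.items()
        if count >= minimum and term not in known_words
    ]
    # Most frequent first; ties are broken alphabetically.
    return sorted(unknown, key=lambda pair: (-pair[1], pair[0]))


# Sample data invented for this sketch.
logged_terms = ["kitty", "cat", "kitty", "puppy", "dog", "puppy", "kitty", "budgie"]
known = {"cat", "feline", "dog", "canine", "hound", "pooch", "bird", "parrot", "budgie"}
print(frequent_unknown_terms(logged_terms, known))  # [('kitty', 3), ('puppy', 2)]
```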



<CENTER>



<H3><A NAME="Heading16"></A><FONT COLOR="#000077">Summary</FONT></H3>



</CENTER>



<P>Now that WAIS is up and running on your server, you can go about the process of



building your index files and letting others access your server. WAIS is quite easy



to manage, and offers a good way of letting other users access your system's documents.



The alternative approach, for text-based systems, is Gopher, which we examine in



the next chapter.</P>



















</td>
</tr>
</table>

<!-- begin footer information -->



</body></html>
