

can be handled by a normal user. If you intend to use port 210, you don't have to



specify the number in the command line, although the <TT>-p</TT> option still must



be used.</P>



<P>If you want to let <TT>inetd</TT> handle the <TT>waisserver</TT> startup, you



need to ensure that the file <TT>/etc/services</TT> has an entry for WAIS. The line



in the <TT>/etc/services</TT> file looks like</P>



<PRE><FONT COLOR="#0066FF">z3950   210/tcp   #WAIS



</FONT></PRE>



<P>where <TT>210</TT> is the port number WAIS uses, and <TT>tcp</TT> is the protocol.



After modifying or verifying the entry in <TT>/etc/services</TT>, you need to add



a WAIS entry to the <TT>inetd.conf</TT> file to start up <TT>waisserver</TT> whenever



a request is received on port 210 (or whatever other port you are using). The entry



looks like</P>



<PRE><FONT COLOR="#0066FF">z3950   stream   tcp   nowait   root   /usr/local/bin/waisserver   waisserver.d -u username -d /usr/wais/wais_index



</FONT></PRE>



<P>where the options are the same as for the command-line startup mentioned previously.
The daemon name <TT>waisserver.d</TT> is used instead of <TT>waisserver</TT> when starting up in <TT>inetd</TT> mode.
Again, you can use the <TT>-e</TT> option to log activity
to a file.</P>
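<P>As a quick sanity check, the two entries above can be split into their fields. The following Python sketch is illustrative only: the function names <TT>parse_services_line</TT> and <TT>parse_inetd_line</TT> are hypothetical, and the sample strings repeat the entries from this section, assuming the usual <TT>inetd.conf</TT> field order (service, socket type, protocol, wait flag, user, server program, then the program name and its options).</P>

```python
def parse_services_line(line):
    """Split an /etc/services line into (name, port, protocol)."""
    # Drop any trailing comment, then split on whitespace.
    fields = line.split("#", 1)[0].split()
    name, port_proto = fields[0], fields[1]
    port, proto = port_proto.split("/")
    return name, int(port), proto


def parse_inetd_line(line):
    """Split an inetd.conf line into its named fields."""
    fields = line.split()
    return {
        "service": fields[0],
        "socket_type": fields[1],
        "protocol": fields[2],
        "wait": fields[3],
        "user": fields[4],
        "server": fields[5],
        "args": fields[6:],  # program name (argv[0]) followed by its options
    }


services = parse_services_line("z3950   210/tcp   #WAIS")
inetd = parse_inetd_line(
    "z3950 stream tcp nowait root /usr/local/bin/waisserver "
    "waisserver.d -u username -d /usr/wais/wais_index"
)

# The service name must be the same in both files so inetd can find the port.
print(services, inetd["service"] == services[0])
```

<P>If the service names in the two files differ, <TT>inetd</TT> has no way to map the entry to port 210, so comparing them is a cheap first test.</P>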



<CENTER>



<H3><A NAME="Heading10"></A><FONT COLOR="#000077">Building Your WAIS Indexes</FONT></H3>



</CENTER>



<P>Once you have the freeWAIS server ready to run and everything seems to be working,



it's time to provide some content for your WAIS system. Usually, documents are the



primary source of information for WAIS, although you can index any type of file.



The key step to providing WAIS service is to build the WAIS index using the <TT>waisindex</TT>



command. The <TT>waisindex</TT> command can be a bit obscure at times, but a little



practice and some trial-and-error fiddling will help you master its somewhat awkward



behavior.</P>



<P>The <TT>waisindex</TT> program works by examining all the data in the files for
which you want to create an index. From its examination, <TT>waisindex</TT> usually
generates seven different index files (depending on the content and your commands).
Between them, these files record the unique words found in the documents. The different index files
are then combined into one large database, often called the &quot;source&quot; (or
&quot;WAIS source&quot;). Whenever a WAIS client submits a search, the search
strings are compared to the source, and the results are displayed with a relevance ranking
(the match score).</P>
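<P>The idea of extracting keywords ahead of time can be illustrated with a toy inverted index. This Python sketch is not how <TT>waisindex</TT> is implemented; the functions <TT>build_index</TT> and <TT>search</TT>, the sample documents, and the simple count-based score are all assumptions made for illustration.</P>

```python
import re
from collections import defaultdict


def build_index(documents):
    """Map each unique lowercase word to {document name: occurrence count}."""
    index = defaultdict(dict)
    for name, text in documents.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word][name] = index[word].get(name, 0) + 1
    return index


def search(index, query):
    """Score each document by how often the query words occur in it."""
    scores = defaultdict(int)
    for word in re.findall(r"[a-z0-9]+", query.lower()):
        for name, count in index.get(word, {}).items():
            scores[name] += count
    # Highest score first; ties are broken by name for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


docs = {
    "intro.txt": "Linux is a free operating system kernel.",
    "wais.txt": "WAIS indexes documents. A WAIS server answers searches.",
}
index = build_index(docs)
results = search(index, "wais server")
print(results)
```

<P>Because the word list is built once, a query only has to look up its own words rather than rescan every document.</P>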


















<DL>



	<DD>



<HR>



<A NAME="Heading11"></A><FONT COLOR="#000077"><B>NOTE: </B></FONT>The



	use of <TT>waisindex</TT> enables a client search to proceed much more quickly because



	the keywords in the data files have already been extracted. However, the mass of



	data in the index files can be sizable, so allow plenty of disk space for a WAIS



	server to work with. (For a typical WAIS site, assume at least double the amount



	of room needed for the source files.) 



<HR>







</DL>
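<P>The disk-space rule of thumb in the note can be turned into a small planning helper. This Python sketch is only an estimate under one reading of the note (reserve at least twice the size of the source files in total); the function names <TT>source_bytes</TT> and <TT>minimum_disk_budget</TT> are hypothetical.</P>

```python
import os


def source_bytes(directory):
    """Total size in bytes of all regular files under a directory."""
    total = 0
    for root, _dirs, files in os.walk(directory):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


def minimum_disk_budget(source_size, factor=2):
    """Space to reserve for source files plus indexes (rule of thumb)."""
    return source_size * factor


# Example: 150 MB of documents suggests reserving at least 300 MB.
print(minimum_disk_budget(150 * 1024 * 1024) // (1024 * 1024), "MB")
```

<P>Treat the result as a lower bound; the actual index size depends on the documents and on the <TT>waisindex</TT> options used.</P>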







<CENTER>



<H4><A NAME="Heading12"></A><FONT COLOR="#000077">WAIS Index Files</FONT></H4>



</CENTER>



<P>The freeWAIS index files are not usually human-readable (although one
or two of them can be read with some success). Usually, <TT>waisindex</TT> creates
seven index files, although the number may vary depending on requirements. Each index
file has a specific file extension to show its purpose, added to a root name (specified
on the <TT>waisindex</TT> command line, or defaulting to <TT>index</TT>). The index
files and their purposes are listed here:</P>

<P><TT>index.doc</TT> A document file that
contains a table with the filename, a headline (title) from the file, the location
of the first and last characters of an entry, the length of the document, the number
of lines in the document, and the time and date the document was created.</P>



<P><TT>index.dct</TT> A dictionary file that contains a list of every unique word



in the files cross-indexed to the inverted file.</P>



<P><TT>index.fn</TT> A filename file that contains a table with a list of the filenames,



the date they were created in the index, and the type of file.</P>



<P><TT>index.hl</TT> A headline file that contains a table of all headlines (titles).



The headline is displayed in the search output when a match occurs.</P>



<P><TT>index.inv</TT> An inverted file that contains a table associating every unique



word in all the files with a pointer to the files themselves and the word's importance



(determined by how close the word is to the start of the file, the number of times



the word occurs in the document, and the percentage of times the word appears in



the document).</P>



<P><TT>index.src</TT> A source description file that contains descriptions of the
information indexed, including the host name and IP address, the port watched by
WAIS, the source filename, any cost information for the service, the headline of
the service, a description of the source, and the e-mail address of the administrator.
The source description file can be edited with any ASCII text editor. We will look at this file
in a little more detail shortly.</P>











<BLOCKQUOTE>



	<P><TT>index.status</TT> A status file containing user-defined information.







</BLOCKQUOTE>
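<P>Since every index file shares a root name and differs only in its extension, it is easy to check whether a full set is present. The Python sketch below uses the seven extensions listed above; the helper names <TT>index_file_names</TT> and <TT>missing_index_files</TT> are hypothetical, and the check is only an illustration because the number of files can vary.</P>

```python
# Extensions taken from the list above; the root name defaults to "index".
INDEX_EXTENSIONS = ("doc", "dct", "fn", "hl", "inv", "src", "status")


def index_file_names(root="index"):
    """Return the index filenames expected for a given root name."""
    return [f"{root}.{ext}" for ext in INDEX_EXTENSIONS]


def missing_index_files(root, existing):
    """Return expected index files that are absent from `existing`."""
    present = set(existing)
    return [name for name in index_file_names(root) if name not in present]


print(index_file_names("/usr/wais/foo"))
print(missing_index_files("index", ["index.doc", "index.src"]))
```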







<P>The source description file is a standard ASCII file that is read by <TT>waisindex</TT>



at intervals to see if information has changed. If the changes are significant, <TT>waisindex</TT>



updates its internal information. A sample source file looks like this:</P>



<PRE><FONT COLOR="#0066FF">(:source
   :version 2
   :ip-address &quot;147.120.0.10&quot;
   :ip-name &quot;wizard.tpci.com&quot;
   :tcp-port 210
   :database-name &quot;Linux stuff&quot;
   :cost 0.00
   :cost-unit :free
   :maintainer &quot;wais_help@tpci.com&quot;
   :subjects &quot;Everything you need to know about Linux&quot;
   :description &quot;If you need to know something about Linux, it's here.&quot;
)



</FONT></PRE>



<P>You should edit this file when you set up freeWAIS because the default descriptions
are sparse and tell users very little.</P>
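<P>Because the source description is plain text made of keyword and value pairs, it can be read with a few lines of code, for example to confirm the port and maintainer before publishing it. This Python sketch is an assumption-laden illustration rather than a full parser: the function <TT>parse_source</TT> is hypothetical, the sample repeats the fields shown above with a closing parenthesis assumed, and multi-line values are not handled.</P>

```python
SAMPLE = '''(:source
   :version 2
   :ip-address "147.120.0.10"
   :ip-name "wizard.tpci.com"
   :tcp-port 210
   :database-name "Linux stuff"
   :cost 0.00
   :cost-unit :free
   :maintainer "wais_help@tpci.com"
   :description "If you need to know something about Linux, it's here."
)'''


def parse_source(text):
    """Collect the ':key value' pairs of a source description into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip the opening "(:source" line, the closing ")" and blank lines.
        if not line.startswith(":"):
            continue
        key, _, value = line.partition(" ")
        # Tolerate a stray trailing colon on the key, then drop the quotes.
        fields[key.strip(":")] = value.strip().strip('"')
    return fields


info = parse_source(SAMPLE)
print(info["ip-name"], info["tcp-port"])
```

<P>All values are returned as strings, so convert <TT>tcp-port</TT> with <TT>int()</TT> if you need to compare it with the port in <TT>/etc/services</TT>.</P>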



<CENTER>



<H4><A NAME="Heading13"></A><FONT COLOR="#000077">The waisindex Command</FONT></H4>



</CENTER>



<P>The <TT>waisindex</TT> command accepts a number of options, some of which you have



seen earlier in this chapter. The following list contains the primary <TT>waisindex</TT>



options of interest to most users:<BR>







<TABLE BORDER="0" WIDTH="100%">



	<TR>



		<TD WIDTH="9%" VALIGN="TOP">



			<UL>



				<LI>&#160;



			</UL>



		</TD>



		<TD WIDTH="15%" VALIGN="TOP"><TT>-a</TT></TD>



		<TD WIDTH="76%" VALIGN="TOP">Appends data to an existing index file (used to update index files instead of regenerating



			them each time a new document is added).</TD>



	</TR>



	<TR>



		<TD WIDTH="9%" VALIGN="TOP">



			<UL>



				<LI>&#160;



			</UL>



		</TD>



		<TD WIDTH="15%" VALIGN="TOP"><TT>-contents</TT></TD>



		<TD WIDTH="76%" VALIGN="TOP">Indexes the file contents (default action).</TD>



	</TR>



	<TR>



		<TD WIDTH="9%" HEIGHT="27" VALIGN="TOP">



			<UL>



				<LI>&#160;



			</UL>



		</TD>



		<TD WIDTH="15%" HEIGHT="27" VALIGN="TOP"><TT>-d</TT></TD>



		<TD WIDTH="76%" HEIGHT="27" VALIGN="TOP">Gives the filename root for index files (for example, <TT>-d /usr/wais/foo</TT> names
			all index files as <TT>/usr/wais/foo.</TT>xxx).</TD>



	</TR>



	<TR>



		<TD WIDTH="9%" VALIGN="TOP">



			<UL>



				<LI>&#160;



			</UL>



		</TD>



		<TD WIDTH="15%" VALIGN="TOP"><TT>-e</TT></TD>



		<TD WIDTH="76%" VALIGN="TOP">Gives the name of the log file for error information (default is <TT>stderr</TT>--usually



			the console--although you can specify <TT>-s</TT> for <TT>/dev/null</TT>).</TD>



	</TR>



	<TR>



		<TD WIDTH="9%" VALIGN="TOP">



			<UL>



				<LI>&#160;



			</UL>



		</TD>



		<TD WIDTH="15%" VALIGN="TOP"><TT>-export</TT></TD>



		<TD WIDTH="76%" VALIGN="TOP">Adds the host name and TCP port to descriptions for easier Internet access.</TD>



	</TR>



	<TR>



		<TD WIDTH="9%" VALIGN="TOP">



			<UL>



				<LI>&#160;



			</UL>



		</TD>



		<TD WIDTH="15%" VALIGN="TOP"><TT>-l</TT></TD>



		<TD WIDTH="76%" VALIGN="TOP">Gives the level of log messages. Valid values are <TT>0</TT>--no log, <TT>1</TT>--log



			only high priority errors and warnings, <TT>5</TT>--log medium priority errors and



			warnings, as well as index filename information, and <TT>10</TT>--log every event.</TD>



	</TR>



	<TR>



		<TD WIDTH="9%" VALIGN="TOP">



			<UL>



				<LI>&#160;



			</UL>



		</TD>



		<TD WIDTH="15%" VALIGN="TOP"><TT>-M</TT></TD>



		<TD WIDTH="76%" VALIGN="TOP">Links multiple types of files.</TD>



	</TR>



	<TR>



		<TD WIDTH="9%" VALIGN="TOP">



			<UL>



				<LI>&#160;



			</UL>



		</TD>



		<TD WIDTH="15%" VALIGN="TOP"><TT>-mem</TT></TD>



		<TD WIDTH="76%" VALIGN="TOP">Limits memory usage during indexing (the higher the number specified, the faster



			the indexing process and the more memory used).</TD>



	</TR>



	<TR>



		<TD WIDTH="9%" VALIGN="TOP">



			<UL>



				<LI>&#160;



			</UL>



		</TD>



		<TD WIDTH="15%" VALIGN="TOP"><TT>-nocontents</TT></TD>



		<TD WIDTH="76%" VALIGN="TOP">Prevents the contents of a file from being indexed (indexes only the document header and filename).</TD>



	</TR>



	<TR>



		<TD WIDTH="9%" VALIGN="TOP">



			<UL>



				<LI>&#160;



			</UL>



		</TD>



		<TD WIDTH="15%" VALIGN="TOP"><TT>-nopairs</TT></TD>



		<TD WIDTH="76%" VALIGN="TOP">Instructs <TT>waisindex</TT> not to index adjacent capitalized words
			together.</TD>



	</TR>



	<TR>



		<TD WIDTH="9%" VALIGN="TOP">



			<UL>



				<LI>&#160;



			</UL>



		</TD>



		<TD WIDTH="15%" VALIGN="TOP"><TT>-nopos</TT></TD>



		<TD WIDTH="76%" VALIGN="TOP">Ignores the location of keywords in a document when determining scores.</TD>



	</TR>



	<TR>



		<TD WIDTH="9%" VALIGN="TOP">



			<UL>



				<LI>&#160;



			</UL>



		</TD>



		<TD WIDTH="15%" VALIGN="TOP"><TT>-pairs</TT></TD>



		<TD WIDTH="76%" VALIGN="TOP">Indexes adjacent capitalized words as a single entry.</TD>



	</TR>



	<TR>



		<TD WIDTH="9%" VALIGN="TOP">



			<UL>



				<LI>&#160;



			</UL>



		</TD>



		<TD WIDTH="15%" VALIGN="TOP"><TT>-pos</TT></TD>



		<TD WIDTH="76%" VALIGN="TOP">Determines scores based on locations of keywords (proximity of keywords increases



			scores).</TD>



	</TR>



	<TR>



		<TD WIDTH="9%" VALIGN="TOP">



			<UL>



				<LI>&#160;



			</UL>



		</TD>



		<TD WIDTH="15%" VALIGN="TOP"><TT>-r</TT></TD>



		<TD WIDTH="76%" VALIGN="TOP">Recursive subdirectory indexing.</TD>



	</TR>



	<TR>



		<TD WIDTH="9%" VALIGN="TOP">



			<UL>



				<LI>&#160;



			</UL>



		</TD>



		<TD WIDTH="15%" VALIGN="TOP"><TT>-register</TT></TD>



		<TD WIDTH="76%" VALIGN="TOP">Registers your indexes with the WAIS Directory of Services.</TD>



	</TR>



	<TR>



		<TD WIDTH="9%" VALIGN="TOP">



			<UL>



				<LI>&#160;



			</UL>



		</TD>



		<TD WIDTH="15%" VALIGN="TOP"><TT>-stdin</TT></TD>



		<TD WIDTH="76%" VALIGN="TOP">Reads filenames from standard input (by default, the keyboard) instead of from the command line.</TD>



	</TR>



	<TR>



		<TD WIDTH="9%" VALIGN="TOP">



			<UL>
