</FONT>







<TD VALIGN=top  BGCOLOR=#80FFFF ><FONT COLOR=#000080>







Sample indexer and service scripts</FONT>







</TABLE><P>Once you have fine-tuned the configuration file information, you can compile the freeWAIS source with the make command:







<BR>







<BR>







<PRE>
<FONT COLOR="#000080">make default</FONT></PRE>







<P>By default, the make utility compiles two clients called swais and waisq. If you want to compile an X version of WAIS called xwais (useful if you want to allow access from X terminals or consoles), uncomment the line in the Makefile that ends with makex.







<BR>







<BR>







<A NAME="E68E233"></A>







<H3 ALIGN=CENTER>







<CENTER>







<FONT SIZE=5 COLOR="#FF0000"><B>Setting Up freeWAIS</B></FONT></CENTER></H3>







<BR>







<P>When you have the compiled freeWAIS components installed and configured properly, you can begin setting up the WAIS index files for the documents available on your system. Start by creating an index directory, whose default name is wsindex. The directory usually resides just under the root of the filesystem (/wsindex), but many administrators like to keep it in a reserved area for the WAIS software (such as /usr/wais/wsindex). If the index files are difficult to locate, problems can result when users try to find them.







<BR>







<P>The wais-test directory created when you installed freeWAIS contains a script called test.waisindex that creates four WAIS index files for you automatically. You use these files to test the WAIS installation for proper functionality. They can also show you how to use the different search and index capabilities of freeWAIS. The following are the four index files:







<BR>







<UL>







<LI>The test-BOOL file is an index of three example documents that use the Boolean capabilities and synonyms.







<BR>







<BR>







<LI>The test-Comp file is an index that demonstrates compressed source file handling.







<BR>







<BR>







<LI>The test-Docs file is an index of files in the /doc directory that shows a recursive directory search.







<BR>







<BR>







<LI>The test-Multi file is an index of GIF images and demonstrates multidocument capabilities.







<BR>







<BR>







</UL>







<P>Only graphically based browsers (usually X-based) can handle the multidocument formats, although any type of browser should be able to handle the other three index formats.







<BR>







<P>Once you have verified that the indexing system works properly and all the components of freeWAIS are properly installed, you need to build an index file for the documents available on your system. You can do this with the waisindex command. The waisindex command enables you to index files in two ways by using the -t option followed by one of these keywords:







<BR>







<UL>







<LI>The one_line keyword indexes each line of a document so a match can show the exact line the match occurred in.







<BR>







<BR>







<LI>The text keyword indexes whole documents so a match shows the entire document with no indication of the exact line the match occurred in. This option is the default.







<BR>







<BR>







</UL>
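<P>The difference between the two keywords can be pictured with a rough analogy to grep. This sketch only illustrates the two reporting granularities and is not how waisindex works internally; the scratch directory, file names, and search word are made up for the demonstration.

```shell
#!/bin/sh
# Analogy only: grep stands in for a WAIS search to show the two granularities.
# The files and the search word below are hypothetical sample data.
demo_dir=$(mktemp -d)
printf 'price list\nWAIS pricing notes\n' > "$demo_dir/a.txt"
printf 'nothing relevant here\n' > "$demo_dir/b.txt"

# Like "-t one_line": each hit identifies the exact line that matched.
line_hits() { grep -n "$1" "$2"/*.txt; }

# Like "-t text": each hit identifies only the whole document.
doc_hits() { grep -l "$1" "$2"/*.txt; }

line_hits WAIS "$demo_dir"   # prints <dir>/a.txt:2:WAIS pricing notes
doc_hits WAIS "$demo_dir"    # prints <dir>/a.txt
```

<P>The first form is the better fit for reference material made of short, independent lines; the second suits documents that are read as a whole.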







<P>The waisindex command takes arguments for the name of the destination index file (-d followed by the filename), and the directory or files to be indexed. For example, to index a directory called /usr/sales/sales_lit into a destination index file called sales using the one_line indexing approach, you would issue the following command:







<BR>







<BR>







<PRE>
<FONT COLOR="#000080">waisindex -d sales -t one_line /usr/sales/sales_lit</FONT></PRE>







<P>Because no path is provided for the sales index file in this example, it would be stored in the current directory.
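<P>To keep the index out of the current directory, you can give -d a full path. The following is a minimal sketch, assuming the index directory is /usr/wais/wsindex as suggested earlier; build_index_cmd is a hypothetical helper that only prints the command so you can review it before running it.

```shell
#!/bin/sh
# Hypothetical helper: prints (does not run) a waisindex command line that
# stores the index under an explicit directory instead of the current one.
# Arguments: index directory, index name, indexing type, directory to index.
build_index_cmd() {
    index_dir=$1; name=$2; type=$3; docs=$4
    printf 'waisindex -d %s/%s -t %s %s\n' "$index_dir" "$name" "$type" "$docs"
}

# /usr/wais/wsindex is assumed to be the index directory chosen earlier.
build_index_cmd /usr/wais/wsindex sales one_line /usr/sales/sales_lit
```

<P>Printing the command first makes it easy to confirm where the index files will land before any of them are written.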







<BR>







<P>Once you have started the WAIS server software (see &quot;Starting freeWAIS&quot; below), you can test newly created indexes. To test the indexes, use the waissearch command. For example, to look for the word &quot;WAIS&quot; in the index files, issue the following command:







<BR>







<BR>







<PRE>
<FONT COLOR="#000080">waissearch -p 210 -d index_file WAIS</FONT></PRE>







<P>In this example, -p gives the port number (the default value is 210), and -d is the path to the index file. If the search is successful (and something matches), you see messages about the number of records returned and the score of each match. If you see error messages or no output at all, check the configuration information and the index files.







<BR>







<P>A final step you can take if you want Internet users to be able to access your freeWAIS system is to issue the following command:







<BR>







<BR>







<PRE>
<FONT COLOR="#000080">waisindex -export -register Filenames</FONT></PRE>







<P>In this example, Filenames is the name of the index. This name is registered with the Directory of Servers at cnidr.org and quake.think.com. These addresses are reached automatically with the -register option. Do this step only if you want all Internet users to be able to access your WAIS service. (See the section &quot;The waisindex Command&quot; for more information on the waisindex command.)







<BR>







<P>If you want to allow clients to connect to your freeWAIS system with a WWW browser (such as Mosaic or Netscape) and access HTML sources on your system through WAIS, you must issue the following command:







<BR>







<BR>







<PRE>
<FONT COLOR="#000080">waisindex -d WWW -T HTML -contents -export /usr/resources/*html</FONT></PRE>







<P>This line enables WAIS clients to perform keyword searches on HTML documents as well.







<BR>







<P>If you want, you can set WAIS to allow only certain domains to connect to it. You can do this in the ir.h file, which has a line like the following:







<BR>







<BR>







<PRE>
<FONT COLOR="#000080">#define SERVSECURITYFILE &quot;SERV_SEC&quot;</FONT></PRE>







<P>This line is commented out by default. Remove the comment symbol. You have to place a copy of an existing SERV_SEC file or one you create yourself in the same directory as the WAIS index files. If there is no SERV_SEC file accessible to WAIS, all domains are allowed access. (You can change the name of the file, of course, as long as the entry in ir.h matches the filename with quotation marks around it.)







<BR>







<P>Each ASCII entry in the SERV_SEC file follows a strict format for defining the domains that are granted access to WAIS. The format of each line is as follows:







<BR>







<BR>







<PRE>
<FONT COLOR="#000080">domain [IP address]</FONT></PRE>







<P>Each line has the domain name of the host to which you want to grant access, with its IP address as an optional addition to the line. If the domain name and IP address do not match, it doesn't matter because WAIS allows access on a match of either the name or the address. A sample SERV_SEC file looks like this:







<BR>







<PRE>
<FONT COLOR="#000080">chatton.com
roy.sailing.org
bighost.bignet.com</FONT></PRE>







<P>Each of these three domain names can access WAIS, but any connection from a host without these domain names is refused.







<BR>







<P>The SERV_SEC file should be owned by, and readable by, the login name and group that freeWAIS runs under (freeWAIS should not run as root, to avoid security problems), and it should be modifiable only by root. In other words, if freeWAIS runs under the login waismgr, all the files should be owned by the user waismgr and that login's group (which ideally is unique, for extra security). The files should not have write permission for user, group, or other, which leaves root as the only login that can write to them.
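<P>The permission scheme just described can be sketched as follows. This is a minimal illustration that works on a scratch file; lock_down is a hypothetical helper name, and the chown line is left commented out because it needs root and assumes that the login and its group are both called waismgr.

```shell
#!/bin/sh
# Sketch: make a security file read-only for everyone, so that only root
# (which bypasses the permission bits) can modify it.
lock_down() {
    chmod 444 "$1"          # r--r--r--: no write bit for user, group, or other
    # As root, also hand the file to the WAIS login (names are assumptions):
    # chown waismgr:waismgr "$1"
}

# Demonstrate on a scratch file rather than a real SERV_SEC.
scratch=$(mktemp)
lock_down "$scratch"
ls -l "$scratch" | cut -c1-10   # prints -r--r--r--
```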







<BR>







<P>Similar to the SERVSECURITYFILE variable is DATASECURITYFILE, which controls access to the databases. Again, there is a line in the ir.h file that you should uncomment to look like the following:







<BR>







<BR>







<PRE>
<FONT COLOR="#000080">#define DATASECURITYFILE &quot;DATA_SEC&quot;</FONT></PRE>







<P>DATA_SEC is a file listing each database file and the domains that have access to it. The file should reside in the same directory as the index files. The format of the DATA_SEC file is as follows:







<BR>







<BR>







<PRE>
<FONT COLOR="#000080">database domain [IP address]</FONT></PRE>







<P>In this example, database is the name of the database the permissions refer to, and domain and optional IP address are the same as the SERV_SEC file. A sample DATA_SEC file looks like the following:







<BR>







<PRE>
<FONT COLOR="#000080">primary chatton.com
primary bignet.org
primary roy.sailing.org
sailing roy.sailing.org</FONT></PRE>







<P>In this example, three domains are granted access to a database called primary (note that primary is just a filename and has no special meaning), and one domain has access to the database called sailing. If you want to allow all hosts with access to the system (controlled by SERV_SEC) to access a particular database, you can use asterisks in the domain name and IP address fields. For example, the following entries allow anyone with access to WAIS to use the primary database, with one domain only allowed access to the sailing database:







<BR>







<PRE>
<FONT COLOR="#000080">primary *
sailing roy.sailing.org</FONT></PRE>
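<P>The effect of these entries can be sketched with a small lookup function. This is an illustration only, not the freeWAIS implementation: db_allowed is a hypothetical helper, it assumes a host is allowed when a line names the database and the domain field is either an asterisk or exactly the host name, and it ignores the optional IP address field.

```shell
#!/bin/sh
# Sketch of a DATA_SEC lookup. Assumption: a host is allowed when a line names
# the database and its domain field is either "*" or exactly the host name.
# Usage: db_allowed DATA_SEC_file database host
db_allowed() {
    while read -r db domain ip; do
        [ "$db" = "$2" ] || continue
        if [ "$domain" = "*" ] || [ "$domain" = "$3" ]; then
            return 0
        fi
    done < "$1"
    return 1
}

# Build the sample file from the text in a scratch location.
data_sec=$(mktemp)
printf 'primary *\nsailing roy.sailing.org\n' > "$data_sec"

db_allowed "$data_sec" primary chatton.com     && echo "primary: chatton.com allowed"
db_allowed "$data_sec" sailing chatton.com     || echo "sailing: chatton.com refused"
db_allowed "$data_sec" sailing roy.sailing.org && echo "sailing: roy.sailing.org allowed"
```

<P>Remember that a host reaches this check only after it has passed the SERV_SEC test, if a SERV_SEC file is in use.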







<P>In both the SERV_SEC and DATA_SEC files, you have to be careful with IP addresses to avoid inadvertently granting access to hosts you do not want on your system. For example, if you specify the IP address 150.12 in your file, then any IP address that begins with those characters, such as 150.120, 150.121, and so on, is also granted access because it matches the IP components. Specify IP addresses explicitly to avoid this problem.
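<P>The pitfall is easy to reproduce in a few lines. The sketch below assumes that entries are compared as plain string prefixes, which is the behavior the text describes; ip_matches is a hypothetical helper and the addresses are examples only.

```shell
#!/bin/sh
# Sketch of the pitfall, assuming entries are compared as plain string
# prefixes. ip_matches is a hypothetical helper, not part of freeWAIS.
# Usage: ip_matches entry client_address
ip_matches() {
    case "$2" in
        "$1"*) return 0 ;;   # client address starts with the entry
        *)     return 1 ;;
    esac
}

ip_matches 150.12 150.12.4.9        && echo "150.12.4.9 matches entry 150.12"
ip_matches 150.12 150.120.7.1       && echo "150.120.7.1 also matches entry 150.12"
ip_matches 150.12.4.9 150.120.7.1   || echo "150.120.7.1 does not match entry 150.12.4.9"
```

<P>Writing out the full address, as in the last call, removes the accidental match.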







<BR>







<BR>







<A NAME="E68E234"></A>







<H3 ALIGN=CENTER>







<CENTER>







<FONT SIZE=5 COLOR="#FF0000"><B>Starting freeWAIS</B></FONT></CENTER></H3>







<BR>







<P>As with the FTP services, you can set freeWAIS to start when the system boots by using the rc files, you can start it from the command line at any time, or you can have inetd start the processes when a service request arrives. If you want to start freeWAIS from the command line, you need to specify a number of options. A sample startup command line looks like this:







<BR>







<BR>







<PRE>
<FONT COLOR="#000080">waisserver -u username -p 210 -l 10 -d /usr/wais/wais_index</FONT></PRE>







<P>The -u option tells waisserver to run as the user <I>username</I> (which has to be a valid user in /etc/passwd, of course). The -p option tells waisserver which port to use (the default is 210, as shown in the /etc/services file). The -l option sets the logging level (10 in this example). The -d option gives the default location of the WAIS indexes. If you want to log sessions to a file, use the -e option followed by the name of the logfile.







<BR>







<P>You should run waisserver as another user instead of root to prevent holes in the WAIS system being exploited by a hacker. If the service is run as a standard user (such as wais), only the files that the user would have access to are in jeopardy.







<BR>







<P>If the port for waisserver is set to 210, the service corresponds to the Internet standards for access. If you set the value to another port, you can configure the system for local area access only. If the port number is below 1024, root must start and manage the WAIS service, but any port numbered 1024 or higher can be handled by a normal user. If you intend to use port 210, you don't have to specify the number on the command line, although you must still use the -p option.
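<P>The port rule can be checked before you start the server. The following is a minimal sketch; needs_root is a hypothetical helper, and port 8000 is just an example of an unprivileged port.

```shell
#!/bin/sh
# Sketch: decide whether a port needs root to bind. Ports 0 through 1023 are
# privileged on UNIX systems; needs_root is a hypothetical helper name.
needs_root() {
    [ "$1" -lt 1024 ]
}

# 210 is the standard WAIS port; 8000 is an arbitrary example.
for port in 210 8000; do
    if needs_root "$port"; then
        echo "port $port: must be started by root"
    else
        echo "port $port: a normal user can start the server"
    fi
done
```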







<BR>







<P>If you want to let inetd handle the waisserver startup, you need to ensure that the file /etc/services has an entry for WAIS. The line in the /etc/services file will look like this:







<BR>







<BR>







<PRE>
<FONT COLOR="#000080">z3950 210/tcp #WAIS</FONT></PRE>







<P>In this example, 210 is the port number WAIS uses, and tcp is the protocol. After modifying or verifying the entry in /etc/services, you need to add a WAIS entry to the inetd.conf file to start up waisserver whenever a request is received on port 210 (or whatever other port you are using). The entry looks like this:







<BR>







<BR>







<PRE>
<FONT COLOR="#000080">z3950 stream tcp nowait root /usr/local/bin/waisserver/waisserver.d -u username -d /usr/wais/wais_index</FONT></PRE>







<P>The options are the same as for the command line startup mentioned earlier. The daemon waisserver.d is used when starting up in that mode, instead of waisserver. Again, you can use the -e option to log activity to a file.
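<P>Before relying on inetd, it can help to confirm that the services entry is really present. The sketch below runs against a scratch file; has_wais_service is a hypothetical helper, and on a live system you would pass /etc/services instead.

```shell
#!/bin/sh
# Sketch: check that a services file carries the z3950 entry for port 210/tcp
# before adding the inetd.conf line. has_wais_service is a hypothetical helper.
has_wais_service() {
    grep -Eq '^z3950[[:space:]]+210/tcp' "$1"
}

# Try it on a scratch copy first; on a live system pass /etc/services instead.
services=$(mktemp)
printf 'ftp 21/tcp\nz3950 210/tcp #WAIS\n' > "$services"

if has_wais_service "$services"; then
    echo "z3950 entry present"
else
    echo "add: z3950 210/tcp #WAIS"
fi
```

<P>After you edit inetd.conf, inetd has to reread its configuration (typically by being sent a hangup signal) before the new entry takes effect.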







<BR>







<BR>







<A NAME="E68E235"></A>







<H3 ALIGN=CENTER>







<CENTER>







<FONT SIZE=5 COLOR="#FF0000"><B>Building Your WAIS Indexes</B></FONT></CENTER></H3>







<BR>







<P>Once you have the freeWAIS server ready to run and everything seems to be working, it's time to provide some content for your WAIS system. Usually, documents are the primary source of information for WAIS, although you can index any type of file. The key step in providing WAIS service is to build the WAIS index using the waisindex command. The waisindex command can be a bit obscure at times, but a little practice and some trial and error will help you master its somewhat awkward behavior.







<BR>







<P>The waisindex program works by examining all the data in the files for which you want to create an index. From its examination, waisindex usually generates seven different index files (depending on the content and your commands). Each file holds a list of unique words in the documents. The different index files are then combined into one large database, often called the source (or WAIS source). Whenever a client WAIS package submits a search, the search strings are compared to the source, and the results are displayed along with a score for each match.







<BR>







<BLOCKQUOTE>







<BLOCKQUOTE>







<HR ALIGN=CENTER>







<BR>







<NOTE>The use of waisindex allows a client search to proceed much faster because the keywords in the data files have already been extracted. However, the mass of data in the index files can be sizable, so allow lots of disk space for a WAIS server to work with.</NOTE>







<BR>







<HR ALIGN=CENTER>







</BLOCKQUOTE></BLOCKQUOTE>







<BR>







<A NAME="E69E251"></A>







<H4 ALIGN=CENTER>







<CENTER>







<FONT SIZE=4 COLOR="#FF0000"><B>WAIS Index Files</B></FONT></CENTER></H4>







<BR>







<P>A system user usually cannot read the freeWAIS index files (although one or two files can be read with some success). Usually, waisindex creates seven index files, although the number may vary depending on requirements. The index files all have a specific file extension to show their purpose, based on a root name (specified on the waisindex command line, or defaulting to &quot;index&quot;). The index files and their purposes are described in the following list:







<BR>







<UL>







<LI>The index.doc document file contains a table with the filename, a headline (title) from the file, the location of the first and last characters of an entry, the length of the document, the number of lines in the document, and the time and date the document was created.







<BR>







<BR>







<LI>The index.dct dictionary file contains a list of every unique word in the files cross-indexed to the inverted file.







<BR>







<BR>







<LI>The index.fn filename file contains a table with a list of the filenames, the date they were created in the index, and the type of file.







<BR>







<BR>







<LI>The index.hl headline file contains a table of all headlines (titles). The headline is displayed in the search output when a match occurs.







<BR>







<BR>







<LI>The index.inv inverted file contains a table associating every unique word in all the files with a pointer to the files themselves and the word's importance (determined based on how close the word is to the start of the file, the number of times the word occurs in the document, and the percentage of times the word appears in the document).







<BR>







<BR>







<LI>The index.src source description file contains descriptions of the information indexed, including the host name and IP address, the port watched by WAIS, the source filename, any cost information for the service, the headline of the service, a description of the source, and the e-mail address of the administrator. ASCII editors can edit the source description file. You will look at this file in a little more detail shortly.







<BR>







<BR>







<LI>The index.status status file contains user-defined information.







<BR>







<BR>







</UL>
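<P>After an indexing run, you can check which of the files in the list above were actually created. The following is a minimal sketch that works on a scratch directory; missing_index_files is a hypothetical helper, and the extensions are the seven named in the list.

```shell
#!/bin/sh
# Sketch: report which of the seven index files listed above are missing
# for a given root name. missing_index_files is a hypothetical helper.
# Usage: missing_index_files directory root_name
missing_index_files() {
    for ext in doc dct fn hl inv src status; do
        [ -f "$1/$2.$ext" ] || echo "$2.$ext"
    done
}

# Scratch demonstration: create only two of the files, then list the rest.
idx=$(mktemp -d)
touch "$idx/index.doc" "$idx/index.src"
missing_index_files "$idx" index
```

<P>Because the number of index files can vary, a missing file is a prompt to look more closely rather than proof of an error.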







<P>The source description file is a standard ASCII file that is read by waisindex at intervals to see if information has changed. If the changes are significant, waisindex updates its internal information. A typical source file looks like this:







<BR>







<PRE>
<FONT COLOR="#000080">(:source
 :version 2
 :ip-address &quot;147.120.0.10&quot;
 :ip-name &quot;wizard.tpci.com&quot;
 :tcp-port 210
 :database-name &quot;Linux stuff&quot;






