//searchtohtml.java: site text search Java applet. The SearchToHTML applet lets you
//run a text search over a specified set of files.
//SearchToHTML copyright (c) 1999-2000 David Faden. All rights reserved.
//The applet and code are distributed as linkware...
//If you use this applet or a variant on its code,
//you must include a link to The Gilbert Post, 
//http://www.geocities.com/Athens/Parthenon/1911/
//on your site.
//
//The Gilbert Post and David Faden take no responsibility
//for anything bad that happens as a result of using this applet.
//Please send reports of problems to gilbertnews@hotmail.com, anyway, though.

import java.awt.*;
import java.applet.*;
import java.io.*;
import java.net.*;
import java.util.*;

//These classes are designed to work with Netscape 3.x+ and so use JDK 1.0
//SearchToHTML also requires that the user have JavaScript turned on.
//I suggest writing the applet tag with JavaScript.

//
// 5/31/1999 fixed a bug in showPage(URL url) (discovered by Dave Langers) that 
// caused the page to be loaded in both
// the target and top (default) windows
// Also, fixed a bug that might cause the progress bar to show 100% (after an errored 
// SearchThread called foundNoMatch(int i)) while files were still being searched.
//
// 6/2/1999 fixed a very stupid bug (discovered by Dave Langers)
// on my part that made it impossible for queries
// containing uppercase characters to ever be found -- the scanned lines were
// converted to lowercase while the keywords/phrases were not
//
// 6/4/1999 renamed to AdvSiteSearcher to differentiate from older SiteSearcher
// Plan to release one more version of SiteSearcher using SearchSieves
//
// 6/9/1999 made many major revisions: renamed SearchThread to DocSearcher because
// it is no longer a subclass of Thread (it instead implements Runnable), polished
// the use of SearchSieves, began tentative support for caching, added the ability
// to demand exact matches, and to ignore text in between less-than and greater-than
// signs (probably HTML)
//
// 6/11/1999 began work on a stripped down version of AdvSiteSearcher
// designed to be much smaller (hopefully a little faster) and to output HTML
// via JavaScript (but without using direct communication; no com.netscape.JSObject needed)
//
// 6/12/1999 renamed it to SearchToHTML...fixed a "bug" that caused the applet to 
// ignore the user defined height and width.
// Also finished converting AdvSiteSearcher to SearchToHTML, uncommented SearchSieve.reset(),
// Renamed the new specialized version of DocSearcher, HDocSearcher...added ability to capture
// the context of a match, the title (of an HTML doc), and the closest anchor to a match.
//
// 7/28/1999 added text changing parameters...mostly for internationalization...
// Also, gave up and cut out the code that was supposed to draw a componentless progressbar.
//
// 4/12/2000 fixed a "Y2K" bug reported by several alert users...  I am not sure what 
// I was thinking when I wrote the portion of code calling Date.getYear()... Perhaps that it
// returns the decade? Anyway, in reality, getYear() returns the number of years
// since 1900. Files with modification dates beyond 1999 were listed with dates greater than
// 99 (100 for 2000).
// Note: the whole Date class is deprecated in JDK 1.1
// The code actually changed is found in HDocSearcher.java.
// 
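/*
 A minimal sketch of the Y2K fix described above, not the actual code from
 HDocSearcher.java. It assumes the fix amounts to adding 1900 to the value
 returned by Date.getYear(). The names YearFixSketch, fullYear and naiveLabel
 are hypothetical and exist only for this illustration.

```java
import java.util.Date;

class YearFixSketch {
    // Date.getYear() returns the number of years since 1900,
    // so a file modified in 2000 reports 100, not 0 or 2000.
    static int fullYear(int yearsSince1900) {
        return 1900 + yearsSince1900;
    }

    // The kind of label produced when getYear() is printed as if it
    // were already a two-digit year (hypothetical reconstruction of the bug).
    static String naiveLabel(int yearsSince1900) {
        return "" + yearsSince1900;
    }

    public static void main(String[] args) {
        // Deprecated constructor: year is given as an offset from 1900.
        Date d = new Date(100, 0, 1);
        System.out.println("naive label: " + naiveLabel(d.getYear()));
        System.out.println("full year:   " + fullYear(d.getYear()));
    }
}
```

 On a modern JDK, java.util.Calendar or java.time would be the usual choice;
 the sketch sticks to Date only because that is the class the note refers to.
*/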
// 4/12/2000 added code that causes the HDocSearcher's runner Thread to wait
// when it is not "doing anything." This should be more efficient than in the
// previous incarnation, where runner would sleep, then periodically wake up to 
// see if there was anything to search.
// 
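/*
 The change from sleep-and-poll to waiting can be sketched roughly as follows.
 This is an illustration of the general wait/notify idiom, not HDocSearcher's
 actual code; the class WaitNotifySketch and its methods submit, shutdown,
 processed and demo are all hypothetical.

```java
import java.util.LinkedList;

class WaitNotifySketch implements Runnable {
    private final LinkedList<String> pending = new LinkedList<String>();
    private final StringBuffer log = new StringBuffer();
    private boolean done = false;

    // Hand the runner a job and wake it if it is waiting.
    synchronized void submit(String job) {
        pending.add(job);
        notify();
    }

    // Ask the runner to exit once the queue is drained.
    synchronized void shutdown() {
        done = true;
        notify();
    }

    public void run() {
        while (true) {
            String job;
            synchronized (this) {
                // Block here instead of sleeping and re-checking periodically.
                while (pending.isEmpty() && !done) {
                    try {
                        wait();
                    } catch (InterruptedException e) {
                        return;
                    }
                }
                if (pending.isEmpty()) {
                    return; // shutdown requested and nothing left to do
                }
                job = pending.removeFirst();
            }
            log.append(job).append(';'); // stand-in for "search this document"
        }
    }

    String processed() {
        return log.toString();
    }

    // Small driver: queue two jobs, shut down, and report what was handled.
    static String demo() {
        WaitNotifySketch w = new WaitNotifySketch();
        Thread t = new Thread(w);
        t.start();
        w.submit("a.html");
        w.submit("b.html");
        w.shutdown();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return w.processed();
    }
}
```

 The wait() call sits in a loop that re-checks the condition, so a wake-up
 with nothing queued simply goes back to waiting.
*/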
// 5/3/2000 Patrick Fourneret reported that escape sequences were visible in
// the titles of his search results.  Originally, I actually considered this a
// feature - making sure that the title would display correctly on all browsers.
// After having this "feature" pointed out though, I see that it is actually a 
// problem. The fix was simple - I simply stopped calling makeHTMLSafe on the title.
// I also corrected the spelling of "exclude."  I fixed this several months ago,
// but apparently I ended up switching to an older code base somehow. 
//
// Jul 12 2000 I added a kludgy method to HDocSearcher which will
// finish extracting the title from a document even if a match is
// found within the title. I had been reminded of this behavior several
// times before, but it was Danny Narayan's complaint that spurred me to action.
// See HDocSearcher.finishTitle(StringBuffer).
//
// Jul 12 2000 changed the names of SearchToHTML's methods foundMatch and 
// foundTitle to receiveMatch and receiveTitle respectively. The former names
// seemed unfortunately confusing. Added a new method boolean hasTitle(int index)
// to SearchToHTML which returns whether the title of the document at index 
// has been found.
//
// Jul 12 2000 corrected an embarrassing error in ReadMe.html - most of the
// text was recycled from SiteSearcher's ReadMe. Unfortunately, some portions
// of the text that don't apply to SearchToHTML made it through. 
//
// Jul 13 2000 I seem to be writing a lot broken sentences in this bug log.
// But that okay.
// Changed the name of the method "foundNoMatch" to "receiveNoMatch." Again,
// I think that the former name was misleading. Added two new parameters to
// deal with expanded context capabilities: leadingContextLength and
// trailingContextLength - leadingContextLength is very misleadingly named.
// I will probably change it tomorrow. The new parameters I was alluding to
// are "leadingcontextlength" and "trailingcontextlength". Not yet documented! 
// I added a new method to HDocSearcher.java: appendTrailingContext(StringBuffer)
// and changed HDocSearcher's constructor in connection with the new trailing context
// stuff.
//
// Jul 14 2000 Added two new parameters: "xhtml_chkbx_checked" and "exact_chkbx_checked"
// Setting each of these to true will initially "check" the corresponding checkbox
// in the applet's user interface. (This worked well with the LineSearcherApplet.)
// Sorry, I've forgotten who suggested this.
//
// Jul 15 2000 Fixed "bugs" in HDocSearcher.java that would cause an 
// ArrayIndexOutOfBoundsException to be thrown if leadingContextLength==0. Previous to a few
// edits ago, I had required that this value be greater than zero so the code's
// assumption had been a safe one. 
//
// Jul 19 2000 Cleaned up and updated the documentation.
//
// Jul 24 2000 Added a new parameter "max_num_matches" - no more than max_num_matches
// documents will be returned as matches to a search. The default value is
// the total number of documents. This parameter was suggested by Danny Narayan.
// Uncovered another bug: boolean searching was actually always false while a search was
// underway because search() called stopAllSearches() _after_ setting searching to true
// and stopAllSearches sets searching to false. I should scuttle this code and get on with
// the next generation of applets.
//
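/*
 The ordering bug in the Jul 24 entry above can be reproduced in a few lines.
 This is a hypothetical reconstruction, assuming only what the entry states:
 stopAllSearches() clears the flag and search() used to call it after setting
 the flag. SearchFlagSketch, searchBuggy and searchFixed are made-up names.

```java
class SearchFlagSketch {
    private boolean searching = false;

    // Stopping any running searches also clears the flag.
    void stopAllSearches() {
        searching = false;
    }

    // Old order: the flag is set, then immediately cleared again.
    void searchBuggy() {
        searching = true;
        stopAllSearches();
    }

    // Corrected order: stop old searches first, then mark the new one.
    void searchFixed() {
        stopAllSearches();
        searching = true;
    }

    boolean isSearching() {
        return searching;
    }
}
```
*/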
// Jul 25 2000 Fixed a bug in appendTrailingContext(StringBuffer). The fix required that
// the method not append directly from the input stream to the context (this was the source
// of the problem) so I renamed appendTrailingContext(StringBuffer) to getTrailingContext().
//
// August 17, 2000 Modified the SearchSieve class - see SearchSieve.java for details.
// Updated and made compliant the ReadMe.

public class SearchToHTML extends Applet {
 private HDocSearcher[] workers;
 private String[] urls;
 private String[] pageinfo;//size, last modified
 private String[] titles;
 private int numreported=0; 
 private int numWorkers=0;
 
 /**
  * The number of matches reported for the current search.
  */
 private int numOfMatches=0;
 
 /**
  * The maximum number of matches that will be reported.
  */
 private int maxNumOfMatches;
 
 private Button b;
 private Checkbox HTMLbox,Exactbox;
 private TextField searchbox;
 private String target;
 private URL docbase;
 private boolean displayMessage;
 private String message;//Message to be displayed in applet
 //to let the user know what's happening before the GUI is finished being set up
 private static final String searchTokenSeparators="\" \t\r\n,";
 private Insets insets=new Insets(2,2,2,2);
 private StringBuffer results=new StringBuffer();
 private String resultspage;
 private boolean waitForAll;
 private String searchbase=null;
 private boolean searching=false;
 
 /**
  * The text for the button to start a search.
  */
 private String search_btn_txt;
 
 /**
  * The text for the button to stop a search.
  */
 private String stop_btn_txt;
 
 /**
  * Initialize the applet.
  * <br>
  * Read and parse parameters, give meaningful values
  * to class variables, and set up the user interface.
  */
 public void init() {
   //first initialize the variables
    docbase=getDocumentBase(); //assign the field instead of shadowing it with a local
   searchbase=docbase.getProtocol()+"://"+docbase.getHost()+docbase.getFile();
   if(searchbase.lastIndexOf('.')>searchbase.lastIndexOf('/')) {
     searchbase=searchbase.substring(0,searchbase.lastIndexOf('/')+1);
   }
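    /*
     The search base computed above can be tried outside the applet with a
     sketch like this one. It mirrors the expression in init() but takes the
     URL parts as plain strings; SearchBaseSketch and searchBase are
     hypothetical names, and example.com is a placeholder host.

```java
class SearchBaseSketch {
    // Rebuild protocol://host/file and, if the last path segment looks like
    // a file name (a '.' after the final '/'), cut back to the directory.
    static String searchBase(String protocol, String host, String file) {
        String base = protocol + "://" + host + file;
        if (base.lastIndexOf('.') > base.lastIndexOf('/')) {
            base = base.substring(0, base.lastIndexOf('/') + 1);
        }
        return base;
    }

    public static void main(String[] args) {
        System.out.println(searchBase("http", "example.com", "/docs/index.html"));
        System.out.println(searchBase("http", "example.com", "/docs/"));
    }
}
```

     Like the expression in init(), the sketch leaves out any port number, so
     a document served from a non-default port would get a base without it.
    */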
   
   int leadingContextLength=15;
   String contextLenStr=getParameter("contextsize");
   if (contextLenStr==null)
       contextLenStr=getParameter("leadingcontextlength");
   if(contextLenStr!=null) {
     try {
       leadingContextLength=Integer.parseInt(contextLenStr);
       if (leadingContextLength<0) 
           leadingContextLength=0;
     }
     catch(NumberFormatException nfe) {
       System.out.println("  Problem with contextsize/leadingcontextlength parameter.");
       nfe.printStackTrace();
     }
   }
   
    int trailingContextLength=0; //Keep default behavior of previous incarnations.
   String trailingLenStr = getParameter("trailingcontextlength");
   if (trailingLenStr!=null) {
       try {
           trailingContextLength=Integer.parseInt(trailingLenStr);
           if (trailingContextLength<0)
               trailingContextLength=0;
       }
       catch (NumberFormatException nfe) {
           System.out.println("  Invalid value for trailingcontextlength parameter.");
           nfe.printStackTrace();
       }
   }
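    /*
     Both context-length parameters above follow the same pattern: keep the
     default if the parameter is missing or unparsable, and clamp negative
     values to zero. A minimal sketch of that pattern as one helper follows;
     ParamParseSketch and parseNonNegative are hypothetical names, not
     methods of this applet.

```java
class ParamParseSketch {
    // Returns fallback when value is null or not an integer,
    // 0 when it parses to a negative number, and the parsed value otherwise.
    static int parseNonNegative(String value, int fallback) {
        if (value == null) {
            return fallback;
        }
        try {
            int parsed = Integer.parseInt(value);
            return parsed < 0 ? 0 : parsed;
        } catch (NumberFormatException nfe) {
            return fallback;
        }
    }

    public static void main(String[] args) {
        System.out.println(parseNonNegative("20", 15));
        System.out.println(parseNonNegative("-3", 15));
        System.out.println(parseNonNegative("abc", 15));
    }
}
```
    */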
   
   String files=getParameter("files");
   if (files!=null) {
     StringTokenizer st=new StringTokenizer(files,"\n\r \t,",false);
     int num=st.countTokens();
     urls=new String[num];
     workers=new HDocSearcher[num];
     pageinfo=new String[num];
     titles=new String[num];
     numWorkers=num;
     String maxNumStr = getParameter("max_num_matches");
     if (maxNumStr==null)
        maxNumOfMatches = numWorkers;
     else {
        try {
            maxNumOfMatches = Integer.parseInt(maxNumStr);
            if (maxNumOfMatches>numWorkers)
                maxNumOfMatches = numWorkers;
            //XXX! Allow ridiculous, negative numbers.
            //Why would someone want to do this? Who knows?
        }
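         catch (NumberFormatException nfe) {
             System.err.println("Invalid value for \"max_num_matches\" parameter");
             nfe.printStackTrace();
             //Fall back to the default (all documents) rather than leaving the limit at 0.
             maxNumOfMatches = numWorkers;
         }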
     }
     String currToken;
     URL cURL=null;
     for(int i=0;i<num;i++) {
        currToken=st.nextToken();
        pageinfo[i]="";
        titles[i]="";
         urls[i]=currToken;
        try {
          cURL=new URL(docbase,currToken);
          workers[i]=new HDocSearcher(this,cURL,i,leadingContextLength, trailingContextLength);
        }
        catch(MalformedURLException mued) {
          urls[i]="";
          cURL=null;
          //XXX! waste an Object
          //This needs to change!
          workers[i]=new HDocSearcher(this,cURL,i,0,0);
          workers[i].setErrored();
          System.out.println(mued);
        }
     }
   }
   else { 
     displayMessage=true; 
     System.out.println("SearchToHTML Applet can\'t start");
     System.out.println("Missing required parameter: files"); 
     message="Can\'t continue: missing the \"files\" parameter.";
     repaint();
     return;
   }
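    /*
     The "files" parameter handled above is a list separated by commas,
     spaces, tabs or line breaks. The sketch below shows how such a value is
     split, using the same delimiter set as init(); FileListSketch and
     splitFileList are hypothetical names, and the file names are made up.

```java
import java.util.StringTokenizer;

class FileListSketch {
    // Split a "files" parameter value into individual relative URLs.
    static String[] splitFileList(String files) {
        StringTokenizer st = new StringTokenizer(files, "\n\r \t,", false);
        String[] result = new String[st.countTokens()];
        for (int i = 0; i < result.length; i++) {
            result[i] = st.nextToken();
        }
        return result;
    }

    public static void main(String[] args) {
        String[] parts = splitFileList("index.html, about.html\nnews/today.html");
        for (int i = 0; i < parts.length; i++) {
            System.out.println(i + ": " + parts[i]);
        }
    }
}
```

     Because a space is itself a delimiter, a file name containing spaces
     would be split into several tokens under this scheme.
    */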
   target=getParameter("target");
   if (target==null) 
       target="_top";
   resultspage=getParameter("resultspage");
   if (resultspage==null) 
       resultspage="searchresults.html";
   waitForAll=("true".equalsIgnoreCase(getParameter("waitforall")));
   if ("_top".equals(target) || "_self".equals(target)) 
       waitForAll=true;
       
   //Set up GUI
   
   //Parameters to allow control over the text in the applet
   //  Though this needed ability is very easy to implement, I'm still faced with
   //  the dilemma of what to name the parameters. Perhaps this is a sign of my insanity,
   //  but I worry about whether to name them for their functionality (like "search_btn_txt") 
   //  or their English versions (like "Search_en")...for now, functionality:
   //  search_btn_txt
   //  stop_btn_txt
   //  xhtml_chkbx_txt
   //  exact_chkbx_txt
   //  searchbox_label_txt
   search_btn_txt=getParameter("search_btn_txt","Search");
   stop_btn_txt=getParameter("stop_btn_txt","Stop");
   String xhtml_chkbx_txt=getParameter("xhtml_chkbx_txt","Exclude HTML");
   String exact_chkbx_txt=getParameter("exact_chkbx_txt","Exact matches only");
   String searchbox_label_txt=getParameter("searchbox_label_txt","Search for:");
   
   //get color parameters
   Color color=null;
   if((color=getColor(getParameter("bgcolor")))!=null) setBackground(color);
   else setBackground(Color.gray);
   if((color=getColor(getParameter("fgcolor")))!=null) setForeground(color);
   else setForeground(Color.black);
   //Lots of Panels
   setLayout(new GridLayout(2,1));//searchbox,checkboxes