//SearchToHTML copyright (c) 1999-2000 David Faden. All rights reserved.
//The applet and code are distributed as linkware...
//If you use this applet or a variant on its code,
//you must include a link to The Gilbert Post,
//http://www.geocities.com/Athens/Parthenon/1911/
//on your site.
//
//The Gilbert Post and David Faden take no responsibility
//for anything bad that happens as a result of using this applet.
//Please send reports of problems to gilbertnews@hotmail.com, anyway, though.
import java.awt.*;
import java.applet.*;
import java.io.*;
import java.net.*;
import java.util.*;
//These classes are designed to work with Netscape 3.x+ and so use JDK 1.0
//SearchToHTML also requires that the user have JavaScript turned on.
//I suggest writing the applet tag with JavaScript.
//
// 5/31/1999 fixed a bug in showPage(URL url) (discovered by Dave Langers) that
// caused the page to be loaded in both
// the target and top (default) windows
// Also, fixed a bug that might cause the progress bar to show 100% (after an errored
// SearchThread called foundNoMatch(int i)) while files were still being searched.
//
// 6/2/1999 fixed a very stupid bug (discovered by Dave Langers)
// on my part that made it impossible for queries
// containing uppercase characters to ever be found -- the scanned lines were
// converted to lowercase while the keywords/phrases were not
//
// 6/4/1999 renamed to AdvSiteSearcher to differentiate from older SiteSearcher
// Plan to release one more version of SiteSearcher using SearchSieves
//
// 6/9/1999 made many major revisions: renamed SearchThread to DocSearcher because
// it is no longer a subclass of Thread (it instead implements Runnable), polished
// the use of SearchSieves, began tentative support for caching, added the ability
// to demand exact matches, and to ignore text in between less-than and greater-than
// signs (probably HTML)
//
// 6/11/1999 began work on a stripped down version of AdvSiteSearcher
// designed to be much smaller (hopefully a little faster) and to output HTML
// via JavaScript (but without using direct communication; no com.netscape.JSObject needed)
//
// 6/12/1999 renamed it to SearchToHTML...fixed a "bug" that caused the applet to
// ignore the user defined height and width.
// Also finished converting AdvSiteSearcher to SearchToHTML, uncommented SearchSieve.reset(),
// Renamed the new specialized version of DocSearcher, HDocSearcher...added ability to capture
// the context of a match, the title (of an HTML doc), and the closest anchor to a match.
//
// 7/28/1999 added text changing parameters...mostly for internationalization...
// Also, gave up and cut out the code that was supposed to draw a componentless progressbar.
//
// 4/12/2000 fixed a "Y2K" bug reported by several alert users... I am not sure what
// I was thinking when I wrote the portion of code calling Date.getYear()... Perhaps that it
// returns the decade? Anyway, in reality, getYear() returns the number of years
// since 1900. Files with modification dates beyond 1999 were listed with dates greater than
// 99 (100 for 2000).
// Note: the whole Date class is deprecated in JDK 1.1
// The code actually changed is found in HDocSearcher.java.
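//
// Illustrative sketch only (not the actual fix, which lives in HDocSearcher.java and is
// not shown here): since Date.getYear() returns the number of years since 1900, adding
// 1900 yields the full four-digit year. The class and method names are hypothetical.
class YearSketch {
    /** Returns the four-digit year for the given date, e.g. 2000 rather than 100. */
    static int fullYear(java.util.Date d) {
        return d.getYear() + 1900; // getYear() is deprecated; used here to mirror the log entry
    }
}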
//
// 4/12/2000 added code that causes the HDocSearcher's runner Thread to wait
// when it is not "doing anything." This should be more efficient than in the
// previous incarnation, where runner would sleep, then periodically wake up to
// see if there was anything to search.
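//
// Illustrative sketch only (HDocSearcher.java is not shown here; all names below are
// hypothetical): a worker can block in wait() until a job is handed over, rather than
// sleeping and polling.
class IdleWaitSketch {
    private Object pending; // null means there is nothing to search

    /** Hands a job to the waiting thread and wakes it up. */
    synchronized void submit(Object job) {
        pending = job;
        notifyAll();
    }

    /** Blocks until a job is available; returns null if the wait is interrupted. */
    synchronized Object take() {
        while (pending == null) {
            try {
                wait();
            }
            catch (InterruptedException ie) {
                return null;
            }
        }
        Object job = pending;
        pending = null;
        return job;
    }
}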
//
// 5/3/2000 Patrick Fourneret reported that escape sequences were visible in
// the titles of his search results. Originally, I actually considered this a
// feature - making sure that the title would display correctly on all browsers.
// After having this "feature" pointed out though, I see that it is actually a
// problem. The fix was simple - I simply stopped calling makeHTMLSafe on the title.
// I also corrected the spelling of "exclude." I fixed this several months ago,
// but apparently I ended up switching to an older code base somehow.
//
// Jul 12 2000 I added a kludgy method to HDocSearcher which will
// finish extracting the title from a document even if a match is
// found within the title. I had been reminded of this behavior several
// times before, but it was Danny Narayan's complaint that spurred me to action.
// See HDocSearcher.finishTitle(StringBuffer).
//
// Jul 12 2000 changed the names of SearchToHTML's methods foundMatch and
// foundTitle to receiveMatch and receiveTitle respectively. The former names
// seemed unfortunately confusing. Added a new method boolean hasTitle(int index)
// to SearchToHTML which returns whether the title of the document at index
// has been found.
//
// Jul 12 2000 corrected an embarrassing error in ReadMe.html - most of the
// text was recycled from SiteSearcher's ReadMe. Unfortunately, some portions
// of the text that don't apply to SearchToHTML made it through.
//
// Jul 13 2000 I seem to be writing a lot broken sentences in this bug log.
// But that okay.
// Changed the name of the method "foundNoMatch" to "receiveNoMatch." Again,
// I think that the former name was misleading. Added two new parameters to
// deal with expanded context capabilities: leadingContextLength and
// trailingContextLength - leadingContextLength is very misleadingly named.
// I will probably change it tomorrow. The new parameters I was alluding to
// are "leadingcontextlength" and "trailingcontextlength". Not yet documented!
// I added a new method to HDocSearcher.java: appendTrailingContext(StringBuffer)
// and changed HDocSearcher's constructor in connection with the new trailing context
// stuff.
//
// Jul 14 2000 Added two new parameters: "xhtml_chkbx_checked" and "exact_chkbx_checked"
// Setting each of these to true will initially "check" the corresponding checkbox
// in the applet's user interface. (This worked well with the LineSearcherApplet.)
// Sorry, I've forgotten who suggested this.
//
// Jul 15 2000 Fixed "bugs" in HDocSearcher.java that would cause an
// ArrayIndexOutOfBoundsException to be thrown if leadingContextLength==0. Previous to a few
// edits ago, I had required that this value be greater than zero so the code's
// assumption had been a safe one.
//
// Jul 19 2000 Cleaned up and updated the documentation.
//
// Jul 24 2000 Added a new parameter "max_num_matches" - no more than max_num_matches
// documents will be returned as matches to a search. The default value is
// the total number of documents. This parameter was suggested by Danny Narayan.
// Uncovered another bug: the boolean searching was actually always false while a search was
// underway because search() called stopAllSearches() _after_ setting searching to true
// and stopAllSearches() sets searching to false. I should scuttle this code and get on with
// the next generation of applets.
//
// Jul 25 2000 Fixed a bug in appendTrailingContext(StringBuffer). The fix required that
// the method not append directly from the input stream to the context (this was the source
// of the problem) so I renamed appendTrailingContext(StringBuffer) to getTrailingContext().
//
// August 17, 2000 Modified the SearchSieve class - see SearchSieve.java for details.
// Updated and made compliant the ReadMe.
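//
// Illustrative sketch only (hypothetical helper, not part of the applet): init() parses
// several numeric parameters the same way, keeping the default when the value is missing
// or malformed and clamping negative values to zero. That pattern could be factored out as:
class ParamSketch {
    /** Parses s as a non-negative int; returns dflt if s is null or not a number. */
    static int parseNonNegative(String s, int dflt) {
        if (s == null)
            return dflt;
        try {
            int n = Integer.parseInt(s);
            return (n < 0) ? 0 : n;
        }
        catch (NumberFormatException nfe) {
            return dflt;
        }
    }
}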
public class SearchToHTML extends Applet {
    private HDocSearcher[] workers;
    private String[] urls;
    private String[] pageinfo;//size, last modified
    private String[] titles;
    private int numreported=0;
    private int numWorkers=0;
    /**
     * The number of matches reported for the current search.
     */
    private int numOfMatches=0;
    /**
     * The maximum number of matches that will be reported.
     */
    private int maxNumOfMatches;
    private Button b;
    private Checkbox HTMLbox,Exactbox;
    private TextField searchbox;
    private String target;
    private URL docbase;
    private boolean displayMessage;
    private String message;//Message to be displayed in applet
    //to let the user know what's happening before the GUI is finished being set up
    private static final String searchTokenSeparators="\" \t\r\n,";
    private Insets insets=new Insets(2,2,2,2);
    private StringBuffer results=new StringBuffer();
    private String resultspage;
    private boolean waitForAll;
    private String searchbase=null;
    private boolean searching=false;
    /**
     * The text for the button to start a search.
     */
    private String search_btn_txt;
    /**
     * The text for the button to stop a search.
     */
    private String stop_btn_txt;
    /**
     * Initialize the applet.
     * <br>
     * Read and parse parameters, give meaningful values
     * to class variables, and set up the user interface.
     */
    public void init() {
        //first initialize the variables
        //Assign the docbase field; a local declaration here would shadow it and leave it null.
        docbase=getDocumentBase();
        searchbase=docbase.getProtocol()+"://"+docbase.getHost()+docbase.getFile();
        if(searchbase.lastIndexOf('.')>searchbase.lastIndexOf('/')) {
            searchbase=searchbase.substring(0,searchbase.lastIndexOf('/')+1);
        }
        int leadingContextLength=15;
        String contextLenStr=getParameter("contextsize");
        if (contextLenStr==null)
            contextLenStr=getParameter("leadingcontextlength");
        if(contextLenStr!=null) {
            try {
                leadingContextLength=Integer.parseInt(contextLenStr);
                if (leadingContextLength<0)
                    leadingContextLength=0;
            }
            catch(NumberFormatException nfe) {
                System.out.println(" Problem with contextsize/leadingcontextlength parameter.");
                nfe.printStackTrace();
            }
        }
        int trailingContextLength=0; //Keep default behavior of previous incarnations.
        String trailingLenStr = getParameter("trailingcontextlength");
        if (trailingLenStr!=null) {
            try {
                trailingContextLength=Integer.parseInt(trailingLenStr);
                if (trailingContextLength<0)
                    trailingContextLength=0;
            }
            catch (NumberFormatException nfe) {
                System.out.println(" Invalid value for trailingcontextlength parameter.");
                nfe.printStackTrace();
            }
        }
        String files=getParameter("files");
        if (files!=null) {
            StringTokenizer st=new StringTokenizer(files,"\n\r \t,",false);
            int num=st.countTokens();
            urls=new String[num];
            workers=new HDocSearcher[num];
            pageinfo=new String[num];
            titles=new String[num];
            numWorkers=num;
            String maxNumStr = getParameter("max_num_matches");
            if (maxNumStr==null)
                maxNumOfMatches = numWorkers;
            else {
                try {
                    maxNumOfMatches = Integer.parseInt(maxNumStr);
                    if (maxNumOfMatches>numWorkers)
                        maxNumOfMatches = numWorkers;
                    //XXX! Allow ridiculous, negative numbers.
                    //Why would someone want to do this? Who knows?
                }
                catch (NumberFormatException nfe) {
                    System.err.println("Invalid value for \"max_num_matches\" parameter");
                    nfe.printStackTrace();
                    //Fall back to the documented default rather than leaving the limit at 0.
                    maxNumOfMatches = numWorkers;
                }
            }
            String currToken;
            URL cURL=null;
            for(int i=0;i<num;i++) {
                currToken=st.nextToken();
                pageinfo[i]="";
                titles[i]="";
                urls[i]=new String(currToken);
                try {
                    cURL=new URL(docbase,currToken);
                    workers[i]=new HDocSearcher(this,cURL,i,leadingContextLength, trailingContextLength);
                }
                catch(MalformedURLException mued) {
                    urls[i]="";
                    cURL=null;
                    //XXX! waste an Object
                    //This needs to change!
                    workers[i]=new HDocSearcher(this,cURL,i,0,0);
                    workers[i].setErrored();
                    System.out.println(mued);
                }
            }
        }
        else {
            displayMessage=true;
            System.out.println("SearchToHTML Applet can\'t start");
            System.out.println("Missing required parameter: files");
            message="Can\'t continue: missing the \"files\" parameter.";
            repaint();
            return;
        }
        target=getParameter("target");
        if (target==null)
            target="_top";
        resultspage=getParameter("resultspage");
        if (resultspage==null)
            resultspage="searchresults.html";
        waitForAll=("true".equalsIgnoreCase(getParameter("waitforall")));
        if ("_top".equals(target) || "_self".equals(target))
            waitForAll=true;
        //Set up GUI
        //Parameters to allow control over the text in the applet
        // Though this needed ability is very easy to implement, I'm still faced with
        // the dilemma of what to name the parameters. Perhaps this is a sign of my insanity,
        // but I worry about whether to name them for their functionality (like "search_btn_txt")
        // or their English versions (like "Search_en")...for now, functionality:
        //   search_btn_txt
        //   stop_btn_txt
        //   xhtml_chkbx_txt
        //   exact_chkbx_txt
        //   searchbox_label_txt
        search_btn_txt=getParameter("search_btn_txt","Search");
        stop_btn_txt=getParameter("stop_btn_txt","Stop");
        String xhtml_chkbx_txt=getParameter("xhtml_chkbx_txt","Exclude HTML");
        String exact_chkbx_txt=getParameter("exact_chkbx_txt","Exact matches only");
        String searchbox_label_txt=getParameter("searchbox_label_txt","Search for:");
        //get color parameters
        Color color=null;
        if((color=getColor(getParameter("bgcolor")))!=null) setBackground(color);
        else setBackground(Color.gray);
        if((color=getColor(getParameter("fgcolor")))!=null) setForeground(color);
        else setForeground(Color.black);
        //Lots of Panels
        setLayout(new GridLayout(2,1));//searchbox,checkboxes