linkextractor.java

From "Simple Web Spider - This spider can fetc" · Java code · 35 lines

// LinkExtractor.java
// Written by Wei Sun @ 2006 Semester A; For Course CS5286
// This should always be used together with
//      MyHtmlHrefHandler.java

import java.io.*;
import java.util.*;
import javax.swing.text.html.parser.*;

public class LinkExtractor {
    private MyHtmlHrefHandler handler = new MyHtmlHrefHandler();

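    // Parses HTML from the stream and lets the handler collect the links.
    // The stream is always closed before this method returns.
    public void parse(InputStream stream) throws IOException {
        if (stream == null) {
            throw new IllegalArgumentException(
                    "Illegal Argument :: Null Value :: stream");
        }
        handler.reset();
        ParserDelegator pd = new ParserDelegator();
        try {
            // Decodes with the platform default charset; the last argument
            // (ignoreCharSet = true) ignores charset directives in the page.
            pd.parse(new InputStreamReader(stream), handler, true);
        } finally {
            stream.close(); // stream is known to be non-null here
        }
    }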

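    // Iterates over the handler's extLinks collection from the last parse.
    public Iterator getExtLinkIterator() {
        return handler.extLinks.iterator();
    }

    // Iterates over the handler's localLinks collection from the last parse.
    public Iterator getLocalLinkIterator() {
        return handler.localLinks.iterator();
    }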

}
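
The class above depends on MyHtmlHrefHandler.java, which is not shown in this listing. As a rough sketch of what a compatible handler could look like, the example below defines a hypothetical `HrefCollector` that extends `HTMLEditorKit.ParserCallback` and sorts `href` values into `extLinks` and `localLinks`. The class names, the `collect` helper, and the rule "absolute http(s) URLs are external" are all assumptions made for illustration, not the original author's implementation.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical stand-in for MyHtmlHrefHandler (the real class is not shown).
class HrefCollector extends HTMLEditorKit.ParserCallback {
    final List<String> extLinks = new ArrayList<String>();
    final List<String> localLinks = new ArrayList<String>();

    // Clears links collected by a previous parse.
    void reset() {
        extLinks.clear();
        localLinks.clear();
    }

    @Override
    public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
        if (tag != HTML.Tag.A) {
            return; // only anchor tags are of interest
        }
        Object href = attrs.getAttribute(HTML.Attribute.HREF);
        if (href == null) {
            return; // anchor without an href attribute
        }
        String link = href.toString();
        // Assumption: absolute http(s) URLs count as external, the rest as local.
        if (link.startsWith("http://") || link.startsWith("https://")) {
            extLinks.add(link);
        } else {
            localLinks.add(link);
        }
    }
}

public class HrefCollectorDemo {
    // Parses an HTML string and returns the handler holding the collected links.
    static HrefCollector collect(String html) throws IOException {
        HrefCollector handler = new HrefCollector();
        Reader reader = new StringReader(html);
        try {
            new ParserDelegator().parse(reader, handler, true);
        } finally {
            reader.close();
        }
        return handler;
    }

    public static void main(String[] args) throws IOException {
        HrefCollector h = collect(
                "<html><body><a href=\"http://example.com/\">ext</a>"
                + "<a href=\"page2.html\">local</a></body></html>");
        System.out.println("external: " + h.extLinks);
        System.out.println("local:    " + h.localLinks);
    }
}
```

A real spider would probably resolve relative links against the page URL and compare hosts to decide what is external; the prefix check here is only the simplest rule that keeps the sketch short.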
