📄 aralethread.java

📁 A web crawler
package org.flaviotordini.arale;

import java.util.*;
import java.net.*;
import java.io.*;

/**
 *  Arale thread class
 *
 * @author     Flavio Tordini
 * @created    26 November 2001
 */
public class AraleThread implements Runnable {

    private Arale arale;


    /**
     *  Constructor for the AraleThread object
     *
     * @param  arale  the Arale instance whose URL queue this thread consumes
     * @since         7 December 2001
     */
    public AraleThread(Arale arale) {
        this.arale = arale;
    }


    /**
     *  Main processing method for the AraleThread object
     *
     * @since    7 December 2001
     */
    public void run() {

        arale.logger.log(Thread.currentThread().getName() + " start");
        arale.threads.add(this);

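        // consume queued URLs. The emptiness check and the removal must be
        // one atomic step, because several AraleThreads share this queue.
        // Assumes other users of the queue lock on the same object (true
        // for a Vector or a Collections.synchronizedList).
        while (true) {
            ContextualURL contextualURL;
            synchronized (arale.queuedUrls) {
                if (arale.queuedUrls.isEmpty()) {
                    break;
                }
                contextualURL = (ContextualURL) arale.queuedUrls.remove(0);
            }
            process(contextualURL);
        }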

        arale.threads.remove(this);
        arale.logger.log(Thread.currentThread().getName() + " end");

        if (arale.threads.isEmpty()) {
            arale.endProcess();
        }
    }


    /**
     *  The core method. Given an input ContextualURL object: <br>
     *  + an HTTP connection is established <br>
     *  + if the resource is marked as scannable or its MIME type is
     *  text/html, the resource is passed to <code>scanHTML</code> so that
     *  its links can be parsed <br>
     *  + the resource is finally written to disk (if it complies with user
     *  settings)
     *
     * @param  contextualURL  url to process
     * @since                 7 December 2001
     */
    public void process(ContextualURL contextualURL) {
        /*
            controllo pagina gi
