visitlinkedpages.java

A data collection system that crawls pages for vertical search engines

/*
 * *****************************************************
 * Copyright (c) 2005 IIM Lab. All  Rights Reserved.
 * Created by xuehao at 2005-10-12
 * Contact: zxuehao@mail.ustc.edu.cn
 * *****************************************************
 */

package org.indigo.pages;

import java.util.ArrayList;

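/**
 * Works through the queued linked URLs one at a time and delegates the
 * stepping within each URL to a VisitPage configured with start, end and inc.
 */
public class VisitLinkedPages extends LinkedPages
{
    // Page currently being stepped through; null until the first URL is opened.
    private VisitPage itsVisitPage = null;
    // Counter range handed to each VisitPage, plus the current counter value.
    private int start, end, inc, crt;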

    public VisitLinkedPages(String key)
    {
        super( key );
    }
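
    /** Sets the counter range applied to every VisitPage this class creates. */
    public void setParameters(int start, int end, int inc)
    {
        this.start = start;
        this.end = end;
        this.inc = inc;
    }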

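    /** Returns the next link to visit, or null when nothing is left. */
    public String getNextVisitLink()
    {
        // Nothing queued and the current page is exhausted: we are done.
        if ( itsLinkedUrls.isEmpty() && bNextPage )
            return null;

        crt += inc;
        // Move on once the counter passes end, or if no page has been opened yet.
        bNextPage = ( crt > end || itsVisitPage == null );

        if ( bNextPage && !itsLinkedUrls.isEmpty() )
        {
            // Dequeue the next linked URL and open a fresh VisitPage on it.
            itsBeginUrl = (String) itsLinkedUrls.get(0);
            itsLinkedUrls.remove(0);
            return getCurrentLink();
        }

        // Guard against a NullPointerException: no page was ever opened
        // and nothing is queued, so there is no link to return.
        if ( itsVisitPage == null )
            return null;

        bNextPage = ( crt > end );
        return itsVisitPage.getNextVisitLink();
    }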

    public VisitPage getVisitPage()
    {
        return itsVisitPage;
    }

    /** Opens a fresh VisitPage on itsBeginUrl and returns that page's current link. */
    private String getCurrentLink()
    {
        itsVisitPage = new VisitPage(itsKey);
        itsVisitPage.setBeginUrl(itsBeginUrl);
        itsVisitPage.setParameters(start, end, inc);
        // Restart the counter; the page is already exhausted if start > end.
        crt = start;
        bNextPage = ( crt > end );

        return itsVisitPage.getCurrentLink();
    }

    /** True when the counter is at start, as it is right after getCurrentLink() opens a page. */
    public boolean isNewPage()
    {
        return start == crt;
    }
}
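
The class above depends on LinkedPages and VisitPage, which are not shown here, so it cannot be run on its own. The following is a minimal, self-contained sketch of the same idea: a queue of listings, each stepped through with a start/end/inc counter. PagedLinkWalker is a hypothetical stand-in, and the link format (URL prefix followed by the counter value) is an assumption made for illustration, not the behaviour of the real VisitPage. It also simplifies the control flow rather than mirroring it line by line.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Hypothetical stand-in for illustration; not part of org.indigo.pages.
class PagedLinkWalker
{
    private final Deque<String> queue = new ArrayDeque<String>();
    private final int start, end, inc;
    private String base = null; // URL prefix of the open listing, null before the first one
    private int crt;            // current counter value within the open listing

    PagedLinkWalker(List<String> prefixes, int start, int end, int inc)
    {
        if (inc <= 0)
            throw new IllegalArgumentException("inc must be positive");
        queue.addAll(prefixes);
        this.start = start;
        this.end = end;
        this.inc = inc;
    }

    /** Returns the next link, or null once every listing is exhausted. */
    String next()
    {
        // Still inside the current listing: advance the counter.
        if (base != null && crt + inc <= end)
        {
            crt += inc;
            return base + crt;
        }
        // Otherwise open the next queued listing, skipping empty ranges.
        while (!queue.isEmpty())
        {
            base = queue.poll();
            crt = start;
            if (crt <= end)
                return base + crt;
        }
        return null;
    }

    /** True when the counter sits at start, as it does right after a listing is opened. */
    boolean isNewPage()
    {
        return crt == start;
    }

    public static void main(String[] args)
    {
        PagedLinkWalker walker = new PagedLinkWalker(
            Arrays.asList("http://example.com/a?page=", "http://example.com/b?page="), 1, 3, 1);
        for (String link = walker.next(); link != null; link = walker.next())
            System.out.println((walker.isNewPage() ? "[new] " : "      ") + link);
    }
}
```

Running main walks both example listings in order and marks the first link of each one. Unlike the original, the sketch checks the range before advancing the counter, so it never hands out a link past end.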
