File: TestCollectedLinkedPages.java

Project: a data collection system that crawls data for a vertical search engine
Language: Java
/*
 * *****************************************************
 * Copyright (c) 2005 IIM Lab. All  Rights Reserved.
 * Created by xuehao at 2005-10-12
 * Contact: zxuehao@mail.ustc.edu.cn
 * *****************************************************
 */

package org.indigo.tests.pages;

import java.util.ArrayList;

import org.indigo.pages.CollectedLinkedPages;

import junit.framework.TestCase;

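/**
 * Smoke test for {@link CollectedLinkedPages}: registers three listing URLs
 * and prints the URL returned by getCollectedUrl("1") for each of them.
 * It makes no assertions, so it only fails if an exception is thrown.
 */
public class TestCollectedLinkedPages extends TestCase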
{
    public void testCollectedLinkedPages()
    {
        // Listing pages to crawl. The zl and dl values are GBK percent-encoded
        // category names (fruit, vegetables, aquatic products).
        ArrayList<String> urls = new ArrayList<String>();
        urls.add("http://www.ahnw.gov.cn/scxx/schq/index.asp?datetime=&page=1&zl=80%CB%AE%B9%FB&diqu=&chanpin=&dl=01%C5%A9%B8%B1&NewDay=0");
        urls.add("http://www.ahnw.gov.cn/scxx/schq/index.asp?datetime=&page=1&zl=30%CA%DF%B2%CB&diqu=&chanpin=&dl=01%C5%A9%B8%B1&NewDay=0");
        urls.add("http://www.ahnw.gov.cn/scxx/schq/index.asp?datetime=&page=1&zl=70%CB%AE%B2%FA%C6%B7&diqu=&chanpin=&dl=01%C5%A9%B8%B1&NewDay=0");

        // "page" is presumably the name of the paging query parameter in the URLs above.
        CollectedLinkedPages cPages = new CollectedLinkedPages("page");
        cPages.setLinkedUrls(urls);

        // Visit each registered URL in turn.
        for (int j = 0; j < urls.size(); j++)
        {
            String testurl = cPages.getNextUrl();
            cPages.setBeginUrl(testurl);
            // System.out.println("Collecting in: " + testurl);

            // Only the first page (id "1") is requested for each URL.
            for (int i = 0; i < 1; i++)
            {
                String id = String.valueOf(i + 1);
                String url = cPages.getCollectedUrl(id);
                System.out.println(url);
            }
        }

    }
}
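
The test above passes a page id to getCollectedUrl after setting a begin URL, but the source of CollectedLinkedPages is not shown here. One plausible reading is that the class rewrites the "page" query parameter of the begin URL. The following is a minimal sketch of that idea only; the class PageParamSketch and the method withParam are hypothetical names that are not part of org.indigo.pages, and the real implementation may work differently.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the org.indigo.pages package.
public class PageParamSketch {

    /**
     * Returns url with the value of the query parameter named param
     * replaced by value. If the parameter is absent, url is returned unchanged.
     */
    public static String withParam(String url, String param, String value) {
        // Group 1 captures "?param=" or "&param="; the rest of the match is
        // the old value, which runs up to the next '&' or '#'.
        Pattern p = Pattern.compile("([?&]" + Pattern.quote(param) + "=)[^&#]*");
        Matcher m = p.matcher(url);
        if (!m.find()) {
            return url;
        }
        // Keep everything up to and including "param=", then splice in the new value.
        return url.substring(0, m.end(1)) + value + url.substring(m.end());
    }

    public static void main(String[] args) {
        String begin = "http://www.ahnw.gov.cn/scxx/schq/index.asp?datetime=&page=1"
                + "&zl=80%CB%AE%B9%FB&diqu=&chanpin=&dl=01%C5%A9%B8%B1&NewDay=0";
        // Print the URLs for pages 1 to 3 of the same listing.
        for (int i = 1; i <= 3; i++) {
            System.out.println(withParam(begin, "page", String.valueOf(i)));
        }
    }
}
```

Splicing the string by match offsets leaves the already percent-encoded GBK values untouched, which matters here because decoding and re-encoding them as UTF-8 would change the query.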
