testcollectedpage2.java
From "A collection system for crawling data for vertical search engines" · Java code · 55 lines
/*
* *****************************************************
* Copyright (c) 2005 IIM Lab. All Rights Reserved.
* Created by xuehao at 2005-10-12
* Contact: zxuehao@mail.ustc.edu.cn
* *****************************************************
*/
package org.indigo.tests.pages;
import java.util.ArrayList;
import junit.framework.TestCase;
import org.indigo.pages.CollectedIdsPage;
import org.indigo.pages.CollectedPage;
import org.indigo.pages.VisitPage;
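public class TestCollectedPage2 extends TestCase
{
    /**
     * Walks the listing pages, extracts the ids found on each page, and
     * prints the collected URL built for every id. This is a smoke test:
     * it only prints its results and makes no assertions.
     */
    public void testCollectedPage2()
    {
        // Page that supplies the sequence of listing links to visit.
        VisitPage visitPage = new VisitPage( "page" );
        visitPage.setBeginUrl( "http://www.ahnw.gov.cn/scxx/schq/?datetime=&page=2&zl=&diqu=&chanpin=&dl=&NewDay=0" );
        visitPage.setParameters( 1, 3, 1 );

        // Page used to build the final URL for each extracted id.
        CollectedPage colPage = new CollectedPage( "page" );
        colPage.setBeginUrl( "http://www.ahnw.gov.cn/scxx/schq/?datetime=&page=1&zl=&diqu=&chanpin=&dl=&NewDay=0" );

        // Extracts ids from whichever listing link is currently set.
        CollectedIdsPage idsPage = new CollectedIdsPage();
        idsPage.setVisitPage( visitPage );

        String url = visitPage.getCurrentLink();
        while( url != null )
        {
            idsPage.setUrl( url );
            ArrayList ids = idsPage.getIds();
            for( int i = 0; i < ids.size(); i++ )
            {
                String id = (String) ids.get( i );
                // Separate variable so the loop's listing link is not overwritten.
                String collectedUrl = colPage.getCollectedUrl( id );
                System.out.println( collectedUrl );
            }
            // Advance to the next listing link; null ends the loop.
            url = visitPage.getNextVisitLink();
        }
        System.out.println( "TestCollectedPage2 over." );
    }
}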