
login.java

Project: a collection system that crawls data for a vertical search engine
Language: Java
package org.indigo.tests;

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;

import org.apache.commons.httpclient.Cookie;
import org.apache.commons.httpclient.Header;
import org.apache.commons.httpclient.HttpClient;
import org.apache.commons.httpclient.HttpException;
import org.apache.commons.httpclient.HttpState;
import org.apache.commons.httpclient.HttpStatus;
import org.apache.commons.httpclient.NameValuePair;
import org.apache.commons.httpclient.methods.PostMethod;
import org.indigo.parser.Parser;

public class Login {

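	/**
	 * Posts the search form to the listing page and prints the response body.
	 *
	 * @param args command-line arguments (unused)
	 */
	public static void main(String[] args) {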
		String url = "http://nc.mofcom.gov.cn/gxdj/schq/list.do";
		String sourceCode="";
		HttpClient httpClient = new HttpClient();
		httpClient.getParams().setContentCharset("gb2312");
		PostMethod postMethod = new PostMethod(url);
		// Fill in the value of each form field
//		Parser parser=new Parser();
//		parser.setUrl("http://www.scnjw.gov.cn/schq/schq.aspx");
//		parser.open();
//		String middle=parser.parseWith("<input type=\"hidden\" name=\"__VIEWSTATE\" value=\"", "\" />");
//		pager.addHiddenInputs("p_index", "");
//		pager.addHiddenInputs("eud_id", "");
//		pager.addHiddenInputs("get_p_date", "");
//		pager.addHiddenInputs("key_word", "");
//		pager.addHiddenInputs("par_index", "");
//		pager.addHiddenInputs("craft_index", "");
//		pager.addHiddenInputs("p", "");
//		document.write(pager.toString());
		
		NameValuePair[] data = {
				new NameValuePair("p_index", ""),
				new NameValuePair("eud_id", ""),
				new NameValuePair("get_p_date", ""),
				new NameValuePair("key_word", ""),
				new NameValuePair("par_index", ""),
				new NameValuePair("craft_index", ""),
				new NameValuePair("p", ""),
				new NameValuePair("requestPage", "50") };
		// Put the form values into postMethod
		postMethod.setRequestBody(data);
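		int statusCode = 0;
		// Execute postMethod
		try {
			statusCode = httpClient.executeMethod(postMethod);
			InputStream is = postMethod.getResponseBodyAsStream();
			// The stream is null when no response body is available
			if (is != null) {
				// Decode explicitly instead of relying on the platform default;
				// this assumes the response is gb2312, matching the client setting above
				BufferedReader br = new BufferedReader(new InputStreamReader(is, "gb2312"));
				StringBuilder body = new StringBuilder();
				String line = br.readLine();
				while (line != null) {
					body.append(line).append("\n");
					line = br.readLine();
				}
				sourceCode = body.toString();
			}
			// sourceCode = postMethod.getResponseBodyAsString();
			System.out.println(sourceCode + " .");
		} catch (HttpException e) {
			// Protocol-level failure; report it rather than swallowing it silently
			e.printStackTrace();
		} catch (IOException e) {
			// Transport-level failure
			e.printStackTrace();
		}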
		// HttpClient does not automatically follow redirects (301 or 302) for
		// requests such as POST and PUT, which require follow-up handling
		if (statusCode == HttpStatus.SC_MOVED_PERMANENTLY || statusCode == HttpStatus.SC_MOVED_TEMPORARILY) {
			// Read the redirect target from the response headers
			Header locationHeader = postMethod.getResponseHeader("location");

			String location = null;
			if (locationHeader != null) {
				location = locationHeader.getValue();
				System.out.println("The page was redirected to:" + location);
			} else {
				System.err.println("Location header is missing.");
			}
		}
		if (statusCode == HttpStatus.SC_INTERNAL_SERVER_ERROR) {
			System.out.println("Server error!");
		}
		// Return the connection to the connection manager once the response has been handled
		postMethod.releaseConnection();
//       Cookie cookie[]= httpClient.getState().getCookies();
//       postMethod.releaseConnection();
//       HttpClient httpClient1 = new HttpClient();
//       HttpState state=new HttpState();
//       for(Cookie c:cookie)
//       {
//    	   state.addCookie(c);
//       }
//       httpClient1.setState(state);
//       httpClient1.getParams().setContentCharset("gb2312");
//       PostMethod post1=new PostMethod("http://www.iim.ac.cn/kaoqin3/new01.asp");
//       try {
//		httpClient1.executeMethod(post1);
//		System.out.println(post1.getResponseBodyAsString());
//	} catch (HttpException e) {
//		// TODO Auto-generated catch block
//		e.printStackTrace();
//	} catch (IOException e) {
//		// TODO Auto-generated catch block
//		e.printStackTrace();
//	}
      
		// System.out.println(sourceCode);
		// return sourceCode;
	}
}
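
The listing above only prints the `Location` header when the POST is answered with a 301 or 302. As a rough sketch of the next step, the snippet below shows one way to turn that header value into an absolute URL that a follow-up request could use. It relies only on `java.net.URI` from the standard library. The class name `RedirectSketch`, the helper names, and the sample paths `detail.do?id=1` and `/index.jsp` are hypothetical and do not come from the original project.

```java
import java.net.URI;

public class RedirectSketch {

    // True for the two status codes the original code checks (301 and 302).
    static boolean isRedirect(int statusCode) {
        return statusCode == 301 || statusCode == 302;
    }

    // Resolves a Location header value against the URL that was requested.
    // Servers may send a relative value, so it is resolved rather than used as-is.
    // URI.create throws IllegalArgumentException if either string is not a valid URI.
    static String resolveLocation(String requestUrl, String location) {
        return URI.create(requestUrl).resolve(location).toString();
    }

    public static void main(String[] args) {
        String requestUrl = "http://nc.mofcom.gov.cn/gxdj/schq/list.do";

        // Hypothetical Location values, used only for illustration
        System.out.println(resolveLocation(requestUrl, "detail.do?id=1"));
        System.out.println(resolveLocation(requestUrl, "/index.jsp"));
        System.out.println(isRedirect(302));
    }
}
```

The resolved URL could then be passed to a new request object. Whether the follow-up should be a GET or a repeated POST depends on the server and is not decided here.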

