http.java
From "A topic-focused web crawler that crawls web pages related to a given topic" · Java code · 46 lines
package com.parser;

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

// Fetches web page documents.
public class HTTP {
    // Returns the document at the given URL, or "" if it cannot be fetched.
    public String getBody(String url) {
        try {
            URL su = new URL(url);
            URLConnection uc = su.openConnection();
            // Set the connect and read timeouts (milliseconds).
            uc.setConnectTimeout(10000);
            uc.setReadTimeout(10000);
            StringBuilder sb = new StringBuilder();
            // Read through uc rather than su.openStream() so the timeouts above apply.
            // Assumption: pages are UTF-8; the original used the platform default charset.
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(uc.getInputStream(), StandardCharsets.UTF_8))) {
                String s;
                while ((s = reader.readLine()) != null) {
                    sb.append(s).append("\r\n");
                }
            }
            return sb.toString();
        } catch (Exception e) {
            // Not only timeouts end up here: malformed URLs and other I/O errors do too.
            System.out.println("Failed to fetch " + url + ": " + e);
            return "";
        }
    }
}
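
The reader in getBody has to decode bytes with some charset, and crawled pages do not all use the same one. As a minimal sketch, and not part of the original crawler, the charset could be taken from the response's Content-Type header. The class name CharsetSniffer, the method charsetFrom, and the UTF-8 fallback are all hypothetical choices made for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetSniffer {
    // Hypothetical helper: extracts the charset from a Content-Type value
    // such as "text/html; charset=ISO-8859-1". Falls back to UTF-8 (an
    // assumption) when the header is missing or names an unusable charset.
    public static Charset charsetFrom(String contentType) {
        if (contentType != null) {
            for (String part : contentType.split(";")) {
                String p = part.trim();
                // Case-insensitive check for a "charset=" parameter.
                if (p.regionMatches(true, 0, "charset=", 0, 8)) {
                    String name = p.substring(8).trim().replace("\"", "");
                    try {
                        return Charset.forName(name);
                    } catch (IllegalArgumentException e) {
                        // Illegal or unsupported charset name: use the fallback.
                    }
                }
            }
        }
        return StandardCharsets.UTF_8;
    }
}
```

Inside getBody, the result of CharsetSniffer.charsetFrom(uc.getContentType()) could be passed as the second argument to the InputStreamReader constructor. Pages that declare their encoding only in an HTML meta tag would still need separate handling.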