📄 parseitemcontenthandle.java
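package com.blogool.crawl;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

import org.flytinge.ContentHandle;

import com.blogool.crawl.lib.*;

/**
 * Scans page content for item blocks and adds each match to the given category.
 */
public class ParseItemContentHandle implements ContentHandle {

    // Matches one item block: a "sea_r_part4_left" div wrapping a linked 60x60 thumbnail.
    // Groups: 1 = link href, 2 = image src, 3 = image alt text (captured but not used below).
    private static final Pattern ITEM_PATTERN = Pattern.compile("<div\\s+class=\"sea_r_part4_left\">\\s*<a\\s*href=\"(.+?)\".+?><img src=\"(.+?)\" border=\"\\d*\"\\s*alt=\"(.+?)\"\\s*width=\"60\"\\s*height=\"60\"\\s*\\/><\\/a>\\s*<\\/div>");

    private final Cat cat;

    public ParseItemContentHandle(Cat cat) {
        this.cat = cat;
    }

    public void handle(String content) {
        Matcher m = ITEM_PATTERN.matcher(content);
        while (m.find()) {
            try {
                Item item = new Item();
                item.setUrl(m.group(1));
                item.setImageUrls(new String[] {m.group(2)});
                // The category may be shared across handler threads, so guard the list update.
                synchronized (this.cat) {
                    this.cat.getItems().add(item);
                }
            } catch (Exception e) {
                // A failure on one item is logged and does not stop the remaining matches.
                e.printStackTrace();
            }
        }
    }
}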