searchengineherald.com.script

From "Web Crawler for Vertical Search" · SCRIPT code · 17 lines total

FIND_NODE	... <div class="post"> 	[next][END_OF_DOCUMENT]
FIND_NODE	... <h2> 	[next][END_OF_DOCUMENT]
STORE_TEXT	title	-1	-1	[next][END_OF_DOCUMENT]
FIND_NODE	... <div class=~/entry.*/> 	[next][END_OF_DOCUMENT]
STORE_TEXT	content	-1	-1	[next][END_OF_DOCUMENT]
STORE_LINKS	-1	-1	[next][END_OF_DOCUMENT]
FIND_NODE	... <div class="support">	[next][GOTO_TASK	15]
SAVE_TEXT	$meta	-1	-1	[next][END_OF_DOCUMENT]
STORE_NODE	meta	$meta	[next][END_OF_DOCUMENT]
REGEXP	$meta	/(\S+ \d+\S+, \d+)\s+.*/	[next][GOTO_TASK	15]
STORE_ISODATE	iso-date	MMM dd'st,' yyyy	$1	[GOTO_TASK	15][next]
STORE_ISODATE	iso-date	MMM dd'nd,' yyyy	$1	[GOTO_TASK	15][next]
STORE_ISODATE	iso-date	MMM dd'rd,' yyyy	$1	[GOTO_TASK	15][next]
STORE_ISODATE	iso-date	MMM dd'th,' yyyy	$1	[next][END_OF_DOCUMENT]
END_OF_ARTICLE		[next][END_OF_DOCUMENT]
GOTO_TASK	1	[][]
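
The date handling in the REGEXP and STORE_ISODATE tasks above can be sketched in Python. This is only an illustration of the idea, not the crawler's own implementation: the function name meta_to_iso_date and the sample strings are hypothetical, the regular expression is assumed to be applied as an unanchored search, and month names are assumed to be English. Instead of trying one pattern per ordinal suffix (st, nd, rd, th), the sketch strips the suffix once and then parses the date.

```python
import re
from datetime import datetime

# Same pattern as the REGEXP task: a month word, a day with a suffix,
# a comma, a year, then at least one whitespace character.
META_RE = re.compile(r"(\S+ \d+\S+, \d+)\s+.*")

# Matches the ordinal suffix that the four STORE_ISODATE tasks handle
# one at a time ('st,', 'nd,', 'rd,', 'th,').
ORDINAL_RE = re.compile(r"(\d+)(st|nd|rd|th),")


def meta_to_iso_date(meta):
    """Return an ISO date string (YYYY-MM-DD) or None if nothing parses."""
    match = META_RE.search(meta)  # assumption: unanchored search
    if match is None:
        return None
    # "March 3rd, 2009" -> "March 3, 2009"
    cleaned = ORDINAL_RE.sub(r"\1,", match.group(1))
    # Try abbreviated month names first, then full names.
    for fmt in ("%b %d, %Y", "%B %d, %Y"):
        try:
            return datetime.strptime(cleaned, fmt).date().isoformat()
        except ValueError:
            continue
    return None


if __name__ == "__main__":
    # Hypothetical meta text, not taken from the target site.
    print(meta_to_iso_date("March 3rd, 2009 by admin"))
```

As in the script, a meta string that does not match the pattern, or whose captured text does not parse as a date, produces no date value rather than an error.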
