slashdot.org.script

From "Web Crawler for Vertical Search" · SCRIPT code · 19 lines in total

FIND_NODE	... <div class="article">	[next][END_OF_DOCUMENT]
FIND_NODE	... <div class="title">	[next][END_OF_DOCUMENT]
STORE_TEXT	title	-1	-1	[next][END_OF_DOCUMENT]
FIND_NODE	... <div class="details"> 	[next][END_OF_DOCUMENT]
SAVE_TEXT	$details	-1	-1	[next][END_OF_DOCUMENT]
REGEXP	$details	/Posted\s+by\s*(.+)\s*on\s+(.+?[PA]M)\s*(.+)/s	[next][END_OF_DOCUMENT]
STORE_NODE	author	$1	[next][END_OF_DOCUMENT]
STORE_ISODATE	iso-date	EEE MMM dd, '@'hh:mma	$2	[GOTO_TASK	10][next]
STORE_ISODATE	iso-date	EEE MMM dd, ''yy hh:mm a	$2	[next][END_OF_DOCUMENT]
FIND_NODE	... <div class="body">	[next][END_OF_DOCUMENT]
STORE_TEXT	content	-1	-1	[next][END_OF_DOCUMENT]
SAVE_POS	$links_start	[next][END_OF_DOCUMENT]
FIND_NODE	... </div class="storylinks">	[next][GOTO_TASK	16]
SAVE_POS	$links_end	[next][END_OF_DOCUMENT]
STORE_LINKS	$links_start	$links_end	[GOTO_TASK	17][END_OF_DOCUMENT]
STORE_LINKS	-1	-1	[next][END_OF_DOCUMENT]
END_OF_ARTICLE		[next][END_OF_DOCUMENT]
GOTO_TASK	1	[][]
