news.zdnet.com.script
From "Web crawler for vertical search" · SCRIPT code · 22 lines
FIND_NODE ... <div id="heading"> [next][END_OF_DOCUMENT]
FIND_NODE ... <h1> [next][END_OF_DOCUMENT]
STORE_TEXT title -1 -1 [next][END_OF_DOCUMENT]
FIND_NODE ... /h1 [next][END_OF_DOCUMENT]
SAVE_POS $meta_start [next][END_OF_DOCUMENT]
FIND_NODE ... <ul class="tools"> [next][END_OF_DOCUMENT]
SAVE_POS $meta_end [next][END_OF_DOCUMENT]
SAVE_TEXT $meta $meta_start $meta_end [next][END_OF_DOCUMENT]
STORE_NODE meta $meta [next][END_OF_DOCUMENT]
REGEXP $meta /.*:.(\S+ \d+, \d+, \d+:\d+ [AP]M PT).*/ [next][END_OF_DOCUMENT]
STORE_ISODATE iso-date MMM dd, yyyy, hh:mm a 'PT' $1 [next][END_OF_DOCUMENT]
FIND_NODE ... <p> [next][END_OF_DOCUMENT]
SAVE_POS $content_start [next][END_OF_DOCUMENT]
FIND_NODE ... h2 id="tbh" [next][END_OF_DOCUMENT]
SAVE_POS $content_end [next][END_OF_DOCUMENT]
STORE_TEXT content $content_start $content_end [next][END_OF_DOCUMENT]
STORE_LINKS $content_start $content_end [next][END_OF_DOCUMENT]
END_OF_ARTICLE [next][END_OF_DOCUMENT]
END_OF_DOCUMENT [next][END_OF_DOCUMENT]
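
The REGEXP and STORE_ISODATE steps above can be sketched in Python. This is only an illustration under assumptions, not the crawler's actual implementation: the function name `extract_iso_date`, the sample meta string, and the choice to return a timezone-naive ISO string are all hypothetical. The date pattern in the script looks like a Java-style format string, which is approximated here with `strptime` directives.

```python
import re
from datetime import datetime
from typing import Optional

# Same pattern as the REGEXP line of the script: capture the timestamp
# that follows the first colon which is followed by a valid date.
META_DATE_RE = re.compile(r".*:.(\S+ \d+, \d+, \d+:\d+ [AP]M PT).*")

# Assumed Python equivalents of "MMM dd, yyyy, hh:mm a 'PT'".
# Both abbreviated (%b) and full (%B) month names are tried.
DATE_FORMATS = ("%b %d, %Y, %I:%M %p PT", "%B %d, %Y, %I:%M %p PT")


def extract_iso_date(meta: str) -> Optional[str]:
    """Return the timestamp found in `meta` as an ISO 8601 string, or None.

    The result is timezone-naive: the literal "PT" is matched and dropped,
    not converted (an assumption made for this sketch).
    """
    match = META_DATE_RE.match(meta)
    if match is None:
        return None
    stamp = match.group(1)  # corresponds to $1 in the script
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(stamp, fmt).isoformat()
        except ValueError:
            continue
    return None


if __name__ == "__main__":
    # Hypothetical meta text; the real page layout may differ.
    print(extract_iso_date("Posted by Jane Doe: Mar 3, 2008, 4:15 PM PT"))
```

Returning None on a failed match loosely mirrors the script's failure branch, where a step that does not succeed jumps to END_OF_DOCUMENT instead of continuing to the next step.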