<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"><!--NewPage--><HTML><HEAD><!-- Generated by javadoc (build 1.5.0_07) on Sun May 06 18:00:00 GMT 2007 --><TITLE>RegexpCSSLinkExtractor (Heritrix 1.12.1)</TITLE><META NAME="keywords" CONTENT="org.archive.extractor.RegexpCSSLinkExtractor class"><LINK REL ="stylesheet" TYPE="text/css" HREF="../../../stylesheet.css" TITLE="Style"><SCRIPT type="text/javascript">function windowTitle(){ parent.document.title="RegexpCSSLinkExtractor (Heritrix 1.12.1)";}</SCRIPT><NOSCRIPT></NOSCRIPT></HEAD><BODY BGCOLOR="white" onload="windowTitle();"><!-- ========= START OF TOP NAVBAR ======= --><A NAME="navbar_top"><!-- --></A><A HREF="#skip-navbar_top" title="Skip navigation links"></A><TABLE BORDER="0" WIDTH="100%" CELLPADDING="1" CELLSPACING="0" SUMMARY=""><TR><TD COLSPAN=2 BGCOLOR="#EEEEFF" CLASS="NavBarCell1"><A NAME="navbar_top_firstrow"><!-- --></A><TABLE BORDER="0" CELLPADDING="0" CELLSPACING="3" SUMMARY=""> <TR ALIGN="center" VALIGN="top"> <TD BGCOLOR="#EEEEFF" CLASS="NavBarCell1"> <A HREF="../../../overview-summary.html"><FONT CLASS="NavBarFont1"><B>Overview</B></FONT></A> </TD> <TD BGCOLOR="#EEEEFF" CLASS="NavBarCell1"> <A HREF="package-summary.html"><FONT CLASS="NavBarFont1"><B>Package</B></FONT></A> </TD> <TD BGCOLOR="#FFFFFF" CLASS="NavBarCell1Rev"> <FONT CLASS="NavBarFont1Rev"><B>Class</B></FONT> </TD> <TD BGCOLOR="#EEEEFF" CLASS="NavBarCell1"> <A HREF="class-use/RegexpCSSLinkExtractor.html"><FONT CLASS="NavBarFont1"><B>Use</B></FONT></A> </TD> <TD BGCOLOR="#EEEEFF" CLASS="NavBarCell1"> <A HREF="package-tree.html"><FONT CLASS="NavBarFont1"><B>Tree</B></FONT></A> </TD> <TD BGCOLOR="#EEEEFF" CLASS="NavBarCell1"> <A HREF="../../../deprecated-list.html"><FONT CLASS="NavBarFont1"><B>Deprecated</B></FONT></A> </TD> <TD BGCOLOR="#EEEEFF" CLASS="NavBarCell1"> <A HREF="../../../index-all.html"><FONT CLASS="NavBarFont1"><B>Index</B></FONT></A> </TD> <TD BGCOLOR="#EEEEFF" 
CLASS="NavBarCell1"> <A HREF="../../../help-doc.html"><FONT CLASS="NavBarFont1"><B>Help</B></FONT></A> </TD> </TR></TABLE></TD><TD ALIGN="right" VALIGN="top" ROWSPAN=3><EM></EM></TD></TR><TR><TD BGCOLOR="white" CLASS="NavBarCell2"><FONT SIZE="-2"> <A HREF="../../../org/archive/extractor/LinkExtractor.html" title="interface in org.archive.extractor"><B>PREV CLASS</B></A> <A HREF="../../../org/archive/extractor/RegexpHTMLLinkExtractor.html" title="class in org.archive.extractor"><B>NEXT CLASS</B></A></FONT></TD><TD BGCOLOR="white" CLASS="NavBarCell2"><FONT SIZE="-2"> <A HREF="../../../index.html?org/archive/extractor/RegexpCSSLinkExtractor.html" target="_top"><B>FRAMES</B></A> <A HREF="RegexpCSSLinkExtractor.html" target="_top"><B>NO FRAMES</B></A> <SCRIPT type="text/javascript"> <!-- if(window==top) { document.writeln('<A HREF="../../../allclasses-noframe.html"><B>All Classes</B></A>'); } //--></SCRIPT><NOSCRIPT> <A HREF="../../../allclasses-noframe.html"><B>All Classes</B></A></NOSCRIPT></FONT></TD></TR><TR><TD VALIGN="top" CLASS="NavBarCell3"><FONT SIZE="-2"> SUMMARY: NESTED | <A HREF="#field_summary">FIELD</A> | <A HREF="#constructor_summary">CONSTR</A> | <A HREF="#method_summary">METHOD</A></FONT></TD><TD VALIGN="top" CLASS="NavBarCell3"><FONT SIZE="-2">DETAIL: <A HREF="#field_detail">FIELD</A> | <A HREF="#constructor_detail">CONSTR</A> | <A HREF="#method_detail">METHOD</A></FONT></TD></TR></TABLE><A NAME="skip-navbar_top"></A><!-- ========= END OF TOP NAVBAR ========= --><HR><!-- ======== START OF CLASS DATA ======== --><H2><FONT SIZE="-1">org.archive.extractor</FONT><BR>Class RegexpCSSLinkExtractor</H2><PRE>java.lang.Object <IMG SRC="../../../resources/inherit.gif" ALT="extended by "><A HREF="../../../org/archive/extractor/CharSequenceLinkExtractor.html" title="class in org.archive.extractor">org.archive.extractor.CharSequenceLinkExtractor</A> <IMG SRC="../../../resources/inherit.gif" ALT="extended by 
<B>org.archive.extractor.RegexpCSSLinkExtractor">
"><B>org.archive.extractor.RegexpCSSLinkExtractor</B></PRE><DL><DT><B>All Implemented Interfaces:</B> <DD>java.util.Iterator, <A HREF="../../../org/archive/extractor/LinkExtractor.html" title="interface in org.archive.extractor">LinkExtractor</A></DD></DL><HR><DL><DT><PRE>public class <B>RegexpCSSLinkExtractor</B><DT>extends <A HREF="../../../org/archive/extractor/CharSequenceLinkExtractor.html" title="class in org.archive.extractor">CharSequenceLinkExtractor</A></DL></PRE><P>This extractor parses URIs from CSS files. A CSS URL value consists of 'url(', optional white space, an optional single quote (') or double quote (") character, the URL itself, an optional single quote (') or double quote (") character, optional white space, and a closing ')'. Parentheses, commas, white space characters, single quotes (') and double quotes (") appearing in a URL must be escaped with a backslash, for example '\(', '\)', '\,'. Partial URLs are interpreted relative to the source of the style sheet, not relative to the document. <a href="http://www.w3.org/TR/REC-CSS1#url">Source: www.w3.org</a> ROUGH DRAFT IN PROGRESS / incomplete... untested... 
major changes likely<P><P><DL><DT><B>Author:</B></DT> <DD>igor gojomo</DD></DL><HR><P><!-- =========== FIELD SUMMARY =========== --><A NAME="field_summary"><!-- --></A><TABLE BORDER="1" WIDTH="100%" CELLPADDING="3" CELLSPACING="0" SUMMARY=""><TR BGCOLOR="#CCCCFF" CLASS="TableHeadingColor"><TH ALIGN="left" COLSPAN="2"><FONT SIZE="+2"><B>Field Summary</B></FONT></TH></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>(package private) static java.lang.String</CODE></FONT></TD><TD><CODE><B><A HREF="../../../org/archive/extractor/RegexpCSSLinkExtractor.html#CSS_BACKSLASH_ESCAPE">CSS_BACKSLASH_ESCAPE</A></B></CODE><BR> </TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>(package private) static java.lang.String</CODE></FONT></TD><TD><CODE><B><A HREF="../../../org/archive/extractor/RegexpCSSLinkExtractor.html#CSS_URI_EXTRACTOR">CSS_URI_EXTRACTOR</A></B></CODE><BR> CSS URL extractor pattern.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>protected java.util.regex.Matcher</CODE></FONT></TD><TD><CODE><B><A HREF="../../../org/archive/extractor/RegexpCSSLinkExtractor.html#uris">uris</A></B></CODE><BR> </TD></TR></TABLE> <A NAME="fields_inherited_from_class_org.archive.extractor.CharSequenceLinkExtractor"><!-- --></A><TABLE BORDER="1" WIDTH="100%" CELLPADDING="3" CELLSPACING="0" SUMMARY=""><TR BGCOLOR="#EEEEFF" CLASS="TableSubHeadingColor"><TH ALIGN="left"><B>Fields inherited from class org.archive.extractor.<A HREF="../../../org/archive/extractor/CharSequenceLinkExtractor.html" title="class in org.archive.extractor">CharSequenceLinkExtractor</A></B></TH></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD><CODE><A HREF="../../../org/archive/extractor/CharSequenceLinkExtractor.html#base">base</A>, <A 
HREF="../../../org/archive/extractor/CharSequenceLinkExtractor.html#extractErrorListener">extractErrorListener</A>, <A HREF="../../../org/archive/extractor/CharSequenceLinkExtractor.html#next">next</A>, <A HREF="../../../org/archive/extractor/CharSequenceLinkExtractor.html#source">source</A>, <A HREF="../../../org/archive/extractor/CharSequenceLinkExtractor.html#sourceContent">sourceContent</A></CODE></TD></TR></TABLE> <!-- ======== CONSTRUCTOR SUMMARY ======== --><A NAME="constructor_summary"><!-- --></A><TABLE BORDER="1" WIDTH="100%" CELLPADDING="3" CELLSPACING="0" SUMMARY=""><TR BGCOLOR="#CCCCFF" CLASS="TableHeadingColor"><TH ALIGN="left" COLSPAN="2"><FONT SIZE="+2"><B>Constructor Summary</B></FONT></TH></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD><CODE><B><A HREF="../../../org/archive/extractor/RegexpCSSLinkExtractor.html#RegexpCSSLinkExtractor()">RegexpCSSLinkExtractor</A></B>()</CODE><BR> </TD></TR></TABLE> <!-- ========== METHOD SUMMARY =========== --><A NAME="method_summary"><!-- --></A><TABLE BORDER="1" WIDTH="100%" CELLPADDING="3" CELLSPACING="0" SUMMARY=""><TR BGCOLOR="#CCCCFF" CLASS="TableHeadingColor"><TH ALIGN="left" COLSPAN="2"><FONT SIZE="+2"><B>Method Summary</B></FONT></TH></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>protected boolean</CODE></FONT></TD><TD><CODE><B><A HREF="../../../org/archive/extractor/RegexpCSSLinkExtractor.html#findNextLink()">findNextLink</A></B>()</CODE><BR> Scan to the next link(s), if any, loading it into the next buffer.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>protected static <A HREF="../../../org/archive/extractor/CharSequenceLinkExtractor.html" title="class in org.archive.extractor">CharSequenceLinkExtractor</A></CODE></FONT></TD><TD><CODE><B><A HREF="../../../org/archive/extractor/RegexpCSSLinkExtractor.html#newDefaultInstance()">newDefaultInstance</A></B>()</CODE><BR> 
</TD></TR>