extractorimplieduri.html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"><!--NewPage--><HTML><HEAD><!-- Generated by javadoc (build 1.5.0_07) on Sun May 06 17:59:59 GMT 2007 --><TITLE>ExtractorImpliedURI (Heritrix 1.12.1)</TITLE><META NAME="keywords" CONTENT="org.archive.crawler.extractor.ExtractorImpliedURI class"><LINK REL ="stylesheet" TYPE="text/css" HREF="../../../../stylesheet.css" TITLE="Style"><SCRIPT type="text/javascript">function windowTitle(){ parent.document.title="ExtractorImpliedURI (Heritrix 1.12.1)";}</SCRIPT><NOSCRIPT></NOSCRIPT></HEAD><BODY BGCOLOR="white" onload="windowTitle();"><!-- ========= START OF TOP NAVBAR ======= --><A NAME="navbar_top"><!-- --></A><A HREF="#skip-navbar_top" title="Skip navigation links"></A><TABLE BORDER="0" WIDTH="100%" CELLPADDING="1" CELLSPACING="0" SUMMARY=""><TR><TD COLSPAN=2 BGCOLOR="#EEEEFF" CLASS="NavBarCell1"><A NAME="navbar_top_firstrow"><!-- --></A><TABLE BORDER="0" CELLPADDING="0" CELLSPACING="3" SUMMARY=""> <TR ALIGN="center" VALIGN="top"> <TD BGCOLOR="#EEEEFF" CLASS="NavBarCell1"> <A HREF="../../../../overview-summary.html"><FONT CLASS="NavBarFont1"><B>Overview</B></FONT></A> </TD> <TD BGCOLOR="#EEEEFF" CLASS="NavBarCell1"> <A HREF="package-summary.html"><FONT CLASS="NavBarFont1"><B>Package</B></FONT></A> </TD> <TD BGCOLOR="#FFFFFF" CLASS="NavBarCell1Rev"> <FONT CLASS="NavBarFont1Rev"><B>Class</B></FONT> </TD> <TD BGCOLOR="#EEEEFF" CLASS="NavBarCell1"> <A HREF="class-use/ExtractorImpliedURI.html"><FONT CLASS="NavBarFont1"><B>Use</B></FONT></A> </TD> <TD BGCOLOR="#EEEEFF" CLASS="NavBarCell1"> <A HREF="package-tree.html"><FONT CLASS="NavBarFont1"><B>Tree</B></FONT></A> </TD> <TD BGCOLOR="#EEEEFF" CLASS="NavBarCell1"> <A HREF="../../../../deprecated-list.html"><FONT CLASS="NavBarFont1"><B>Deprecated</B></FONT></A> </TD> <TD BGCOLOR="#EEEEFF" CLASS="NavBarCell1"> <A HREF="../../../../index-all.html"><FONT CLASS="NavBarFont1"><B>Index</B></FONT></A> </TD> <TD BGCOLOR="#EEEEFF" 
CLASS="NavBarCell1"> <A HREF="../../../../help-doc.html"><FONT CLASS="NavBarFont1"><B>Help</B></FONT></A> </TD> </TR></TABLE></TD><TD ALIGN="right" VALIGN="top" ROWSPAN=3><EM></EM></TD></TR><TR><TD BGCOLOR="white" CLASS="NavBarCell2"><FONT SIZE="-2"> <A HREF="../../../../org/archive/crawler/extractor/ExtractorHTTP.html" title="class in org.archive.crawler.extractor"><B>PREV CLASS</B></A> <A HREF="../../../../org/archive/crawler/extractor/ExtractorJS.html" title="class in org.archive.crawler.extractor"><B>NEXT CLASS</B></A></FONT></TD><TD BGCOLOR="white" CLASS="NavBarCell2"><FONT SIZE="-2"> <A HREF="../../../../index.html?org/archive/crawler/extractor/ExtractorImpliedURI.html" target="_top"><B>FRAMES</B></A> <A HREF="ExtractorImpliedURI.html" target="_top"><B>NO FRAMES</B></A> <SCRIPT type="text/javascript"> <!-- if(window==top) { document.writeln('<A HREF="../../../../allclasses-noframe.html"><B>All Classes</B></A>'); } //--></SCRIPT><NOSCRIPT> <A HREF="../../../../allclasses-noframe.html"><B>All Classes</B></A></NOSCRIPT></FONT></TD></TR><TR><TD VALIGN="top" CLASS="NavBarCell3"><FONT SIZE="-2"> SUMMARY: <A HREF="#nested_classes_inherited_from_class_org.archive.crawler.settings.ComplexType">NESTED</A> | <A HREF="#field_summary">FIELD</A> | <A HREF="#constructor_summary">CONSTR</A> | <A HREF="#method_summary">METHOD</A></FONT></TD><TD VALIGN="top" CLASS="NavBarCell3"><FONT SIZE="-2">DETAIL: <A HREF="#field_detail">FIELD</A> | <A HREF="#constructor_detail">CONSTR</A> | <A HREF="#method_detail">METHOD</A></FONT></TD></TR></TABLE><A NAME="skip-navbar_top"></A><!-- ========= END OF TOP NAVBAR ========= --><HR><!-- ======== START OF CLASS DATA ======== --><H2><FONT SIZE="-1">org.archive.crawler.extractor</FONT><BR>Class ExtractorImpliedURI</H2><PRE>java.lang.Object <IMG SRC="../../../../resources/inherit.gif" ALT="extended by ">javax.management.Attribute <IMG SRC="../../../../resources/inherit.gif" ALT="extended by "><A 
HREF="../../../../org/archive/crawler/settings/Type.html" title="class in org.archive.crawler.settings">org.archive.crawler.settings.Type</A> <IMG SRC="../../../../resources/inherit.gif" ALT="extended by "><A HREF="../../../../org/archive/crawler/settings/ComplexType.html" title="class in org.archive.crawler.settings">org.archive.crawler.settings.ComplexType</A> <IMG SRC="../../../../resources/inherit.gif" ALT="extended by "><A HREF="../../../../org/archive/crawler/settings/ModuleType.html" title="class in org.archive.crawler.settings">org.archive.crawler.settings.ModuleType</A> <IMG SRC="../../../../resources/inherit.gif" ALT="extended by "><A HREF="../../../../org/archive/crawler/framework/Processor.html" title="class in org.archive.crawler.framework">org.archive.crawler.framework.Processor</A> <IMG SRC="../../../../resources/inherit.gif" ALT="extended by "><A HREF="../../../../org/archive/crawler/extractor/Extractor.html" title="class in org.archive.crawler.extractor">org.archive.crawler.extractor.Extractor</A> <IMG SRC="../../../../resources/inherit.gif" ALT="extended by "><B>org.archive.crawler.extractor.ExtractorImpliedURI</B></PRE><DL><DT><B>All Implemented Interfaces:</B> <DD>java.io.Serializable, javax.management.DynamicMBean, <A HREF="../../../../org/archive/crawler/datamodel/CoreAttributeConstants.html" title="interface in org.archive.crawler.datamodel">CoreAttributeConstants</A></DD></DL><HR><DL><DT><PRE>public class <B>ExtractorImpliedURI</B><DT>extends <A HREF="../../../../org/archive/crawler/extractor/Extractor.html" title="class in org.archive.crawler.extractor">Extractor</A><DT>implements <A HREF="../../../../org/archive/crawler/datamodel/CoreAttributeConstants.html" title="interface in org.archive.crawler.datamodel">CoreAttributeConstants</A></DL></PRE><P>An extractor for finding 'implied' URIs inside other URIs. If the 'trigger' regex is matched, a new URI will be constructed from the 'build' replacement pattern. 
Unlike most other extractors, this works on URIs discovered by previous extractors. Thus it should appear near the end of any set of extractors. Initially, only finds absolute HTTP(S) URIs in query-string or its parameters. TODO: extend to find URIs in path-info<P><P><DL><DT><B>Author:</B></DT> <DD>Gordon Mohr</DD><DT><B>See Also:</B><DD><A HREF="../../../../serialized-form.html#org.archive.crawler.extractor.ExtractorImpliedURI">Serialized Form</A></DL><HR><P><!-- ======== NESTED CLASS SUMMARY ======== --><A NAME="nested_class_summary"><!-- --></A><TABLE BORDER="1" WIDTH="100%" CELLPADDING="3" CELLSPACING="0" SUMMARY=""><TR BGCOLOR="#CCCCFF" CLASS="TableHeadingColor"><TH ALIGN="left" COLSPAN="2"><FONT SIZE="+2"><B>Nested Class Summary</B></FONT></TH></TR></TABLE> <A NAME="nested_classes_inherited_from_class_org.archive.crawler.settings.ComplexType"><!-- --></A><TABLE BORDER="1" WIDTH="100%" CELLPADDING="3" CELLSPACING="0" SUMMARY=""><TR BGCOLOR="#EEEEFF" CLASS="TableSubHeadingColor"><TH ALIGN="left"><B>Nested classes/interfaces inherited from class org.archive.crawler.settings.<A HREF="../../../../org/archive/crawler/settings/ComplexType.html" title="class in org.archive.crawler.settings">ComplexType</A></B></TH></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD><CODE><A HREF="../../../../org/archive/crawler/settings/ComplexType.MBeanAttributeInfoIterator.html" title="class in org.archive.crawler.settings">ComplexType.MBeanAttributeInfoIterator</A></CODE></TD></TR></TABLE> <!-- =========== FIELD SUMMARY =========== --><A NAME="field_summary"><!-- --></A><TABLE BORDER="1" WIDTH="100%" CELLPADDING="3" CELLSPACING="0" SUMMARY=""><TR BGCOLOR="#CCCCFF" CLASS="TableHeadingColor"><TH ALIGN="left" COLSPAN="2"><FONT SIZE="+2"><B>Field Summary</B></FONT></TH></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>static java.lang.String</CODE></FONT></TD><TD><CODE><B><A 
HREF="../../../../org/archive/crawler/extractor/ExtractorImpliedURI.html#ATTR_BUILD_PATTERN">ATTR_BUILD_PATTERN</A></B></CODE><BR> replacement pattern used to build 'implied' URI</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>static java.lang.String</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/extractor/ExtractorImpliedURI.html#ATTR_REMOVE_TRIGGER_URIS">ATTR_REMOVE_TRIGGER_URIS</A></B></CODE><BR> whether to remove URIs that trigger addition of 'implied' URI; default false</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>static java.lang.String</CODE></FONT></TD>