fetchhttp.html
From "Web Crawler Open-Source Code" · HTML code · 987 lines total · page 1 of 5
</TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE> void</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/fetcher/FetchHTTP.html#crawlResuming(java.lang.String)">crawlResuming</A></B>(java.lang.String statusMessage)</CODE><BR> Called when a CrawlController is resuming a crawl that had been paused.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE> void</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/fetcher/FetchHTTP.html#crawlStarted(java.lang.String)">crawlStarted</A></B>(java.lang.String message)</CODE><BR> Called on crawl start.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>protected void</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/fetcher/FetchHTTP.html#doAbort(org.archive.crawler.datamodel.CrawlURI, org.apache.commons.httpclient.HttpMethod, java.lang.String)">doAbort</A></B>(<A HREF="../../../../org/archive/crawler/datamodel/CrawlURI.html" title="class in org.archive.crawler.datamodel">CrawlURI</A> curi, org.apache.commons.httpclient.HttpMethod method, java.lang.String annotation)</CODE><BR> </TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE> void</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/fetcher/FetchHTTP.html#finalTasks()">finalTasks</A></B>()</CODE><BR> Classes subclassing this one should override this method to perform processor specific actions.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>protected java.lang.Object</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/fetcher/FetchHTTP.html#getAttributeEither(org.archive.crawler.datamodel.CrawlURI, java.lang.String)">getAttributeEither</A></B>(<A 
HREF="../../../../org/archive/crawler/datamodel/CrawlURI.html" title="class in org.archive.crawler.datamodel">CrawlURI</A> curi, java.lang.String key)</CODE><BR> Get a value either from inside the CrawlURI instance, or from settings (module attributes).</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>protected org.apache.commons.httpclient.auth.AuthScheme</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/fetcher/FetchHTTP.html#getAuthScheme(org.apache.commons.httpclient.HttpMethod, org.archive.crawler.datamodel.CrawlURI)">getAuthScheme</A></B>(org.apache.commons.httpclient.HttpMethod method, <A HREF="../../../../org/archive/crawler/datamodel/CrawlURI.html" title="class in org.archive.crawler.datamodel">CrawlURI</A> curi)</CODE><BR> </TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>protected org.apache.commons.httpclient.HttpClient</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/fetcher/FetchHTTP.html#getHttp()">getHttp</A></B>()</CODE><BR> </TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>protected <A HREF="../../../../org/archive/crawler/deciderules/DecideRule.html" title="class in org.archive.crawler.deciderules">DecideRule</A></CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/fetcher/FetchHTTP.html#getMidfetchRule(java.lang.Object)">getMidfetchRule</A></B>(java.lang.Object o)</CODE><BR> </TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>protected void</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/fetcher/FetchHTTP.html#handle401(org.apache.commons.httpclient.HttpMethod, org.archive.crawler.datamodel.CrawlURI)">handle401</A></B>(org.apache.commons.httpclient.HttpMethod method, <A 
HREF="../../../../org/archive/crawler/datamodel/CrawlURI.html" title="class in org.archive.crawler.datamodel">CrawlURI</A> curi)</CODE><BR> Server is looking for basic/digest auth credentials (RFC2617).</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE> void</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/fetcher/FetchHTTP.html#initialTasks()">initialTasks</A></B>()</CODE><BR> Classes subclassing this one should override this method to perform processor specific actions.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>protected void</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/fetcher/FetchHTTP.html#innerProcess(org.archive.crawler.datamodel.CrawlURI)">innerProcess</A></B>(<A HREF="../../../../org/archive/crawler/datamodel/CrawlURI.html" title="class in org.archive.crawler.datamodel">CrawlURI</A> curi)</CODE><BR> Classes subclassing this one should override this method to perform their custom actions on the CrawlURI.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>protected void</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/fetcher/FetchHTTP.html#listUsedFiles(java.util.List)">listUsedFiles</A></B>(java.util.List&lt;java.lang.String&gt; list)</CODE><BR> Those Modules that use files on disk should list them all when this method is called.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE> void</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/fetcher/FetchHTTP.html#loadCookies()">loadCookies</A></B>()</CODE><BR> Load cookies from the file specified in the order file.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE> 
void</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/fetcher/FetchHTTP.html#loadCookies(java.lang.String)">loadCookies</A></B>(java.lang.String cookiesFile)</CODE><BR> Load cookies from a file before the first fetch.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE> java.lang.String</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/fetcher/FetchHTTP.html#report()">report</A></B>()</CODE><BR> Compiles and returns a report (in human readable form) about the status of the processor.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE> void</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/fetcher/FetchHTTP.html#saveCookies()">saveCookies</A></B>()</CODE><BR> Saves cookies to the file specified in the order file.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE> void</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/fetcher/FetchHTTP.html#saveCookies(java.lang.String)">saveCookies</A></B>(java.lang.String saveCookiesFile)</CODE><BR> Saves cookies to a file.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>protected void</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/fetcher/FetchHTTP.html#setConditionalGetHeader(org.archive.crawler.datamodel.CrawlURI, org.apache.commons.httpclient.HttpMethod, java.lang.String, java.lang.String, java.lang.String)">setConditionalGetHeader</A></B>(<A HREF="../../../../org/archive/crawler/datamodel/CrawlURI.html" title="class in org.archive.crawler.datamodel">CrawlURI</A> curi, org.apache.commons.httpclient.HttpMethod method, java.lang.String setting, java.lang.String sourceHeader, java.lang.String targetHeader)</CODE><BR> Set the given conditional-GET header, if the 
setting is enabled and a suitable value is available in the URI history.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>protected void</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/fetcher/FetchHTTP.html#setSizes(org.archive.crawler.datamodel.CrawlURI, org.archive.util.HttpRecorder)">setSizes</A></B>(<A HREF="../../../../org/archive/crawler/datamodel/CrawlURI.html" title="class in org.archive.crawler.datamodel">CrawlURI</A> curi, <A HREF="../../../../org/archive/util/HttpRecorder.html" title="class in org.archive.util">HttpRecorder</A> rec)</CODE><BR> Update CrawlURI internal sizes based on current transaction (and in the case of 304s, history)</TD></TR></TABLE> <A NAME="methods_inherited_from_class_org.archive.crawler.framework.Processor"><!-- --></A><TABLE BORDER="1" WIDTH="100%" CELLPADDING="3" CELLSPACING="0" SUMMARY=""><TR BGCOLOR="#EEEEFF" CLASS="TableSubHeadingColor"><TH ALIGN="left"><B>Methods inherited from class org.archive.crawler.framework.<A HREF="../../../../org/archive/crawler/framework/Processor.html" title="class in org.archive.crawler.framework">Processor</A></B></TH></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD><CODE><A HREF="../../../../org/archive/crawler/framework/Processor.html#checkForInterrupt()">checkForInterrupt</A>, <A HREF="../../../../org/archive/crawler/framework/Processor.html#getController()">getController</A>, <A HREF="../../../../org/archive/crawler/framework/Processor.html#getDecideRule(java.lang.Object)">getDecideRule</A>, <A HREF="../../../../org/archive/crawler/framework/Processor.html#getDefaultNextProcessor(org.archive.crawler.datamodel.CrawlURI)">getDefaultNextProcessor</A>, <A HREF="../../../../org/archive/crawler/framework/Processor.html#innerRejectProcess(org.archive.crawler.datamodel.CrawlURI)">innerRejectProcess</A>, <A 
HREF="../../../../org/archive/crawler/framework/Processor.html#isContentToProcess(org.archive.crawler.datamodel.CrawlURI)">isContentToProcess</A>, <A HREF="../../../../org/archive/crawler/framework/Processor.html#isExpectedMimeType(java.lang.String, java.lang.String)">isExpectedMimeType</A>, <A HREF="../../../../org/archive/crawler/framework/Processor.html#isHttpTransactionContentToProcess(org.archive.crawler.datamodel.CrawlURI)">isHttpTransactionContentToProcess</A>, <A HREF="../../../../org/archive/crawler/framework/Processor.html#kickUpdate()">kickUpdate</A>, <A HREF="../../../../org/archive/crawler/framework/Processor.html#process(org.archive.crawler.datamodel.CrawlURI)">process</A>, <A HREF="../../../../org/archive/crawler/framework/Processor.html#rulesAccept(org.archive.crawler.deciderules.DecideRule, java.lang.Object)">rulesAccept</A>, <A HREF="../../../../org/archive/crawler/framework/Processor.html#rulesAccept(java.lang.Object)">rulesAccept</A>, <A HREF="../../../../org/archive/crawler/framework/Processor.html#setDefaultNextProcessor(org.archive.crawler.framework.Processor)">setDefaultNextProcessor</A>, <A HREF="../../../../org/archive/crawler/framework/Processor.html#spawn(int)">spawn</A></CODE></TD></TR></TABLE> <A NAME="methods_inherited_from_class_org.archive.crawler.settings.ModuleType"><!-- --></A><TABLE BORDER="1" WIDTH="100%" CELLPADDING="3" CELLSPACING="0" SUMMARY=""><TR BGCOLOR="#EEEEFF" CLASS="TableSubHeadingColor"><TH ALIGN="left"><B>Methods inherited from class org.archive.crawler.settings.<A HREF="../../../../org/archive/crawler/settings/ModuleType.html" title="class in org.archive.crawler.settings">ModuleType</A></B></TH></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD><CODE><A HREF="../../../../org/archive/crawler/settings/ModuleType.html#addElement(org.archive.crawler.settings.CrawlerSettings, org.archive.crawler.settings.Type)">addElement</A></CODE></TD></TR></TABLE> <A 
NAME="methods_inherited_from_class_org.archive.crawler.settings.ComplexType"><!-- --></A><TABLE BORDER="1" WIDTH="100%" CELLPADDING="3" CELLSPACING="0" SUMMARY=""><TR BGCOLOR="#EEEEFF" CLASS="TableSubHeadingColor"><TH ALIGN="left"><B>Methods inherited from class org.archive.crawler.settings.<A HREF="../../../../org/archive/crawler/settings/ComplexType.html" title="class in org.archive.crawler.settings">ComplexType</A></B></TH></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD><CODE><A HREF="../../../../org/archive/crawler/settings/ComplexType.html#addElementToDefinition(org.archive.crawler.settings.Type)">addElementToDefinition</A>, <A HREF="../../../../org/archive/crawler/settings/ComplexType.html#checkValue(org.archive.crawler.settings.CrawlerSettings, java.lang.String, java.lang.Object)">checkValue</A>, <A HREF="../../../../org/archive/crawler/settings/ComplexType.html#earlyInitialize(org.archive.crawler.settings.CrawlerSettings)">earlyInitialize</A>, <A HREF="../../../../org/archive/crawler/settings/ComplexType.html#getAbsoluteName()">getAbsoluteName</A>, <A HREF="../../../../org/archive/crawler/settings/ComplexType.html#getAttribute(java.lang.Object, java.lang.String)">getAttribute</A>, <A HREF="../../../../org/archive/crawler/settings/ComplexType.html#getAttribute(java.lang.String)">getAttribute</A>, <A HREF="../../../../org/archive/crawler/settings/ComplexType.html#getAttribute(java.lang.String, org.archive.crawler.datamodel.CrawlURI)">getAttribute</A>, <A HREF="../../../../org/archive/crawler/settings/ComplexType.html#getAttributeInfo(org.archive.crawler.settings.CrawlerSettings, java.lang.String)">getAttributeInfo</A>, <A HREF="../../../../org/archive/crawler/settings/ComplexType.html#getAttributeInfo(java.lang.String)">getAttributeInfo</A>, <A HREF="../../../../org/archive/crawler/settings/ComplexType.html#getAttributeInfoIterator(java.lang.Object)">getAttributeInfoIterator</A>, <A 
HREF="../../../../org/archive/crawler/settings/ComplexType.html#getAttributes(java.lang.String[])">getAttributes</A>, <A HREF="../../../../org/archive/crawler/settings/ComplexType.html#getDataContainerRecursive(org.archive.crawler.settings.ComplexType.Context)">getDataContainerRecursive</A>, <A HREF="../../../../org/archive/crawler/settings/ComplexType.html#getDataContainerRecursive(org.archive.crawler.settings.ComplexType.Context, java.lang.String)">getDataContainerRecursive</A>, <A HREF="../../../../org/archive/crawler/settings/ComplexType.html#getDefaultValue()">getDefaultValue</A>, <A HREF="../../../../org/archive/crawler/settings/ComplexType.html#getDescription()">getDescription</A>, <A HREF="../../../../org/archive/crawler/settings/ComplexType.html#getElementFromDefinition(java.lang.String)">getElementFromDefinition</A>, <A HREF="../../../../org/archive/crawler/settings/ComplexType.html#getLegalValues()">getLegalValues</A>, <A HREF="../../../../org/archive/crawler/settings/ComplexType.html#getLocalAttribute(org.archive.crawler.settings.CrawlerSettings, java.lang.String)">getLocalAttribute</A>, <A HREF="../../../../org/archive/crawler/settings/ComplexType.html#getMBeanInfo()">getMBeanInfo</A>, <A HREF="../../../../org/archive/crawler/settings/ComplexType.html#getMBeanInfo(java.lang.Object)">getMBeanInfo</A>, <A HREF="../../../../org/archive/crawler/settings/ComplexType.html#getParent()">getParent</A>, <A HREF="../../../../org/archive/crawler/settings/ComplexType.html#getPreservedFields()">getPreservedFields</A>, <A HREF="../../../../org/archive/crawler/settings/ComplexType.html#getSettingsHandler()">getSettingsHandler</A>, <A HREF="../../../../org/archive/crawler/settings/ComplexType.html#getUncheckedAttribute(java.lang.Object, java.lang.String)">getUncheckedAttribute</A>, <A HREF="../../../../org/archive/crawler/settings/ComplexType.html#getValue()">getValue</A>, <A 
HREF="../../../../org/archive/crawler/settings/ComplexType.html#globalSettings()">globalSettings</A>, <A HREF="../../../../org/archive/crawler/settings/ComplexType.html#invoke(java.lang.String, java.lang.Object[], java.lang.String[])">invoke</A>, <A HREF="../../../../org/archive/crawler/settings/ComplexType.html#isInitialized()">isInitialized</A>, <A HREF="../../../../org/archive/crawler/settings/ComplexType.html#isOverridden(org.archive.crawler.settings.CrawlerSettings, java.lang.String)">isOverridden</A>, <A HREF="../../../../org/archive/crawler/settings/ComplexType.html#iterator(java.lang.Object)">iterator</A>, <A HREF="../../../../org/archive/crawler/settings/ComplexType.html#removeElementFromDefinition(java.lang.String)">removeElementFromDefinition</A>, <A HREF="../../../../org/archive/crawler/settings/ComplexType.html#setAsOrder(org.archive.crawler.settings.SettingsHandler)">setAsOrder</A>, <A HREF="../../../../org/archive/crawler/settings/ComplexType.html#setAttribute(javax.management.Attribute)">setAttribute</A>, <A HREF="../../../../org/archive/crawler/settings/ComplexType.html#setAttribute(org.archive.crawler.settings.CrawlerSettings, javax.management.Attribute)">setAttribute</A>, <A HREF="../../../../org/archive/crawler/settings/ComplexType.html#setAttributes(javax.management.AttributeList)">setAttributes</A>, <A HREF="../../../../org/archive/crawler/settings/ComplexType.html#setDescription(java.lang.String)">setDescription</A>, <A HREF="../../../../org/archive/crawler/settings/ComplexType.html#setPreservedFields(java.lang.String[])">setPreservedFields</A>, <A HREF="../../../../org/archive/crawler/settings/ComplexType.html#toString()">toString</A>, <A HREF="../../../../org/archive/crawler/settings/ComplexType.html#unsetAttribute(org.archive.crawler.settings.CrawlerSettings, java.lang.String)">unsetAttribute</A></CODE></TD></TR>