
📄 urlfilter.html

📁 A web-crawler-like tool written in Java
public static final java.lang.String <B>SELFURLFLAG1</B></PRE>
<DL>
<DL>
<DT><B>See Also:</B><DD><A HREF="../../../../constant-values.html#com.snoics.reptile.parse.UrlFilter.SELFURLFLAG1">Constant Field Values</A></DL>
</DL>
<HR>

<A NAME="JAVASCRIPTFLAG"><!-- --></A><H3>
JAVASCRIPTFLAG</H3>
<PRE>
public static final java.lang.String <B>JAVASCRIPTFLAG</B></PRE>
<DL>
<DL>
<DT><B>See Also:</B><DD><A HREF="../../../../constant-values.html#com.snoics.reptile.parse.UrlFilter.JAVASCRIPTFLAG">Constant Field Values</A></DL>
</DL>
<HR>

<A NAME="MAILTOFLAG"><!-- --></A><H3>
MAILTOFLAG</H3>
<PRE>
public static final java.lang.String <B>MAILTOFLAG</B></PRE>
<DL>
<DL>
<DT><B>See Also:</B><DD><A HREF="../../../../constant-values.html#com.snoics.reptile.parse.UrlFilter.MAILTOFLAG">Constant Field Values</A></DL>
</DL>
<HR>

<A NAME="SCRIPTFLAG_BEGIN"><!-- --></A><H3>
SCRIPTFLAG_BEGIN</H3>
<PRE>
public static final java.lang.String <B>SCRIPTFLAG_BEGIN</B></PRE>
<DL>
<DL>
<DT><B>See Also:</B><DD><A HREF="../../../../constant-values.html#com.snoics.reptile.parse.UrlFilter.SCRIPTFLAG_BEGIN">Constant Field Values</A></DL>
</DL>
<HR>

<A NAME="HTMLFLAG_BEGIN"><!-- --></A><H3>
HTMLFLAG_BEGIN</H3>
<PRE>
public static final java.lang.String <B>HTMLFLAG_BEGIN</B></PRE>
<DL>
<DL>
<DT><B>See Also:</B><DD><A HREF="../../../../constant-values.html#com.snoics.reptile.parse.UrlFilter.HTMLFLAG_BEGIN">Constant Field Values</A></DL>
</DL>
<HR>

<A NAME="HTMLUNDOFLAG_BEGIN"><!-- --></A><H3>
HTMLUNDOFLAG_BEGIN</H3>
<PRE>
public static final java.lang.String <B>HTMLUNDOFLAG_BEGIN</B></PRE>
<DL>
<DL>
<DT><B>See Also:</B><DD><A HREF="../../../../constant-values.html#com.snoics.reptile.parse.UrlFilter.HTMLUNDOFLAG_BEGIN">Constant Field Values</A></DL>
</DL>
<HR>

<A NAME="HTMLFLAG_END"><!-- --></A><H3>
HTMLFLAG_END</H3>
<PRE>
public static final java.lang.String <B>HTMLFLAG_END</B></PRE>
<DL>
<DL>
<DT><B>See Also:</B><DD><A HREF="../../../../constant-values.html#com.snoics.reptile.parse.UrlFilter.HTMLFLAG_END">Constant Field Values</A></DL>
</DL>
<HR>

<A NAME="HTMLUNDOFLAG_END"><!-- --></A><H3>
HTMLUNDOFLAG_END</H3>
<PRE>
public static final java.lang.String <B>HTMLUNDOFLAG_END</B></PRE>
<DL>
<DL>
<DT><B>See Also:</B><DD><A HREF="../../../../constant-values.html#com.snoics.reptile.parse.UrlFilter.HTMLUNDOFLAG_END">Constant Field Values</A></DL>
</DL>
<HR>

<A NAME="SCRIPTFLAG_END"><!-- --></A><H3>
SCRIPTFLAG_END</H3>
<PRE>
public static final java.lang.String <B>SCRIPTFLAG_END</B></PRE>
<DL>
<DL>
<DT><B>See Also:</B><DD><A HREF="../../../../constant-values.html#com.snoics.reptile.parse.UrlFilter.SCRIPTFLAG_END">Constant Field Values</A></DL>
</DL>

<!-- ========= CONSTRUCTOR DETAIL ======== -->

<A NAME="constructor_detail"><!-- --></A><TABLE BORDER="1" WIDTH="100%" CELLPADDING="3" CELLSPACING="0" SUMMARY="">
<TR BGCOLOR="#CCCCFF" CLASS="TableHeadingColor">
<TD COLSPAN=1><FONT SIZE="+2">
<B>Constructor Detail</B></FONT></TD>
</TR>
</TABLE>

<A NAME="UrlFilter()"><!-- --></A><H3>
UrlFilter</H3>
<PRE>
public <B>UrlFilter</B>()</PRE>
<DL>
</DL>

<!-- ============ METHOD DETAIL ========== -->

<A NAME="method_detail"><!-- --></A><TABLE BORDER="1" WIDTH="100%" CELLPADDING="3" CELLSPACING="0" SUMMARY="">
<TR BGCOLOR="#CCCCFF" CLASS="TableHeadingColor">
<TD COLSPAN=1><FONT SIZE="+2">
<B>Method Detail</B></FONT></TD>
</TR>
</TABLE>

<A NAME="filter(java.util.ArrayList, java.util.ArrayList, java.util.ArrayList, java.util.ArrayList, java.util.ArrayList, java.util.ArrayList, java.lang.String, java.lang.String)"><!-- --></A><H3>
filter</H3>
<PRE>
public java.lang.String[] <B>filter</B>(java.util.ArrayList&nbsp;websiteurllist,
                                 java.util.ArrayList&nbsp;fileList,
                                 java.util.ArrayList&nbsp;urlList,
                                 java.util.ArrayList&nbsp;downloadfileList,
                                 java.util.ArrayList&nbsp;downloadfiletype,
                                 java.util.ArrayList&nbsp;relinkremotefiletypelist,
                                 java.lang.String&nbsp;parseflag,
                                 java.lang.String&nbsp;hrefstring)</PRE>
<DL>
<DD>Filters out invalid URLs.
<P>
<DD><DL>
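<DT><B>Parameters:</B><DD><CODE>websiteurllist</CODE> - the paths to be crawled<DD><CODE>fileList</CODE> - holds URLs that need to be rewritten to local paths<DD><CODE>urlList</CODE> - holds URLs that need to be rewritten to server paths<DD><CODE>downloadfileList</CODE> - holds URLs of files that need to be downloaded locally<DD><CODE>downloadfiletype</CODE> - the file types that need to be downloaded locally<DD><CODE>relinkremotefiletypelist</CODE> - the file types whose paths are saved as absolute paths on the server<DD><CODE>parseflag</CODE> - the type used when parsing the URL<DD><CODE>hrefstring</CODE> - the URL<DT><B>Returns:</B><DD>String[]</DL>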
</DD>
</DL>
<HR>

<A NAME="indexOfFlag(java.util.ArrayList, java.lang.String)"><!-- --></A><H3>
indexOfFlag</H3>
<PRE>
public java.lang.String <B>indexOfFlag</B>(java.util.ArrayList&nbsp;urlflaglist,
                                    java.lang.String&nbsp;string)</PRE>
<DL>
<DD>Determines whether the given string is a URL-type string.
<P>
<DD><DL>
<DT><B>Parameters:</B><DD><CODE>urlflaglist</CODE> - the list of flags that mark a string as a URL<DD><CODE>string</CODE> - the string<DT><B>Returns:</B><DD>String</DL>
</DD>
</DL>
<HR>

<A NAME="isParse(java.util.ArrayList, java.lang.String)"><!-- --></A><H3>
isParse</H3>
<PRE>
public boolean <B>isParse</B>(java.util.ArrayList&nbsp;websiteurllist,
                       java.lang.String&nbsp;url)</PRE>
<DL>
<DD>Determines whether the current URL falls within the range that needs to be parsed.
<P>
<DD><DL>
<DT><B>Parameters:</B><DD><CODE>websiteurllist</CODE> - the range of URLs to be parsed<DD><CODE>url</CODE> - the current URL<DT><B>Returns:</B><DD>boolean</DL>
</DD>
</DL>
<HR>
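The page documents what isParse decides but not how the match is made. As a minimal sketch only, assuming that "within the range" means the URL starts with one of the configured entries, the check could look like the following. The class UrlScopeSketch, the method isInScope, and the example.com URLs are hypothetical and are not part of com.snoics.reptile.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; this is NOT the actual UrlFilter.isParse implementation.
class UrlScopeSketch {

    // Assumption: a URL is in scope when it starts with any configured prefix.
    static boolean isInScope(List<String> scopePrefixes, String url) {
        if (url == null) {
            return false;
        }
        for (String prefix : scopePrefixes) {
            if (url.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> scope = new ArrayList<>();
        scope.add("http://example.com/docs/");
        System.out.println(isInScope(scope, "http://example.com/docs/a.html"));
        System.out.println(isInScope(scope, "http://other.example.org/"));
    }
}
```

If the real method matches on host names or patterns instead of prefixes, only the body of the loop would change.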

<A NAME="isForbid(java.util.ArrayList, java.lang.String)"><!-- --></A><H3>
isForbid</H3>
<PRE>
public boolean <B>isForbid</B>(java.util.ArrayList&nbsp;forbidurllist,
                        java.lang.String&nbsp;url)</PRE>
<DL>
<DD>Determines whether the URL falls within the range that is not crawled.
<P>
<DD><DL>
<DT><B>Parameters:</B><DD><CODE>forbidurllist</CODE> - the list of URLs that should not be crawled<DD><CODE>url</CODE> - the URL<DT><B>Returns:</B><DD>boolean</DL>
</DD>
</DL>
<HR>

<A NAME="isUnrelinktype(java.util.ArrayList, java.lang.String)"><!-- --></A><H3>
isUnrelinktype</H3>
<PRE>
public boolean <B>isUnrelinktype</B>(java.util.ArrayList&nbsp;unrelinktype,
                              java.lang.String&nbsp;url)</PRE>
<DL>
<DD>Determines whether the URL is of a file type that is not crawled.
<P>
<DD><DL>
<DT><B>Parameters:</B><DD><CODE>unrelinktype</CODE> - the file types whose paths need to be changed to server paths<DD><CODE>url</CODE> - the URL to check<DT><B>Returns:</B><DD>boolean</DL>
</DD>
</DL>
<HR>

<A NAME="isDownloadfiletype(java.util.ArrayList, java.lang.String)"><!-- --></A><H3>
isDownloadfiletype</H3>
<PRE>
public boolean <B>isDownloadfiletype</B>(java.util.ArrayList&nbsp;downloadfiletype,
                                  java.lang.String&nbsp;url)</PRE>
<DL>
<DD>Determines whether the URL is of a file type that needs to be downloaded locally (takes precedence over isUnrelinktype()).
<P>
<DD><DL>
<DT><B>Parameters:</B><DD><CODE>downloadfiletype</CODE> - the file types that need to be downloaded locally<DD><CODE>url</CODE> - the URL to check<DT><B>Returns:</B><DD>boolean</DL>
</DD>
</DL>
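The note that isDownloadfiletype takes precedence over isUnrelinktype() implies an ordering when a file type appears in both lists. The sketch below illustrates one way that ordering might be applied. It assumes file types are plain extensions compared case-insensitively against the end of the URL, which this page does not confirm; the class FileTypePrioritySketch, the Action enum, and the helper names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch of the documented precedence; NOT the actual UrlFilter code.
class FileTypePrioritySketch {

    enum Action { DOWNLOAD_LOCAL, KEEP_SERVER_PATH, OTHER }

    // Assumption: a "file type" is an extension such as "pdf", matched
    // case-insensitively against the end of the URL.
    static boolean hasType(List<String> types, String url) {
        String lower = url.toLowerCase(Locale.ROOT);
        for (String type : types) {
            if (lower.endsWith("." + type.toLowerCase(Locale.ROOT))) {
                return true;
            }
        }
        return false;
    }

    // The download check runs first, so it wins when both lists contain the type.
    static Action classify(List<String> downloadTypes, List<String> unrelinkTypes, String url) {
        if (hasType(downloadTypes, url)) {
            return Action.DOWNLOAD_LOCAL;
        }
        if (hasType(unrelinkTypes, url)) {
            return Action.KEEP_SERVER_PATH;
        }
        return Action.OTHER;
    }

    public static void main(String[] args) {
        List<String> download = Arrays.asList("zip", "pdf");
        List<String> unrelink = Arrays.asList("pdf", "avi");
        System.out.println(classify(download, unrelink, "http://example.com/a.PDF"));
        System.out.println(classify(download, unrelink, "http://example.com/b.avi"));
        System.out.println(classify(download, unrelink, "http://example.com/c.html"));
    }
}
```

This simple suffix match ignores query strings and fragments, so a URL such as report.pdf?v=2 would fall through to OTHER; a fuller version would strip those parts before comparing.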
<!-- ========= END OF CLASS DATA ========= -->
<HR>

<!-- ======= START OF BOTTOM NAVBAR ====== -->
<A NAME="navbar_bottom"><!-- --></A><A HREF="#skip-navbar_bottom" title="Skip navigation links"></A><TABLE BORDER="0" WIDTH="100%" CELLPADDING="1" CELLSPACING="0" SUMMARY="">
<TR>
<TD COLSPAN=3 BGCOLOR="#EEEEFF" CLASS="NavBarCell1">
<A NAME="navbar_bottom_firstrow"><!-- --></A><TABLE BORDER="0" CELLPADDING="0" CELLSPACING="3" SUMMARY="">
  <TR ALIGN="center" VALIGN="top">
  <TD BGCOLOR="#EEEEFF" CLASS="NavBarCell1">    <A HREF="../../../../overview-summary.html"><FONT CLASS="NavBarFont1"><B>Overview</B></FONT></A>&nbsp;</TD>
  <TD BGCOLOR="#EEEEFF" CLASS="NavBarCell1">    <A HREF="package-summary.html"><FONT CLASS="NavBarFont1"><B>Package</B></FONT></A>&nbsp;</TD>
  <TD BGCOLOR="#FFFFFF" CLASS="NavBarCell1Rev"> &nbsp;<FONT CLASS="NavBarFont1Rev"><B>Class</B></FONT>&nbsp;</TD>
  <TD BGCOLOR="#EEEEFF" CLASS="NavBarCell1">    <A HREF="class-use/UrlFilter.html"><FONT CLASS="NavBarFont1"><B>Use</B></FONT></A>&nbsp;</TD>
  <TD BGCOLOR="#EEEEFF" CLASS="NavBarCell1">    <A HREF="package-tree.html"><FONT CLASS="NavBarFont1"><B>Tree</B></FONT></A>&nbsp;</TD>
  <TD BGCOLOR="#EEEEFF" CLASS="NavBarCell1">    <A HREF="../../../../deprecated-list.html"><FONT CLASS="NavBarFont1"><B>Deprecated</B></FONT></A>&nbsp;</TD>
  <TD BGCOLOR="#EEEEFF" CLASS="NavBarCell1">    <A HREF="../../../../index-files/index-1.html"><FONT CLASS="NavBarFont1"><B>Index</B></FONT></A>&nbsp;</TD>
  <TD BGCOLOR="#EEEEFF" CLASS="NavBarCell1">    <A HREF="../../../../help-doc.html"><FONT CLASS="NavBarFont1"><B>Help</B></FONT></A>&nbsp;</TD>
  </TR>
</TABLE>
</TD>
<TD ALIGN="right" VALIGN="top" ROWSPAN=3><EM>
</EM>
</TD>
</TR>

<TR>
<TD BGCOLOR="white" CLASS="NavBarCell2"><FONT SIZE="-2">
&nbsp;<A HREF="../../../../com/snoics/reptile/parse/ParseFileImpl.html" title="class in com.snoics.reptile.parse"><B>PREV CLASS</B></A>&nbsp;
&nbsp;NEXT CLASS</FONT></TD>
<TD BGCOLOR="white" CLASS="NavBarCell2"><FONT SIZE="-2">
  <A HREF="../../../../index.html" target="_top"><B>FRAMES</B></A>  &nbsp;
&nbsp;<A HREF="UrlFilter.html" target="_top"><B>NO FRAMES</B></A>  &nbsp;
&nbsp;<SCRIPT type="text/javascript">
  <!--
  if(window==top) {
    document.writeln('<A HREF="../../../../allclasses-noframe.html"><B>All Classes</B></A>');
  }
  //-->
</SCRIPT>
<NOSCRIPT>
  <A HREF="../../../../allclasses-noframe.html"><B>All Classes</B></A>
</NOSCRIPT>
</FONT></TD>
</TR>
<TR>
<TD VALIGN="top" CLASS="NavBarCell3"><FONT SIZE="-2">
  SUMMARY:&nbsp;NESTED&nbsp;|&nbsp;<A HREF="#field_summary">FIELD</A>&nbsp;|&nbsp;<A HREF="#constructor_summary">CONSTR</A>&nbsp;|&nbsp;<A HREF="#method_summary">METHOD</A></FONT></TD>
<TD VALIGN="top" CLASS="NavBarCell3"><FONT SIZE="-2">
DETAIL:&nbsp;<A HREF="#field_detail">FIELD</A>&nbsp;|&nbsp;<A HREF="#constructor_detail">CONSTR</A>&nbsp;|&nbsp;<A HREF="#method_detail">METHOD</A></FONT></TD>
</TR>
</TABLE>
<A NAME="skip-navbar_bottom"></A><!-- ======== END OF BOTTOM NAVBAR ======= -->

<HR>

</BODY>
</HTML>
