📄 transclusionfilter.html

📁 Written in Java; left over from an experiment. I was going to delete it, but I am uploading it instead so that everyone can share it.
<a name="91" href="#91">91</a>                  ATTR_MAX_SPECULATIVE_HOPS,
<a name="92" href="#92">92</a>                  <span class="string">"Maximum number of consecutive speculative (i.e. URIs"</span> +
<a name="93" href="#93">93</a>                  <span class="string">" extracted that we are not sure if they are embeds or"</span> +
<a name="94" href="#94">94</a>                  <span class="string">" not) hops to allow.\nA value of -1 means no upper limit."</span>,
<a name="95" href="#95">95</a>                  <strong>new</strong> Integer(DEFAULT_MAX_SPECULATIVE_HOPS)));
<a name="96" href="#96">96</a>          addElementToDefinition(
<a name="97" href="#97">97</a>              <strong>new</strong> <a href="../../../../org/archive/crawler/settings/SimpleType.html">SimpleType</a>(
<a name="98" href="#98">98</a>                  ATTR_MAX_REFERRAL_HOPS,
<a name="99" href="#99">99</a>                  <span class="string">"Maximum number of consecutive referral hops to allow.\n"</span> +
<a name="100" href="#100">100</a>                 <span class="string">"A value of -1 means no upper limit."</span>,
<a name="101" href="#101">101</a>                 <strong>new</strong> Integer(DEFAULT_MAX_REFERRAL_HOPS)));
<a name="102" href="#102">102</a>         addElementToDefinition(
<a name="103" href="#103">103</a>             <strong>new</strong> <a href="../../../../org/archive/crawler/settings/SimpleType.html">SimpleType</a>(
<a name="104" href="#104">104</a>                 ATTR_MAX_EMBED_HOPS,
<a name="105" href="#105">105</a>                 <span class="string">"Maximum number of consecutive embed hops to allow.\n"</span> +
<a name="106" href="#106">106</a>                 <span class="string">"A value of -1 means no upper limit."</span>,
<a name="107" href="#107">107</a>                 <strong>new</strong> Integer(DEFAULT_MAX_EMBED_HOPS)));
<a name="108" href="#108">108</a>     }
<a name="109" href="#109">109</a>
<a name="110" href="#110">110</a>     <em class="comment">/*<em class="comment"> (non-Javadoc)</em></em>
<a name="111" href="#111">111</a> <em class="comment">     * @see org.archive.crawler.framework.Filter#innerAccepts(java.lang.Object)</em>
<a name="112" href="#112">112</a> <em class="comment">     */</em>
<a name="113" href="#113">113</a>     <strong>protected</strong> <strong>boolean</strong> innerAccepts(Object o) {
<a name="114" href="#114">114</a>         <strong>if</strong>(! (o instanceof CandidateURI)) {
<a name="115" href="#115">115</a>             <strong>return</strong> false;
<a name="116" href="#116">116</a>         }
<a name="117" href="#117">117</a>         String path = ((CandidateURI)o).getPathFromSeed();
<a name="118" href="#118">118</a>         <strong>int</strong> transCount = 0;
<a name="119" href="#119">119</a>         <strong>int</strong> specCount = 0;
<a name="120" href="#120">120</a>         <strong>int</strong> refCount = 0;
<a name="121" href="#121">121</a>         <strong>int</strong> embedCount = 0;
<a name="122" href="#122">122</a>         loop: <strong>for</strong>(<strong>int</strong> i=path.length()-1;i>=0;i--) {
<a name="123" href="#123">123</a>             <em class="comment">// everything except 'L' is considered transitive</em>
<a name="124" href="#124">124</a>             <strong>switch</strong> (path.charAt(i)) {
<a name="125" href="#125">125</a>                 <strong>case</strong> Link.NAVLINK_HOP: {
<a name="126" href="#126">126</a>                     <strong>break</strong> loop;
<a name="127" href="#127">127</a>                 }
<a name="128" href="#128">128</a>                 <strong>case</strong> Link.PREREQ_HOP: {
<a name="129" href="#129">129</a>                     <strong>if</strong>(transCount==0) {
<a name="130" href="#130">130</a>                         <em class="comment">// always consider a trailing P as a 1-hop trans inclusion; disregard previous hops</em>
<a name="131" href="#131">131</a>                         transCount++;
<a name="132" href="#132">132</a>                         <strong>break</strong> loop;
<a name="133" href="#133">133</a>                     }
<a name="134" href="#134">134</a>                     <em class="comment">// otherwise, just count as another regular trans hop</em>
<a name="135" href="#135">135</a>                     <strong>break</strong>;
<a name="136" href="#136">136</a>                 }
<a name="137" href="#137">137</a>                 <strong>case</strong> Link.SPECULATIVE_HOP: {
<a name="138" href="#138">138</a>                     specCount++;
<a name="139" href="#139">139</a>                     <strong>break</strong>;
<a name="140" href="#140">140</a>                 }
<a name="141" href="#141">141</a>                 <strong>case</strong> Link.REFER_HOP: {
<a name="142" href="#142">142</a>                     refCount++;
<a name="143" href="#143">143</a>                     <strong>break</strong>;
<a name="144" href="#144">144</a>                 }
<a name="145" href="#145">145</a>                 <strong>case</strong> Link.EMBED_HOP: {
<a name="146" href="#146">146</a>                     embedCount++;
<a name="147" href="#147">147</a>                     <strong>break</strong>;
<a name="148" href="#148">148</a>                 }
<a name="149" href="#149">149</a>                 <em class="comment">// FIXME: what is 'D'?</em>
<a name="150" href="#150">150</a>                 <em class="comment">// 'D's get a free pass</em>
<a name="151" href="#151">151</a>             }
<a name="152" href="#152">152</a>             transCount++;
<a name="153" href="#153">153</a>         }
<a name="154" href="#154">154</a>
<a name="155" href="#155">155</a>         readMaxValues(o);
<a name="156" href="#156">156</a>
<a name="157" href="#157">157</a>         <em class="comment">// This is a case of possible transclusion</em>
<a name="158" href="#158">158</a>         <strong>return</strong> (transCount > 0)
<a name="159" href="#159">159</a>             <em class="comment">// ...and the overall number of hops isn't too high</em>
<a name="160" href="#160">160</a>             &amp;&amp; (transCount &lt;= <strong>this</strong>.maxTransHops)
<a name="161" href="#161">161</a>             <em class="comment">// ...and the number of spec-hops isn't too high</em>
<a name="162" href="#162">162</a>             &amp;&amp; (<strong>this</strong>.maxSpeculativeHops &lt; 0 ||  specCount &lt;= <strong>this</strong>.maxSpeculativeHops)
<a name="163" href="#163">163</a>             <em class="comment">// ...and the number of referral-hops isn't too high</em>
<a name="164" href="#164">164</a>             &amp;&amp; (<strong>this</strong>.maxReferralHops &lt; 0 || refCount &lt;= <strong>this</strong>.maxReferralHops)
<a name="165" href="#165">165</a>             <em class="comment">// ...and the number of embed-hops isn't too high</em>
<a name="166" href="#166">166</a>             &amp;&amp; (<strong>this</strong>.maxEmbedHops &lt; 0 || embedCount &lt;= <strong>this</strong>.maxEmbedHops);
<a name="167" href="#167">167</a>     }
<a name="168" href="#168">168</a>
<a name="169" href="#169">169</a>     <strong>public</strong> <strong>void</strong> readMaxValues(Object o) {
<a name="170" href="#170">170</a>         <strong>try</strong> {
<a name="171" href="#171">171</a>             <a href="../../../../org/archive/crawler/framework/CrawlScope.html">CrawlScope</a> scope =
<a name="172" href="#172">172</a>                 (<a href="../../../../org/archive/crawler/framework/CrawlScope.html">CrawlScope</a>) globalSettings().getModule(CrawlScope.ATTR_NAME);
<a name="173" href="#173">173</a>             <strong>this</strong>.maxTransHops = ((Integer) scope.getAttribute(o, ClassicScope.ATTR_MAX_TRANS_HOPS)).intValue();
<a name="174" href="#174">174</a>             <strong>this</strong>.maxSpeculativeHops = ((Integer) getAttribute(o, ATTR_MAX_SPECULATIVE_HOPS)).intValue();
<a name="175" href="#175">175</a>             <strong>this</strong>.maxReferralHops = ((Integer) getAttribute(o, ATTR_MAX_REFERRAL_HOPS)).intValue();
<a name="176" href="#176">176</a>             <strong>this</strong>.maxEmbedHops = ((Integer) getAttribute(o, ATTR_MAX_EMBED_HOPS)).intValue();
<a name="177" href="#177">177</a>         } <strong>catch</strong> (AttributeNotFoundException e) {
<a name="178" href="#178">178</a>             <em class="comment">// TODO Auto-generated catch block</em>
<a name="179" href="#179">179</a>             e.printStackTrace();
<a name="180" href="#180">180</a>         }
<a name="181" href="#181">181</a>     }
<a name="182" href="#182">182</a>
<a name="183" href="#183">183</a> }</pre>
<hr/><div id="footer">This page was automatically generated by <a href="http://maven.apache.org/">Maven</a></div></body></html>
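The backwards walk over the path-from-seed string in innerAccepts can be tried outside the crawler. The following is only a minimal sketch, not the original class: HopCountSketch and its accepts method are hypothetical names, and the four limits are passed in as plain parameters instead of being read from crawler settings. The listing's comments confirm 'L' for a navigation link and 'P' for a prerequisite; the letters 'X', 'R' and 'E' for speculative, referral and embed hops are assumptions about the Link constants.

```java
// Standalone sketch of the hop-counting logic; hypothetical class, not part of the crawler.
final class HopCountSketch {

    // Assumed hop letters. 'L' and 'P' appear in the listing's comments;
    // 'X', 'R' and 'E' are assumptions about the Link constants.
    static final char NAVLINK_HOP = 'L';
    static final char PREREQ_HOP = 'P';
    static final char SPECULATIVE_HOP = 'X';
    static final char REFER_HOP = 'R';
    static final char EMBED_HOP = 'E';

    /** Returns true when the trailing non-'L' hops of path stay within the given limits. */
    static boolean accepts(String path, int maxTransHops, int maxSpeculativeHops,
                           int maxReferralHops, int maxEmbedHops) {
        int transCount = 0;
        int specCount = 0;
        int refCount = 0;
        int embedCount = 0;
        // Walk backwards from the most recent hop until a navigation link is found.
        loop:
        for (int i = path.length() - 1; i >= 0; i--) {
            switch (path.charAt(i)) {
                case NAVLINK_HOP:
                    break loop;
                case PREREQ_HOP:
                    if (transCount == 0) {
                        // A trailing 'P' counts as a single hop; earlier hops are ignored.
                        transCount++;
                        break loop;
                    }
                    break;
                case SPECULATIVE_HOP:
                    specCount++;
                    break;
                case REFER_HOP:
                    refCount++;
                    break;
                case EMBED_HOP:
                    embedCount++;
                    break;
                default:
                    // Any other letter only adds to the overall count.
                    break;
            }
            transCount++;
        }
        // A negative per-type limit means "no upper limit", as in the setting descriptions.
        return transCount > 0
            && transCount <= maxTransHops
            && (maxSpeculativeHops < 0 || specCount <= maxSpeculativeHops)
            && (maxReferralHops < 0 || refCount <= maxReferralHops)
            && (maxEmbedHops < 0 || embedCount <= maxEmbedHops);
    }

    public static void main(String[] args) {
        System.out.println(accepts("LLE", 3, 1, -1, -1));  // one embed hop after a nav link
        System.out.println(accepts("LL", 3, 1, -1, -1));   // no trailing non-'L' hops
        System.out.println(accepts("LXX", 5, 1, -1, -1));  // two speculative hops, limit 1
    }
}
```

As in the listing, only the three per-type limits treat a negative value as unlimited; the overall limit (maxTransHops) is compared directly, so a negative value there rejects every candidate.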
