📄 fpmergeuriuniqfilter.html

📁 An open-source web crawler
<em>     * URI along to the 'receiver' (frontier) for queueing.</em>
<em>     *</em>
<em>     * @return number of pending items actually added</em>
<em>     */</em>
    <strong>public</strong> <strong>synchronized</strong> <strong>long</strong> flush() {
        <strong>if</strong>(pending()==0) {
            <strong>return</strong> 0;
        }
        <strong>long</strong> flushStartTime = System.currentTimeMillis();
        <strong>long</strong> adds = 0;
        <strong>long</strong> fpOnlyAdds = 0;
        Long currFp = <strong>null</strong>;
        PendingItem currPend = <strong>null</strong>;

        Iterator pendIter = pendingSet.iterator();
        LongIterator fpIter = beginFpMerge();

        currPend = (PendingItem) (pendIter.hasNext() ? pendIter.next() : <strong>null</strong>);
        currFp = (Long) (fpIter.hasNext() ? fpIter.next() : <strong>null</strong>);

        <strong>while</strong>(<strong>true</strong>) {
            <strong>while</strong>(currFp!=<strong>null</strong> &amp;&amp; (currPend==<strong>null</strong>||(currFp.longValue() &lt;= currPend.fp))) {
                addNewFp(currFp.longValue());
                <strong>if</strong>(currPend!=<strong>null</strong> &amp;&amp; currFp.longValue() == currPend.fp) {
                    mergeDuplicateCount++;
                }
                <strong>if</strong>(fpIter.hasNext()) {
                    currFp = (Long) fpIter.next();
                } <strong>else</strong> {
                    currFp = <strong>null</strong>;
                    <strong>break</strong>;
                }
            }
            <strong>while</strong>(currPend!=<strong>null</strong> &amp;&amp; (currFp==<strong>null</strong>||(currFp.longValue() > currPend.fp))) {
                addNewFp(currPend.fp);
                <strong>if</strong>(currPend.caUri!=<strong>null</strong>) {
                    adds++;
                    <strong>this</strong>.receiver.receive(currPend.caUri);
                } <strong>else</strong> {
                    fpOnlyAdds++;
                }
                <strong>if</strong>(pendIter.hasNext()) {
                    currPend = (PendingItem)pendIter.next();
                } <strong>else</strong> {
                    currPend = <strong>null</strong>;
                    <strong>break</strong>;
                }
            }
            <strong>if</strong>(currFp==<strong>null</strong>) {
                <em class="comment">// currPend must be null too, or while wouldn't have exited</em>
                <em class="comment">// done</em>
                <strong>break</strong>;
            }
        }
        <em class="comment">// maintain throttle timing</em>
        <strong>long</strong> flushDuration = System.currentTimeMillis() - flushStartTime;
        nextFlushAllowableAfter = flushStartTime + (FLUSH_DELAY_FACTOR*flushDuration);

        <em class="comment">// add/duplicate statistics</em>
        <strong>if</strong>(LOGGER.isLoggable(Level.INFO)) {
            <strong>long</strong> mergeDups = (mergeDuplicateCount-mergeDupAtLast);
            <strong>long</strong> pendDups = (pendDuplicateCount-pendDupAtLast);
            <strong>long</strong> quickDups = (quickDuplicateCount-quickDupAtLast);
            LOGGER.info(<span class="string">"flush took "</span>+flushDuration+<span class="string">"ms: "</span>
                    +adds+<span class="string">" adds, "</span>
                    +fpOnlyAdds+<span class="string">" fpOnlyAdds, "</span>
                    +mergeDups+<span class="string">" mergeDups, "</span>
                    +pendDups+<span class="string">" pendDups, "</span>
                    +quickDups+<span class="string">" quickDups "</span>);
            <strong>if</strong>(adds==0 &amp;&amp; fpOnlyAdds==0 &amp;&amp; mergeDups == 0 &amp;&amp; pendDups == 0 &amp;&amp; quickDups == 0) {
                LOGGER.info(<span class="string">"that's odd"</span>);
            }
        }
        mergeDupAtLast = mergeDuplicateCount;
        pendDupAtLast = pendDuplicateCount;
        quickDupAtLast = quickDuplicateCount;
        pendingSet.clear();
        finishFpMerge();
        <strong>return</strong> adds;
    }

    <em>/**</em>
<em>     * Begin merging pending candidates with complete list. Return an</em>
<em>     * Iterator which will return all previously-known FPs in turn.</em>
<em>     *</em>
<em>     * @return Iterator over all previously-known FPs</em>
<em>     */</em>
    <strong>abstract</strong> <strong>protected</strong> LongIterator beginFpMerge();


    <em>/**</em>
<em>     * Add an FP (which may be an old or new FP) to the new complete</em>
<em>     * list. Should only be called after beginFpMerge() and before</em>
<em>     * finishFpMerge().</em>
<em>     *</em>
<em>     * @param fp  the FP to add</em>
<em>     */</em>
    <strong>abstract</strong> <strong>protected</strong> <strong>void</strong> addNewFp(<strong>long</strong> fp);

    <em>/**</em>
<em>     * Complete the merge of candidate and previously-known FPs (closing</em>
<em>     * files/iterators as appropriate).</em>
<em>     */</em>
    <strong>abstract</strong> <strong>protected</strong> <strong>void</strong> finishFpMerge();

    <strong>public</strong> <strong>void</strong> close() {
        <strong>if</strong> (profileLog != <strong>null</strong>) {
            profileLog.close();
        }
    }

    <strong>public</strong> <strong>void</strong> setProfileLog(File logfile) {
        <strong>try</strong> {
            profileLog = <strong>new</strong> PrintWriter(<strong>new</strong> BufferedOutputStream(
                    <strong>new</strong> FileOutputStream(logfile)));
        } <strong>catch</strong> (FileNotFoundException e) {
            <strong>throw</strong> <strong>new</strong> RuntimeException(e);
        }
    }
}</pre><hr/><div id="footer">This page was automatically generated by <a href="http://maven.apache.org/">Maven</a></div></body></html>
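The flush() listing above is a two-way merge of two sorted sequences: the previously known fingerprints and the pending candidates. The following is a minimal, stand-alone sketch of that technique and is not Heritrix code. The class name SortedFpMerge, the merge method, and the plain long[] arrays standing in for LongIterator and pendingSet are all assumptions made for illustration. The listing counts a match in mergeDuplicateCount but leaves currPend in place; the sketch instead assumes the intent is to drop a pending fingerprint that is already known.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-alone sketch of a sorted fingerprint merge; not part of Heritrix. */
public class SortedFpMerge {

    /**
     * Merges two ascending, duplicate-free arrays of fingerprints.
     *
     * @param known   previously seen fingerprints, ascending
     * @param pending candidate fingerprints, ascending
     * @param novel   receives each pending fingerprint not found in known
     * @return the ascending union of both inputs
     */
    public static long[] merge(long[] known, long[] pending, List<Long> novel) {
        long[] merged = new long[known.length + pending.length];
        int k = 0, p = 0, m = 0;
        while (k < known.length || p < pending.length) {
            if (p == pending.length || (k < known.length && known[k] < pending[p])) {
                // Only known fingerprints remain, or the known one is smaller: carry it over.
                merged[m++] = known[k++];
            } else if (k < known.length && known[k] == pending[p]) {
                // Duplicate: keep one copy and drop the pending candidate.
                merged[m++] = known[k++];
                p++;
            } else {
                // Pending fingerprint is new: record it and hand it on.
                merged[m++] = pending[p];
                novel.add(pending[p++]);
            }
        }
        return Arrays.copyOf(merged, m);
    }

    public static void main(String[] args) {
        List<Long> novel = new ArrayList<>();
        long[] merged = merge(new long[] {1, 3, 5}, new long[] {2, 3, 6}, novel);
        System.out.println(Arrays.toString(merged) + " novel=" + novel);
    }
}
```

Because both inputs are sorted, each element is visited once, so the merge takes time linear in the combined length. In the listing, the novel list corresponds to the items passed to receiver.receive(), and the returned array corresponds to the sequence of addNewFp() calls.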
