/* Copyright (c) 2003 The Nutch Organization.  All rights reserved.   */
/* Use subject to the conditions in http://www.nutch.org/LICENSE.txt. */

package net.nutch.util;

import java.util.Collection;
import java.util.Iterator;

/**
 * A class for efficiently matching <code>String</code>s against a set
 * of prefixes.
 */
public class PrefixStringMatcher extends TrieStringMatcher {

  /**
   * Creates a new <code>PrefixStringMatcher</code> which will match
   * <code>String</code>s with any prefix in the supplied array.
   * Zero-length <code>Strings</code> are ignored.
   */
  public PrefixStringMatcher(String[] prefixes) {
    super();
    for (int i= 0; i < prefixes.length; i++)
      addPatternForward(prefixes[i]);
  }

  /**
   * Creates a new <code>PrefixStringMatcher</code> which will match
   * <code>String</code>s with any prefix in the supplied
   * <code>Collection</code>.
   *
   * @throws ClassCastException if any <code>Object</code>s in the
   * collection are not <code>String</code>s
   */
  public PrefixStringMatcher(Collection prefixes) {
    super();
    Iterator iter= prefixes.iterator();
    while (iter.hasNext())
      addPatternForward((String)iter.next());
  }

  /**
   * Returns true if the given <code>String</code> starts with at least
   * one of the prefixes stored in the trie.
   */
  public boolean matches(String input) {
    TrieNode node= root;
    for (int i= 0; i < input.length(); i++) {
      node= node.getChild(input.charAt(i));
      if (node == null) 
        return false;
      if (node.isTerminal())
        return true;
    }
    return false;
  }

  /**
   * Returns the shortest prefix of <code>input</code> that is matched,
   * or <code>null</code> if no match exists.
   */
  public String shortestMatch(String input) {
    TrieNode node= root;
    for (int i= 0; i < input.length(); i++) {
      node= node.getChild(input.charAt(i));
      if (node == null) 
        return null;
      if (node.isTerminal())
        return input.substring(0, i+1);
    }
    return null;
  }

  /**
   * Returns the longest prefix of <code>input</code> that is matched,
   * or <code>null</code> if no match exists.
   */
  public String longestMatch(String input) {
    TrieNode node= root;
    String result= null;
    for (int i= 0; i < input.length(); i++) {
      node= node.getChild(input.charAt(i));
      if (node == null) 
        break;
      if (node.isTerminal())
        result= input.substring(0, i+1);
    }
    return result;
  }

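  /**
   * Simple smoke test: prints the results of <code>matches</code>,
   * <code>shortestMatch</code> and <code>longestMatch</code> for a
   * fixed set of sample inputs.
   */
  public static final void main(String[] argv) {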
    PrefixStringMatcher matcher= 
      new PrefixStringMatcher( 
        new String[] 
        {"abcd", "abc", "aac", "baz", "foo", "foobar"} );

    String[] tests= {"a", "ab", "abc", "abcdefg", "apple", "aa", "aac",
                     "aaccca", "abaz", "baz", "bazooka", "fo", "foobar",
                     "kite"};

    for (int i= 0; i < tests.length; i++) {
      System.out.println("testing: " + tests[i]);
      System.out.println("   matches: " + matcher.matches(tests[i]));
      System.out.println("  shortest: " + matcher.shortestMatch(tests[i]));
      System.out.println("   longest: " + matcher.longestMatch(tests[i]));
    }
  }
}
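
// The class above relies on TrieStringMatcher, TrieNode and
// addPatternForward, none of which appear in this file. The sketch below is
// NOT the Nutch implementation. It is a minimal, self-contained stand-in,
// under the assumption that the trie keeps one child per character and marks
// the node where a stored prefix ends. PrefixTrieSketch, Node and add are
// hypothetical names introduced only for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the trie that TrieStringMatcher presumably maintains. */
class PrefixTrieSketch {

  /** One trie node: children keyed by character, plus an end-of-prefix flag. */
  private static final class Node {
    final Map<Character, Node> children = new HashMap<>();
    boolean terminal;
  }

  private final Node root = new Node();

  PrefixTrieSketch(String... prefixes) {
    for (String prefix : prefixes)
      add(prefix);
  }

  /** Inserts one prefix character by character; zero-length input is ignored. */
  void add(String prefix) {
    if (prefix.isEmpty())
      return;
    Node node = root;
    for (int i = 0; i < prefix.length(); i++)
      node = node.children.computeIfAbsent(prefix.charAt(i), c -> new Node());
    node.terminal = true;
  }

  /** Returns the shortest stored prefix that input starts with, or null. */
  String shortestMatch(String input) {
    Node node = root;
    for (int i = 0; i < input.length(); i++) {
      node = node.children.get(input.charAt(i));
      if (node == null)
        return null;                      // fell off the trie: no match
      if (node.terminal)
        return input.substring(0, i + 1); // first terminal node wins
    }
    return null;
  }

  /** Returns the longest stored prefix that input starts with, or null. */
  String longestMatch(String input) {
    Node node = root;
    String result = null;
    for (int i = 0; i < input.length(); i++) {
      node = node.children.get(input.charAt(i));
      if (node == null)
        break;                            // keep the best match seen so far
      if (node.terminal)
        result = input.substring(0, i + 1);
    }
    return result;
  }

  /** True if input starts with at least one stored prefix. */
  boolean matches(String input) {
    return shortestMatch(input) != null;
  }

  public static void main(String[] args) {
    PrefixTrieSketch trie =
      new PrefixTrieSketch("abcd", "abc", "aac", "baz", "foo", "foobar");
    for (String test : new String[] {"abcdefg", "foobar", "abaz", "kite"}) {
      System.out.println(test + " -> shortest: " + trie.shortestMatch(test)
                         + ", longest: " + trie.longestMatch(test));
    }
  }
}
```

// Each lookup walks at most input.length() nodes, so its cost depends on the
// length of the input rather than on the number of stored prefixes.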
