⭐ 欢迎来到虫虫下载站! | 📦 资源下载 📁 资源专辑 ℹ️ 关于我们
⭐ 虫虫下载站

📄 perl5util.java

📁 java正这表达式,简单.好用.
💻 JAVA
📖 第 1 页 / 共 3 页
字号:
package org.apache.oro.text.perl;/* ==================================================================== * The Apache Software License, Version 1.1 * * Copyright (c) 2000 The Apache Software Foundation.  All rights * reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright *    notice, this list of conditions and the following disclaimer. * * 2. Redistributions in binary form must reproduce the above copyright *    notice, this list of conditions and the following disclaimer in *    the documentation and/or other materials provided with the *    distribution. * * 3. The end-user documentation included with the redistribution, *    if any, must include the following acknowledgment: *       "This product includes software developed by the *        Apache Software Foundation (http://www.apache.org/)." *    Alternately, this acknowledgment may appear in the software itself, *    if and wherever such third-party acknowledgments normally appear. * * 4. The names "Apache" and "Apache Software Foundation", "Jakarta-Oro"  *    must not be used to endorse or promote products derived from this *    software without prior written permission. For written *    permission, please contact apache@apache.org. * * 5. Products derived from this software may not be called "Apache"  *    or "Jakarta-Oro", nor may "Apache" or "Jakarta-Oro" appear in their  *    name, without prior written permission of the Apache Software Foundation. * * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED.  IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * ==================================================================== * * This software consists of voluntary contributions made by many * individuals on behalf of the Apache Software Foundation.  For more * information on the Apache Software Foundation, please see * <http://www.apache.org/>. * * Portions of this software are based upon software originally written  * by Daniel F. Savarese. We appreciate his contributions. */import java.util.*;import org.apache.oro.text.regex.*;import org.apache.oro.text.*;import org.apache.oro.util.*;/** * This is a utility class implementing the 3 most common Perl5 operations * involving regular expressions: * <ul> * <li> [m]/pattern/[i][m][s][x], * <li> s/pattern/replacement/[g][i][m][o][s][x], * <li> and split(). * </ul> * As with Perl, any non-alphanumeric character can be used in lieu of * the slashes. *  <p> * The objective of the class is to minimize the amount of code a Java * programmer using OROMatcher<font size="-2"><sup>TM</sup></font> * has to write to achieve the same results as Perl by  * transparently handling regular expression compilation, caching, and * matching.  A second objective is to use the same Perl pattern matching * syntax to ease the task of Perl programmers transitioning to Java * (this also reduces the number of parameters to a method). * All the state affecting methods are synchronized to avoid * the maintenance of explicit locks in multithreaded programs.  This * philosophy differs from the * OROMatcher<font size="-2"><sup>TM</sup></font> package, where * you are expected to either maintain explicit locks, or more preferably * create separate compiler and matcher instances for each thread. * <p> * To use this class, first create an instance using the default constructor * or initialize the instance with a PatternCache of your choosing using * the alternate constructor.  The default cache used by Perl5Util is a * PatternCacheLRU of capacity GenericPatternCache.DEFAULT_CAPACITY.  You may * want to create a cache with a different capacity, a different * cache replacement policy, or even devise your own PatternCache * implementation.  The PatternCacheLRU is probably the best general purpose * pattern cache, but your specific application may be better served by * a different cache replacement policy.  You should remember that you can * front-load a cache with all the patterns you will be using before * initializing a Perl5Util instance, or you can just let Perl5Util * fill the cache as you use it. * <p> * You might use the class as follows: * <pre> * Perl5Util util = new Perl5Util(); * String line; * DataInputStream input; * PrintStream output; *  * // Initialization of input and output omitted * while((line = input.readLine()) != null) { *     // First find the line with the string we want to substitute because *     // it is cheaper than blindly substituting each line. *     if(util.match("/HREF=\"description1.html\"") { *        line = util.substitute("s/description1\\.html/about1.html/", line); *     } *    output.println(line); * } * </pre> * <p> * A couple of things to remember when using this class are that the * {@link #match match()} methods have the same meaning as * contains() in OROMatcher<font size="-2"><sup>TM</sup></font> * and <code>=~ m/pattern/</code> in Perl.  The methods are named match * to more closely associate them with Perl and to differentiate them * from matches() in OROMatcher<font size="-2"><sup>TM</sup></font>. * A further thing to keep in mind is that the * {@link MalformedPerl5PatternException} class is derived from * RuntimeException which means you DON'T have to catch it.  The reasoning * behind this is that you will detect your regular expression mistakes * as you write and debug your program when a MalformedPerl5PatternException * is thrown during a test run.  However, we STRONGLY recommend that you * ALWAYS catch MalformedPerl5PatternException whenever you deal with a * DYNAMICALLY created pattern.  Relying on a fatal * MalformedPerl5PatternException being thrown to detect errors while * debugging is only useful for dealing with static patterns, that is, actual * pregenerated strings present in your program.  Patterns created from user * input or some other dynamic method CANNOT be relied upon to be correct * and MUST be handled by catching MalformedPerl5PatternException for your * programs to be robust. * <p> * Finally, as a convenience Perl5Util implements  * the org.apache.oro.text.regex.MatchResult interface found in the * OROMatcher<font size="-2"><sup>TM</sup></font> package.  The methods * are merely wrappers which call the corresponding method of the last * MatchResult found (which can be accessed with * {@link #getMatch()} by a match or substitution * (or even a split, but this isn't particularly useful). @author <a href="mailto:dfs@savarese.org">Daniel F. Savarese</a> @version $Id: Perl5Util.java,v 1.3 2000/09/15 05:17:26 dfs Exp $ * @see MalformedPerl5PatternException * @see org.apache.oro.text.PatternCache * @see org.apache.oro.text.PatternCacheLRU * @see org.apache.oro.text.regex.MatchResult */public final class Perl5Util implements MatchResult {  /** The regular expression to use to parse match expression. */  private static final String __matchExpression = "m?(\\W)(.*)\\1([imsx]*)";  /** The pattern cache to compile and store patterns */  private PatternCache __patternCache;  /** The hashtable to cache higher-level expressions */  private Cache __expressionCache;  /** The pattern matcher to perform matching operations. */  private Perl5Matcher __matcher = new Perl5Matcher();  /** The compiled match expression parsing regular expression. */  private Pattern __matchPattern;  /** The last match from a successful call to a matching method. */  private MatchResult __lastMatch;  /**   * Keeps track of the original input (for postMatch() and preMatch())   * methods.  This will be discarded if the preMatch() and postMatch()   * methods are moved into the MatchResult interface.   */  private Object __originalInput;  /**   * Keeps track of the begin and end offsets of the original input for   * the postMatch() and preMatch() methods.   */  private int __inputBeginOffset, __inputEndOffset;  /** Used for default return value of post and pre Match() */  private static final String __nullString = "";  /**   * A constant passed to the {@link #split split()} methods indicating   * that all occurrences of a pattern should be used to split a string.    */  public static final int SPLIT_ALL = Util.SPLIT_ALL;  /**   * A secondary constructor for Perl5Util.  It initializes the Perl5Matcher   * used by the class to perform matching operations, but requires the   * programmer to provide a PatternCache instance for the class   * to use to compile and store regular expressions.  You would want to   * use this constructor if you want to change the capacity or policy    * of the cache used.  Example uses might be:   * <pre>   * // We know we're going to use close to 50 expressions a whole lot, so   * // we create a cache of the proper size.   * util = new Perl5Util(new PatternCacheLRU(50));   * </pre>   * or   * <pre>   * // We're only going to use a few expressions and know that second-chance   * // fifo is best suited to the order in which we are using the patterns.   * util = new Perl5Util(new PatternCacheFIFO2(10));   * </pre>   */  public Perl5Util(PatternCache cache) {    __patternCache = cache;    __expressionCache = new CacheLRU(cache.capacity());    __compilePatterns();  }  /**   * Default constructor for Perl5Util.  This initializes the Perl5Matcher   * used by the class to perform matching operations and creates a   * default PatternCacheLRU instance to use to compile and cache regular   * expressions.  The size of this cache is    * GenericPatternCache.DEFAULT_CAPACITY.   */  public Perl5Util() {    this(new PatternCacheLRU());  }  /**   * Compiles the patterns (currently only the match expression) used to   * parse Perl5 expressions.  Right now it initializes __matchPattern.   */  private void __compilePatterns() {    Perl5Compiler compiler = new Perl5Compiler();    try {      __matchPattern = 	compiler.compile(__matchExpression, Perl5Compiler.SINGLELINE_MASK);    } catch(MalformedPatternException e) {      // This should only happen during debugging.      //e.printStackTrace();      throw new RuntimeException(e.getMessage());    }  }  /**   * Parses a match expression and returns a compiled pattern.   * First checks the expression cache and if the pattern is not found,   * then parses the expression and fetches a compiled pattern from the   * pattern cache.  Otherwise, just uses the pattern found in the   * expression cache.  __matchPattern is used to parse the expression.   * <p>   * @param pattern  The Perl5 match expression to parse.   * @exception MalformedPerl5PatternException If there is an error parsing   *            the expression.   */  private Pattern __parseMatchExpression(String pattern)       throws MalformedPerl5PatternException   {    int index, compileOptions;    String options, regex;    MatchResult result;    Object obj;    Pattern ret;    obj = __expressionCache.getElement(pattern);    // Must catch ClassCastException because someone might incorrectly     // pass an s/// expression.  try block is cheaper than checking    // instanceof    try {      if(obj != null)	return (Pattern)obj;    } catch(ClassCastException e) {      // Fall through and parse expression    }    if(!__matcher.matches(pattern, __matchPattern))      throw new	MalformedPerl5PatternException("Invalid expression: " +				       pattern);    result = __matcher.getMatch();    regex = result.group(2);    compileOptions = Perl5Compiler.DEFAULT_MASK;    options = result.group(3);    if(options != null) {      index = options.length();      while(index-- > 0) {	switch(options.charAt(index)) {	case 'i' :	  compileOptions |= Perl5Compiler.CASE_INSENSITIVE_MASK;	  break;	case 'm' : compileOptions |= Perl5Compiler.MULTILINE_MASK; break;	case 's' : compileOptions |= Perl5Compiler.SINGLELINE_MASK; break;	case 'x' : compileOptions |= Perl5Compiler.EXTENDED_MASK; break;	default  :	  throw new	    MalformedPerl5PatternException("Invalid options: " + options);	}      }    }    ret = __patternCache.getPattern(regex, compileOptions);    __expressionCache.addElement(pattern, ret);    return ret;  }  /**   * Searches for the first pattern match somewhere in a character array   * taking a pattern specified in Perl5 native format:   * <blockquote><pre>   * [m]/pattern/[i][m][s][x]   * </pre></blockquote>   * The <code>m</code> prefix is optional and the meaning of the optional   * trailing options are:   * <dl compact>    * <dt> i <dd> case insensitive match   * <dt> m <dd> treat the input as consisting of multiple lines   * <dt> s <dd> treat the input as consisting of a single line   * <dt> x <dd> enable extended expression syntax incorporating whitespace   *             and comments   * </dl>   * As with Perl, any non-alphanumeric character can be used in lieu of   * the slashes.   * <p>   * If the input contains the pattern, the org.apache.oro.text.regex.MatchResult   * can be obtained by calling {@link #getMatch()}.   * However, Perl5Util implements the MatchResult interface as a wrapper   * around the last MatchResult found, so you can call its methods to   * access match information.   * <p>   * @param pattern  The pattern to search for.   * @param input    The char[] input to search.   * @return True if the input contains the pattern, false otherwise.   * @exception MalformedPerl5PatternException  If there is an error in   *            the pattern.  You are not forced to catch this exception   *            because it is derived from RuntimeException.   */  public synchronized boolean match(String pattern, char[] input)        throws MalformedPerl5PatternException  {    boolean result;    __parseMatchExpression(pattern);    result = __matcher.contains(input, __parseMatchExpression(pattern));			     if(result) {      __lastMatch        = __matcher.getMatch();      __originalInput    = input;      __inputBeginOffset = 0;      __inputEndOffset   = input.length;    }    return result;  }  /**   * Searches for the first pattern match in a String taking   * a pattern specified in Perl5 native format:   * <blockquote><pre>   * [m]/pattern/[i][m][s][x]   * </pre></blockquote>   * The <code>m</code> prefix is optional and the meaning of the optional   * trailing options are:   * <dl compact>    * <dt> i <dd> case insensitive match   * <dt> m <dd> treat the input as consisting of multiple lines   * <dt> s <dd> treat the input as consisting of a single line   * <dt> x <dd> enable extended expression syntax incorporating whitespace   *             and comments   * </dl>   * As with Perl, any non-alphanumeric character can be used in lieu of   * the slashes.   * <p>   * If the input contains the pattern, the org.apache.oro.text.regex.MatchResult   * can be obtained by calling {@link #getMatch()}.   * However, Perl5Util implements the MatchResult interface as a wrapper   * around the last MatchResult found, so you can call its methods to   * access match information.   * <p>   * @param pattern  The pattern to search for.   * @param input    The String input to search.   * @return True if the input contains the pattern, false otherwise.   * @exception MalformedPerl5PatternException  If there is an error in   *            the pattern.  You are not forced to catch this exception   *            because it is derived from RuntimeException.   */  public synchronized boolean match(String pattern, String input)       throws MalformedPerl5PatternException  {    return match(pattern, input.toCharArray());  }  /**   * Searches for the next pattern match somewhere in a   * org.apache.oro.text.regex.PatternMatcherInput instance, taking   * a pattern specified in Perl5 native format:   * <blockquote><pre>   * [m]/pattern/[i][m][s][x]   * </pre></blockquote>

⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -