<tr><th>&nbsp;</th></tr><tr align="left"><th colspan="2" id="reluc">Reluctant quantifiers</th></tr><tr><td valign="top" headers="construct reluc"><i>X</i><tt>??</tt></td>    <td headers="matches"><i>X</i>, once or not at all</td></tr><tr><td valign="top" headers="construct reluc"><i>X</i><tt>*?</tt></td>    <td headers="matches"><i>X</i>, zero or more times</td></tr><tr><td valign="top" headers="construct reluc"><i>X</i><tt>+?</tt></td>    <td headers="matches"><i>X</i>, one or more times</td></tr><tr><td valign="top" headers="construct reluc"><i>X</i><tt>{</tt><i>n</i><tt>}?</tt></td>    <td headers="matches"><i>X</i>, exactly <i>n</i> times</td></tr><tr><td valign="top" headers="construct reluc"><i>X</i><tt>{</tt><i>n</i><tt>,}?</tt></td>    <td headers="matches"><i>X</i>, at least <i>n</i> times</td></tr><tr><td valign="top" headers="construct reluc"><i>X</i><tt>{</tt><i>n</i><tt>,</tt><i>m</i><tt>}?</tt></td>    <td headers="matches"><i>X</i>, at least <i>n</i> but not more than <i>m</i> times</td></tr><tr><th>&nbsp;</th></tr><tr align="left"><th colspan="2" id="poss">Possessive quantifiers</th></tr><tr><td valign="top" headers="construct poss"><i>X</i><tt>?+</tt></td>    <td headers="matches"><i>X</i>, once or not at all</td></tr><tr><td valign="top" headers="construct poss"><i>X</i><tt>*+</tt></td>    <td headers="matches"><i>X</i>, zero or more times</td></tr><tr><td valign="top" headers="construct poss"><i>X</i><tt>++</tt></td>    <td headers="matches"><i>X</i>, one or more times</td></tr><tr><td valign="top" headers="construct poss"><i>X</i><tt>{</tt><i>n</i><tt>}+</tt></td>    <td headers="matches"><i>X</i>, exactly <i>n</i> times</td></tr><tr><td valign="top" headers="construct poss"><i>X</i><tt>{</tt><i>n</i><tt>,}+</tt></td>    <td headers="matches"><i>X</i>, at least <i>n</i> times</td></tr><tr><td valign="top" headers="construct poss"><i>X</i><tt>{</tt><i>n</i><tt>,</tt><i>m</i><tt>}+</tt></td>    <td headers="matches"><i>X</i>, at least <i>n</i> but not more than <i>m</i> times</td></tr><tr><th>&nbsp;</th></tr><tr align="left"><th colspan="2" id="logical">Logical operators</th></tr><tr><td valign="top" headers="construct 
logical"><i>XY</i></td>    <td headers="matches"><i>X</i> followed by <i>Y</i></td></tr><tr><td valign="top" headers="construct logical"><i>X</i><tt>|</tt><i>Y</i></td>    <td headers="matches">Either <i>X</i> or <i>Y</i></td></tr><tr><td valign="top" headers="construct logical"><tt>(</tt><i>X</i><tt>)</tt></td>    <td headers="matches"><i>X</i>, as a <a href="#cg">capturing group</a></td></tr><tr><th>&nbsp;</th></tr><tr align="left"><th colspan="2" id="backref">Back references</th></tr><tr><td valign="bottom" headers="construct backref"><tt>\</tt><i>n</i></td>    <td valign="bottom" headers="matches">Whatever the <i>n</i><sup>th</sup> <a href="#cg">capturing group</a> matched</td></tr><tr><th>&nbsp;</th></tr><tr align="left"><th colspan="2" id="quot">Quotation</th></tr><tr><td valign="top" headers="construct quot"><tt>\</tt></td>    <td headers="matches">Nothing, but quotes the following character</td></tr><tr><td valign="top" headers="construct quot"><tt>\Q</tt></td>    <td headers="matches">Nothing, but quotes all characters until <tt>\E</tt></td></tr><tr><td valign="top" headers="construct quot"><tt>\E</tt></td>    <td headers="matches">Nothing, but ends quoting started by <tt>\Q</tt></td></tr>     <!-- Metachars: !$()*+.<>?[\]^{|} --><tr><th>&nbsp;</th></tr><tr align="left"><th colspan="2" id="special">Special constructs (non-capturing)</th></tr><tr><td valign="top" headers="construct special"><tt>(?:</tt><i>X</i><tt>)</tt></td>    <td headers="matches"><i>X</i>, as a non-capturing group</td></tr><tr><td valign="top" headers="construct special"><tt>(?idmsux-idmsux)&nbsp;</tt></td>    <td headers="matches">Nothing, but turns match flags on - off</td></tr><tr><td valign="top" headers="construct special"><tt>(?idmsux-idmsux:</tt><i>X</i><tt>)</tt>&nbsp;&nbsp;</td>    <td headers="matches"><i>X</i>, as a <a href="#cg">non-capturing group</a> with the given flags on - off</td></tr><tr><td valign="top" headers="construct special"><tt>(?=</tt><i>X</i><tt>)</tt></td>    <td headers="matches"><i>X</i>, via zero-width positive lookahead</td></tr><tr><td valign="top" headers="construct special"><tt>(?!</tt><i>X</i><tt>)</tt></td>    <td headers="matches"><i>X</i>, via zero-width negative lookahead</td></tr><tr><td valign="top" 
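headers="construct special"><tt>(?&lt;=</tt><i>X</i><tt>)</tt></td>    <td headers="matches"><i>X</i>, via zero-width positive lookbehind</td></tr><tr><td valign="top" headers="construct special"><tt>(?&lt;!</tt><i>X</i><tt>)</tt></td>    <td headers="matches"><i>X</i>, via zero-width negative lookbehind</td></tr><tr><td valign="top" headers="construct special"><tt>(?&gt;</tt><i>X</i><tt>)</tt></td>    <td headers="matches"><i>X</i>, as an independent, non-capturing group</td></tr> </table> <hr><a name="bs"><h4> Backslashes, escapes, and quoting </h4><p> The backslash character (<tt>'\'</tt>) introduces escaped constructs, as defined in the table above, and also quotes characters that would otherwise be interpreted as unescaped constructs. Thus the expression <tt>\\</tt> matches a single backslash and <tt>\{</tt> matches a left brace.<p> It is an error to use a backslash before any alphabetic character that does not denote an escaped construct; these are reserved for future extensions to the regular-expression language. A backslash may be used before a non-alphabetic character regardless of whether that character is part of an unescaped construct.<p> Backslashes within string literals in Java source code are interpreted as required by the <a href="../../../../../../../../../java.sun.com/docs/books/jls/second_edition/html/default.htm">Java Language Specification</a> as either <a href="../../../../../../../../../java.sun.com/docs/books/jls/second_edition/html/lexical.doc.html#100850">Unicode escapes</a> or other <a href="../../../../../../../../../java.sun.com/docs/books/jls/second_edition/html/lexical.doc.html#101089">character escapes</a>. It is therefore necessary to double the backslashes in string literals that represent regular expressions, to protect them from interpretation by the Java bytecode compiler. The string literal <tt>&quot;\b&quot;</tt>, for example, matches a single backspace character when interpreted as a regular expression, while <tt>&quot;\\b&quot;</tt> matches a word boundary. The string literal <tt>"&#92;(hello&#92;)"</tt> is illegal and leads to a compile-time error; to match the string <tt>(hello)</tt>, the string literal <tt>"&#92;&#92;(hello&#92;&#92;)"</tt> must be used. <a name="cc"><h4> Character classes </h4><p> Character classes may appear within other character classes, and may be composed by the union operator (implicit) and the intersection operator (<tt>&amp;&amp;</tt>). The union operator denotes a class that contains every character that is in at least one of its operand classes. The intersection operator denotes a class that contains every character that is in both of its operand classes.<p> The precedence of character-class operators is as follows, from highest to lowest:    <blockquote><table border="0" cellpadding="1" cellspacing="0"                  summary="Precedence of character class operators.">      <tr><th>1&nbsp;&nbsp;&nbsp;&nbsp;</th>          <td>Literal escape&nbsp;&nbsp;&nbsp;&nbsp;</td>          <td><tt>\x</tt></td></tr>     <tr><th>2&nbsp;&nbsp;&nbsp;&nbsp;</th>          <td>Grouping</td>          <td><tt>[...]</tt></td></tr>     <tr><th>3&nbsp;&nbsp;&nbsp;&nbsp;</th>          <td>Range</td>          <td><tt>a-z</tt></td></tr>      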
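<tr><th>4&nbsp;&nbsp;&nbsp;&nbsp;</th>          <td>Union</td>          <td><tt>[a-e][i-u]</tt></td></tr>      <tr><th>5&nbsp;&nbsp;&nbsp;&nbsp;</th>          <td>Intersection</td>          <td><tt>[a-z&amp;&amp;[aeiou]]</tt></td></tr>    </table></blockquote><p> Note that a different set of metacharacters is in effect inside a character class than outside it. For instance, the regular expression <tt>.</tt> loses its special meaning inside a character class, while the expression <tt>-</tt> becomes a range-forming metacharacter. <a name="lt"><h4> Line terminators </h4><p> A <i>line terminator</i> is a one- or two-character sequence that marks the end of a line of the input character sequence. The following are recognized as line terminators: <ul>  <li> A newline (line feed) character (<tt>'\n'</tt>),  <li> A carriage-return character followed immediately by a newline character (<tt>"\r\n"</tt>),  <li> A standalone carriage-return character (<tt>'\r'</tt>),  <li> A next-line character (<tt>'\u0085'</tt>),  <li> A line-separator character (<tt>'\u2028'</tt>), or  <li> A paragraph-separator character (<tt>'\u2029'</tt>). </ul><p>If <A HREF="Pattern.html#UNIX_LINES"><CODE>UNIX_LINES</CODE></A> mode is activated, then the only line terminators recognized are newline characters.<p> The regular expression <tt>.</tt> matches any character except a line terminator unless the <A HREF="Pattern.html#DOTALL"><CODE>DOTALL</CODE></A> flag is specified.<p> By default, the regular expressions <tt>^</tt> and <tt>$</tt> ignore line terminators and match only at the beginning and the end, respectively, of the entire input sequence. If <A HREF="Pattern.html#MULTILINE"><CODE>MULTILINE</CODE></A> mode is activated, then <tt>^</tt> matches at the beginning of input and after any line terminator except at the end of input. When in <A HREF="Pattern.html#MULTILINE"><CODE>MULTILINE</CODE></A> mode, <tt>$</tt> matches just before a line terminator or at the end of the input sequence.<a name="cg"><h4> Groups and capturing </h4><p> Capturing groups are numbered by counting their opening parentheses from left to right. In the expression <tt>((A)(B(C)))</tt>, for example, there are four such groups: </p> <blockquote><table cellpadding=1 cellspacing=0 summary="Capturing group numberings"> <tr><th>1&nbsp;&nbsp;&nbsp;&nbsp;</th>     <td><tt>((A)(B(C)))</tt></td></tr> <tr><th>2&nbsp;&nbsp;&nbsp;&nbsp;</th><td><tt>(A)</tt></td></tr> <tr><th>3&nbsp;&nbsp;&nbsp;&nbsp;</th>     <td><tt>(B(C))</tt></td></tr> <tr><th>4&nbsp;&nbsp;&nbsp;&nbsp;</th>     <td><tt>(C)</tt></td></tr> </table></blockquote><p> Group zero always stands for the entire expression.<p> Capturing groups are so named because, during a match, each subsequence of the input sequence that matches such a group is saved. The captured subsequence may be used later in the expression, via a back reference, and may also be retrieved from the matcher once the match operation is complete.<p> The captured input associated with a group is always the subsequence that the group most recently matched. If a group is evaluated a second time because of quantification, then its previously captured value, if any, is retained if the second evaluation fails. Matching the string <tt>"aba"</tt> against the expression <tt>(a(b)?)+</tt>, for example, leaves group two set to <tt>"b"</tt>. All captured input is discarded at the beginning of each match.<p> Groups beginning with <tt>(?</tt> are pure, <i>non-capturing</i> groups that do not capture text and do not count towards the group total.<h4> Unicode support </h4><p> This class conforms to <a 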
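href="../../../../../../../../../www.unicode.org/reports/tr18/default.htm"><i>Unicode Technical Standard #18: Unicode Regular Expression Guidelines</i></a>, Level 1, plus RL2.1 Canonical Equivalents.<p> Unicode escape sequences such as <tt>\u2014</tt> in Java source code are processed as described in <a href="../../../../../../../../../java.sun.com/docs/books/jls/second_edition/html/lexical.doc.html#100850">section 3.3</a> of the Java Language Specification. Such escape sequences are also implemented directly by the regular-expression parser so that Unicode escapes can be used in expressions that are read from files or from the keyboard. Thus the strings <tt>"&#92;u2014"</tt> and <tt>"\\u2014"</tt>, while not equal, compile into the same pattern, which matches the character with hexadecimal value <tt>0x2014</tt>.<a name="ubc"> <p>Unicode blocks and categories are written with the <tt>\p</tt> and <tt>\P</tt> constructs as in Perl. <tt>\p{</tt><i>prop</i><tt>}</tt> matches if the input has the property <i>prop</i>, while <tt>\P{</tt><i>prop</i><tt>}</tt> does not match if the input has that property. Blocks are specified with the prefix <tt>In</tt>, as in <tt>InMongolian</tt>. Categories may be specified with the optional prefix <tt>Is</tt>: both <tt>\p{L}</tt> and <tt>\p{IsL}</tt> denote the category of Unicode letters. Blocks and categories can be used both inside and outside of a character class.<p> The supported categories are those of <a href="../../../../../../../../../www.unicode.org/unicode/standard/standard.html"><i>The Unicode Standard</i></a> in the version specified by the <A HREF="../../lang/Character.html" title="class in java.lang"><CODE>Character</CODE></A> class. The category names are those defined in the Standard, both normative and informative. The block names supported by <code>Pattern</code> are the valid block names accepted and defined by <A HREF="../../lang/Character.UnicodeBlock.html#forName(java.lang.String)"><CODE>UnicodeBlock.forName</CODE></A>.<a name="jcc"> <p>Categories that behave like the java.lang.Character boolean is<i>methodname</i> methods (except for the deprecated ones) are available through the same <tt>\p{</tt><i>prop</i><tt>}</tt> syntax, where the specified property has the name <tt>java<i>methodname</i></tt>.<h4> Comparison to Perl 5 </h4><p>The <code>Pattern</code> engine performs traditional NFA-based matching with ordered alternation, as occurs in Perl 5.<p> Perl constructs not supported by this class: </p> <ul><li><p> The conditional constructs <tt>(?(</tt><i>condition</i><tt>)</tt><i>X</i><tt>)</tt> and <tt>(?(</tt><i>condition</i><tt>)</tt><i>X</i><tt>|</tt><i>Y</i><tt>)</tt>    </p></li><li><p> The embedded code constructs <tt>(?{</tt><i>code</i><tt>})</tt> and <tt>(??{</tt><i>code</i><tt>})</tt></p></li><li><p> The embedded comment syntax <tt>(?#comment)</tt> </p></li><li><p> The preprocessing operations <tt>\l</tt>, <tt>\u</tt>, <tt>\L</tt>, and <tt>\U</tt></p></li> </ul><p> Constructs supported by this class but not by Perl: </p> <ul><li><p> Possessive 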
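quantifiers, which greedily match as much as they can and do not back off, even when doing so would allow the overall match to succeed.  </p></li><li><p> Character-class union and intersection, as described <a href="#cc">above</a>.</p></li> </ul><p> Notable differences from Perl: </p> <ul><li><p> In Perl, <tt>\1</tt> through <tt>\9</tt> are always interpreted as back references; a backslash-escaped number greater than <tt>9</tt> is treated as a back reference if at least that many subexpressions exist, otherwise it is interpreted, if possible, as an octal escape. In this class, octal escapes must always begin with a zero. In this class, <tt>\1</tt> through <tt>\9</tt> are always interpreted as back references, and a larger number is accepted as a back reference if at least that many subexpressions exist at that point in the regular expression; otherwise the parser drops digits until the number is smaller than or equal to the existing number of groups or it is one digit.    </p></li><li><p> Perl uses the <tt>g</tt> flag to request a match that resumes where the last match left off. This functionality is provided implicitly by the <A HREF="Matcher.html" title="class in java.util.regex"><CODE>Matcher</CODE></A> class: repeated invocations of the <A HREF="Matcher.html#find()"><CODE>find</CODE></A> method resume where the last match left off, unless the matcher is reset.  </p></li><li><p> In Perl, embedded flags at the top level of an expression affect the whole expression. In this class, embedded flags always take effect at the point at which they appear, whether they are at the top level or within a group; in the latter case, flags are restored at the end of the group, just as in Perl.  </p></li><li><p> Perl is forgiving about malformed matching constructs, as in the expression <tt>*a</tt>, as well as dangling brackets, as in the expression <tt>abc]</tt>, and treats them as literals. This class also accepts dangling brackets but is strict about dangling metacharacters like +, ? and *, and throws a <A HREF="PatternSyntaxException.html" title="class in java.util.regex"><CODE>PatternSyntaxException</CODE></A> if it encounters them. </p></li> </ul><p> For a more precise description of the behavior of regular expression constructs, please see <a href="../../../../../../../../../www.oreilly.com/catalog/regex2/default.htm"><i>Mastering Regular Expressions, 2nd Edition</i></a>, Jeffrey E. F. 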
Friedl, O'Reilly and Associates, 2002. </p><P><P><DL><DT><B>Since:</B></DT>  <DD>1.4</DD><DT><B>See Also:</B><DD><A HREF="../../lang/String.html#split(java.lang.String, int)"><CODE>String.split(String, int)</CODE></A>, <A HREF="../../lang/String.html#split(java.lang.String)"><CODE>String.split(String)</CODE></A>, <A HREF="../../../serialized-form.html#java.util.regex.Pattern">Serialized Form</A></DL><HR><P><!-- =========== FIELD SUMMARY =========== --><A NAME="field_summary"><!-- --></A><TABLE BORDER="1" WIDTH="100%" CELLPADDING="3" CELLSPACING="0" SUMMARY=""><TR BGCOLOR="#CCCCFF" CLASS="TableHeadingColor"><TH ALIGN="left" COLSPAN="2"><FONT SIZE="+2"><B>Field Summary</B></FONT></TH></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>static&nbsp;int</CODE></FONT></TD><TD><CODE><B><A HREF="Pattern.html#CANON_EQ">CANON_EQ</A></B></CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Enables canonical equivalence.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>static&nbsp;int</CODE></FONT></TD><TD><CODE><B><A HREF="Pattern.html#CASE_INSENSITIVE">CASE_INSENSITIVE</A></B></CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Enables case-insensitive matching.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>static&nbsp;int</CODE></FONT></TD><TD><CODE><B><A HREF="Pattern.html#COMMENTS">COMMENTS</A></B></CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Permits whitespace and comments in the pattern.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>static&nbsp;int</CODE></FONT></TD><TD><CODE><B><A HREF="Pattern.html#DOTALL">DOTALL</A></B></CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Enables dotall mode.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT 
SIZE="-1"><CODE>static&nbsp;int</CODE></FONT></TD><TD><CODE><B><A HREF="Pattern.html#LITERAL">LITERAL</A></B></CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Enables literal parsing of the pattern.</TD></TR>