<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"><html> <head> <title>Regular Expression Details</title> <meta http-equiv="content-type" content="text/html; charset=UTF-8"> </head> <body><div id="regexp.reference" class="section"> <h2 class="title">Regular Expression Details</h2> <div id="regexp.introduction" class="section"> <h2 class="title">Introduction</h2> <p class="para"> The syntax and semantics of the regular expressions supported by PCRE are described below. Regular expressions are also described in the Perl documentation and in a number of other books, some of which have copious examples. Jeffrey Friedl's "Mastering Regular Expressions", published by O'Reilly (ISBN 1-56592-257-3), covers them in great detail. The description here is intended as reference documentation. </p> <p class="para"> A regular expression is a pattern that is matched against a subject string from left to right. Most characters stand for themselves in a pattern, and match the corresponding characters in the subject. As a trivial example, the pattern <i>The quick brown fox</i> matches a portion of a subject string that is identical to itself. </p> </div> <div id="regexp.reference.meta" class="section"> <h2 class="title">Meta-characters</h2> <p class="para"> The power of regular expressions comes from the ability to include alternatives and repetitions in the pattern. 
These are encoded in the pattern by the use of <em class="emphasis">meta-characters</em>, which do not stand for themselves but instead are interpreted in some special way. </p> <p class="para"> There are two different sets of meta-characters: those that are recognized anywhere in the pattern except within square brackets, and those that are recognized in square brackets. Outside square brackets, the meta-characters are as follows: <dl> <dt> <span class="term"><em class="emphasis">\</em></span> <dd><span class="simpara">general escape character with several uses</span></dd> </dt> <dt> <span class="term"><em class="emphasis">^</em></span> <dd><span class="simpara">assert start of subject (or line, in multiline mode)</span></dd> </dt> <dt> <span class="term"><em class="emphasis">$</em></span> <dd><span class="simpara">assert end of subject (or line, in multiline mode)</span></dd> </dt> <dt> <span class="term"><em class="emphasis">.</em></span> <dd><span class="simpara">match any character except newline (by default)</span></dd> </dt> <dt> <span class="term"><em class="emphasis">[</em></span> <dd><span class="simpara">start character class definition</span></dd> </dt> <dt> <span class="term"><em class="emphasis">]</em></span> <dd><span class="simpara">end character class definition</span></dd> </dt> <dt> <span class="term"><em class="emphasis">|</em></span> <dd><span class="simpara">start of alternative branch</span></dd> </dt> <dt> <span class="term"><em class="emphasis">(</em></span> <dd><span class="simpara">start subpattern</span></dd> </dt> <dt> <span class="term"><em class="emphasis">)</em></span> <dd><span class="simpara">end subpattern</span></dd> </dt> <dt> <span class="term"><em class="emphasis">?</em></span> <dd><span class="simpara">extends the meaning of (, also 0 or 1 quantifier, also quantifier minimizer</span></dd> </dt> <dt> <span class="term"><em class="emphasis">*</em></span> <dd><span class="simpara">0 or more quantifier</span></dd> </dt> <dt> 
<span class="term"><em class="emphasis">+</em></span> <dd><span class="simpara">1 or more quantifier</span></dd> </dt> <dt> <span class="term"><em class="emphasis">{</em></span> <dd><span class="simpara">start min/max quantifier</span></dd> </dt> <dt> <span class="term"><em class="emphasis">}</em></span> <dd><span class="simpara">end min/max quantifier</span></dd> </dt> </dl> Part of a pattern that is in square brackets is called a "character class". In a character class the only meta-characters are: <dl> <dt> <span class="term"><em class="emphasis">\</em></span> <dd><span class="simpara">general escape character</span></dd> </dt> <dt> <span class="term"><em class="emphasis">^</em></span> <dd><span class="simpara">negate the class, but only if the first character</span></dd> </dt> <dt> <span class="term"><em class="emphasis">-</em></span> <dd><span class="simpara">indicates character range</span></dd> </dt> <dt> <span class="term"><em class="emphasis">]</em></span> <dd><span class="simpara">terminates the character class</span></dd> </dt> </dl> The following sections describe the use of each of the meta-characters. </p> </div> <div id="regexp.reference.backslash" class="section"> <h2 class="title">Backslash</h2> <p class="para"> The backslash character has several uses. Firstly, if it is followed by a non-alphanumeric character, it takes away any special meaning that character may have. This use of backslash as an escape character applies both inside and outside character classes. </p> <p class="para"> For example, if you want to match a "*" character, you write "\*" in the pattern. This applies whether or not the following character would otherwise be interpreted as a meta-character, so it is always safe to precede a non-alphanumeric with "\" to specify that it stands for itself. In particular, if you want to match a backslash, you write "\\". 
</p> <blockquote><p><b class="note">Note</b>: The backslash also has a special meaning in single- and double-quoted PHP <a href="language.types.string.html#language.types.string.syntax" class="link">strings</a>. Thus, to match a literal \ the regular expression must be \\, which has to be written as "\\\\" or '\\\\' in PHP code. <br /> </p></blockquote> <p class="para"> If a pattern is compiled with the <a href="reference.pcre.pattern.modifiers.html" class="link">PCRE_EXTENDED</a> option, whitespace in the pattern (other than in a character class) is ignored, as are all characters from a "#" outside a character class up to the next newline character. An escaping backslash can be used to include a whitespace or "#" character as part of the pattern. </p> <p class="para"> A second use of backslash provides a way of encoding non-printing characters in patterns in a visible manner. There is no restriction on the appearance of non-printing characters, apart from the binary zero that terminates a pattern, but when a pattern is being prepared by text editing, it is usually easier to use one of the following escape sequences than the binary character it represents: </p> <p class="para"> <dl> <dt> <span class="term"><em class="emphasis">\a</em></span> <dd><span class="simpara">alarm, that is, the BEL character (hex 07)</span></dd> </dt> <dt> <span class="term"><em class="emphasis">\cx</em></span> <dd><span class="simpara">"control-x", where x is any character</span></dd> </dt> <dt> <span class="term"><em class="emphasis">\e</em></span> <dd><span class="simpara">escape (hex 1B)</span></dd> </dt> <dt> <span class="term"><em class="emphasis">\f</em></span> <dd><span class="simpara">formfeed (hex 0C)</span></dd> </dt> <dt> <span class="term"><em class="emphasis">\n</em></span> <dd><span class="simpara">newline (hex 0A)</span></dd> </dt> <dt> <span class="term"><em class="emphasis">\r</em></span> <dd><span class="simpara">carriage return (hex 0D)</span></dd> </dt> <dt> <span class="term"><em 
class="emphasis">\t</em></span> <dd><span class="simpara">tab (hex 09)</span></dd> </dt> <dt> <span class="term"><em class="emphasis">\xhh</em></span>