<DD>This is an extension notation (a "<tt class="character">?</tt>"
following a "<tt class="character">(</tt>" is not meaningful otherwise). The first
character after the "<tt class="character">?</tt>"
determines the meaning and further syntax of the construct.
Extensions usually do not create a new group;
<tt class="regexp">(?P&lt;<var>name</var>&gt;...)</tt> is the only exception to this rule.
The currently supported extensions are listed below.
<P>
<DT><code>(?iLmsux)</code>
<DD>(One or more letters from the set "<tt class="character">i</tt>",
"<tt class="character">L</tt>", "<tt class="character">m</tt>", "<tt class="character">s</tt>", "<tt class="character">u</tt>",
"<tt class="character">x</tt>".) The group matches the empty string; the letters set
the corresponding flags (<tt class="constant">re.I</tt>, <tt class="constant">re.L</tt>,
<tt class="constant">re.M</tt>, <tt class="constant">re.S</tt>, <tt class="constant">re.U</tt>, <tt class="constant">re.X</tt>)
for the entire regular expression. This is useful if you wish to
include the flags as part of the regular expression, instead of
passing a <var>flag</var> argument to the <tt class="function">compile()</tt> function.
<P>
Note that the <tt class="regexp">(?x)</tt> flag changes how the expression is parsed.
It should be used first in the expression string, or after one or more
whitespace characters. If there are non-whitespace characters before
the flag, the results are undefined.
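<P>
As a minimal sketch of the inline flags described above, the following compares an inline flag with the equivalent <var>flag</var> argument. The sample patterns and strings are invented for illustration, and the flag is placed at the very start of the pattern, as the text recommends.

```python
import re

# The inline flag (?i) at the start of the pattern has the same effect
# as passing re.I to compile().
inline = re.compile(r"(?i)python")
explicit = re.compile(r"python", re.I)

inline_hit = inline.search("I like PYTHON").group()      # 'PYTHON'
explicit_hit = explicit.search("I like PYTHON").group()  # 'PYTHON'

# (?x) switches on verbose parsing: whitespace and # comments inside
# the pattern are ignored.
verbose = re.compile(r"""(?x)
    \d+      # integer part
    \.       # decimal point
    \d+      # fractional part
""")
verbose_hit = verbose.match("3.14 is pi").group()  # '3.14'
```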
<P>
<DT><code>(?:...)</code>
<DD>A non-capturing version of regular parentheses.
Matches whatever regular expression is inside the parentheses, but the
substring matched by the
group <i>cannot</i> be retrieved after performing a match or
referenced later in the pattern.
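<P>
The difference from regular parentheses can be seen by comparing the groups reported for the same input. This is only an illustrative sketch; the pattern and the sample string are made up.

```python
import re

# Regular parentheses: the repeated (ab) is numbered group 1.
capturing = re.match(r"(ab)+(c)", "ababc")
capturing_groups = capturing.groups()          # ('ab', 'c')

# (?:...) still groups "ab" for the + operator, but records nothing,
# so (c) becomes group 1.
non_capturing = re.match(r"(?:ab)+(c)", "ababc")
non_capturing_groups = non_capturing.groups()  # ('c',)
```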
<P>
<DT><code>(?P&lt;<var>name</var>&gt;...)</code>
<DD>Similar to regular parentheses, but
the substring matched by the group is accessible via the symbolic group
name <var>name</var>. Group names must be valid Python identifiers. A
symbolic group is also a numbered group, just as if the group were not
named. So the group named 'id' in the example below can also be
referenced as the numbered group 1.
<P>
For example, if the pattern is
<tt class="regexp">(?P&lt;id&gt;[a-zA-Z_]\w*)</tt>, the group can be referenced by its
name in arguments to methods of match objects, such as <code>m.group('id')</code>
or <code>m.end('id')</code>, and also by name in pattern text
(e.g. <tt class="regexp">(?P=id)</tt>) and replacement text (e.g. <code>\g&lt;id&gt;</code>).
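<P>
A short sketch of the pattern above in use. The input strings and the angle-bracket replacement template are assumptions chosen only to show the by-name and by-number access side by side.

```python
import re

m = re.match(r"(?P<id>[a-zA-Z_]\w*)", "spam_1 = 42")

by_name = m.group("id")    # 'spam_1'
by_number = m.group(1)     # same group, referenced by number
end_pos = m.end("id")      # 6, the index just past 'spam_1'

# In replacement text the group is referenced as \g<id>.
wrapped = re.sub(r"(?P<id>[a-zA-Z_]\w*)", r"<\g<id>>", "x + y")  # '<x> + <y>'
```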
<P>
<DT><code>(?P=<var>name</var>)</code>
<DD>Matches whatever text was matched by the
earlier group named <var>name</var>.
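<P>
For instance, a named backreference can detect a doubled word. The group name <code>word</code> and the sample sentences are hypothetical, picked only for this sketch.

```python
import re

# (?P=word) must match exactly the text captured by (?P<word>...).
doubled = re.compile(r"\b(?P<word>\w+) (?P=word)\b")

hit = doubled.search("Paris in the the spring")
hit_text = hit.group()        # 'the the'
hit_word = hit.group("word")  # 'the'

miss = doubled.search("the end")  # None: 'end' differs from 'the'
```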
<P>
<DT><code>(?#...)</code>
<DD>A comment; the contents of the parentheses are
simply ignored.
<P>
<DT><code>(?=...)</code>
<DD>Matches if <tt class="regexp">...</tt> matches next, but doesn't
consume any of the string. This is called a lookahead assertion. For
example, <tt class="regexp">Isaac (?=Asimov)</tt> will match <code>'Isaac '</code> only if it's
followed by <code>'Asimov'</code>.
<P>
<DT><code>(?!...)</code>
<DD>Matches if <tt class="regexp">...</tt> doesn't match next. This
is a negative lookahead assertion. For example,
<tt class="regexp">Isaac (?!Asimov)</tt> will match <code>'Isaac '</code> only if it's <i>not</i>
followed by <code>'Asimov'</code>.
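<P>
Both lookahead forms can be tried with the patterns from the text. The second test string, <code>'Isaac Newton'</code>, is an assumption added here for contrast.

```python
import re

positive = re.compile(r"Isaac (?=Asimov)")
negative = re.compile(r"Isaac (?!Asimov)")

pos_hit = positive.search("Isaac Asimov")   # matches 'Isaac '
pos_miss = positive.search("Isaac Newton")  # None
neg_hit = negative.search("Isaac Newton")   # matches 'Isaac '
neg_miss = negative.search("Isaac Asimov")  # None

# The lookahead consumes nothing: the match ends right after the space.
pos_end = pos_hit.end()  # 6
```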
<P>
<DT><code>(?&lt;=...)</code>
<DD>Matches if the current position in the string
is preceded by a match for <tt class="regexp">...</tt> that ends at the current
position. This is called a positive lookbehind assertion.
<tt class="regexp">(?&lt;=abc)def</tt> will find a match in "<tt class="samp">abcdef</tt>", since the lookbehind
will back up 3 characters and check if the contained pattern matches.
The contained pattern must only match strings of some fixed length,
meaning that <tt class="regexp">abc</tt> or <tt class="regexp">a|b</tt> are allowed, but <tt class="regexp">a*</tt>
isn't.
<P>
<DT><code>(?&lt;!...)</code>
<DD>Matches if the current position in the string
is not preceded by a match for <tt class="regexp">...</tt>. This
is called a negative lookbehind assertion. Similar to positive lookbehind
assertions, the contained pattern must only match strings of some
fixed length.
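<P>
The two lookbehind forms and the fixed-length restriction can be sketched as follows. The string <code>'xyzdef'</code> is an invented counterexample, and the exact error message for a variable-length lookbehind is not relied upon.

```python
import re

# Positive lookbehind: 'def' matches only where 'abc' ends just before it.
behind = re.search(r"(?<=abc)def", "abcdef")
behind_text = behind.group()    # 'def'
behind_start = behind.start()   # 3

# Negative lookbehind: 'def' matches only where 'abc' does NOT precede it.
neg_miss = re.search(r"(?<!abc)def", "abcdef")  # None
neg_hit = re.search(r"(?<!abc)def", "xyzdef")   # matches 'def'

# A pattern such as a* can match strings of different lengths,
# so it is rejected inside a lookbehind.
try:
    re.compile(r"(?<=a*)def")
    variable_length_rejected = False
except re.error:
    variable_length_rejected = True
```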
<P>
</DD>
</DL>
<P>
The special sequences consist of "<tt class="character">\</tt>" and a character from the
list below. If the ordinary character is not on the list, then the
resulting RE will match the second character. For example,
<tt class="regexp">\$</tt> matches the character "<tt class="character">$</tt>".
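<P>
A small sketch of the escaping rule, using a made-up price string. Without the backslash, <code>$</code> keeps its special meaning as an end-of-string anchor.

```python
import re

# \$ matches a literal dollar sign.
price = re.search(r"\$\d+", "total: $15")
price_text = price.group()  # '$15'

# Unescaped, $ anchors at the end of the string, so no digits can follow it.
unescaped = re.search(r"$\d+", "total: $15")  # None
```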
<P>
<DL COMPACT>
<DT><code>\<var>number</var></code>
<DD>Matches the contents of the group of the
same number. Groups are numbered starting from 1. For example,
<tt class="regexp">(.+) \1</tt> matches <code>'the the'</code> or <code>'55 55'</code>, but not
<code>'the end'</code> (note
the space after the group). This special sequence can only be used to
match one of the first 99 groups. If the first digit of <var>number</var>
is 0, or <var>number</var> is 3 octal digits long, it will not be interpreted
as a group match, but as the character with octal value <var>number</var>.
Inside the "<tt class="character">[</tt>" and "<tt class="character">]</tt>" of a character class, all numeric
escapes are treated as characters.
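<P>
The examples above, plus the octal case, can be checked with a sketch like this one. The choice of <code>\101</code> (octal for the letter A) is an illustration added here.

```python
import re

repeated = re.compile(r"(.+) \1")

hit_words = repeated.match("the the")    # matches
hit_digits = repeated.match("55 55")     # matches
miss = repeated.match("the end")         # None: 'end' is not a repeat of 'the'

# Three octal digits are read as a character code, not a group number:
# octal 101 is 65, the letter 'A'.
octal_hit = re.match(r"\101", "A")
```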
<P>
<DT><code>\A</code>
<DD>Matches only at the start of the string.
<P>
<DT><code>\b</code>
<DD>Matches the empty string, but only at the
beginning or end of a word. A word is defined as a sequence of
alphanumeric characters, so the end of a word is indicated by
whitespace or a non-alphanumeric character. Inside a character range,
<tt class="regexp">\b</tt> represents the backspace character, for compatibility with
Python's string literals.
<P>
<DT><code>\B</code>
<DD>Matches the empty string, but only when it is
<i>not</i> at the beginning or end of a word.
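<P>
A brief sketch of the two word-boundary sequences; the sample text is hypothetical. Start offsets are collected so that the position of each match is visible.

```python
import re

text = "cat concat category"

# \bcat\b: 'cat' as a whole word only.
whole_word = [m.start() for m in re.finditer(r"\bcat\b", text)]   # [0]

# \bcat: 'cat' at the beginning of a word.
word_start = [m.start() for m in re.finditer(r"\bcat", text)]     # [0, 11]

# \Bcat: 'cat' that does not start at a word boundary (inside 'concat').
inside_word = [m.start() for m in re.finditer(r"\Bcat", text)]    # [7]

# Inside a character class, \b means the backspace character.
backspace = re.match(r"[\b]", "\x08")
```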
<P>
<DT><code>\d</code>
<DD>Matches any decimal digit; this is
equivalent to the set <tt class="regexp">[0-9]</tt>.
<P>
<DT><code>\D</code>
<DD>Matches any non-digit character; this is
equivalent to the set <tt class="regexp">[^0-9]</tt>.
<P>
<DT><code>\s</code>
<DD>Matches any whitespace character; this is
equivalent to the set <tt class="regexp">[ \t\n\r\f\v]</tt>.
<P>
<DT><code>\S</code>
<DD>Matches any non-whitespace character; this is
equivalent to the set <tt class="regexp">[^ \t\n\r\f\v]</tt>.
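<P>
The digit and whitespace classes can be exercised with a few one-liners. The input strings are invented for this sketch and contain only ASCII characters, so the results match the sets given above.

```python
import re

digits = re.findall(r"\d+", "a1b22c333")                # ['1', '22', '333']
digits_only = re.sub(r"\D", "", "555-1234")             # '5551234'
fields = re.split(r"\s+", "one  two\tthree\nfour")      # ['one', 'two', 'three', 'four']
tokens = re.findall(r"\S+", " a  b ")                   # ['a', 'b']
```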
<P>
<DT><code>\w</code>
<DD>When the <tt class="constant">LOCALE</tt> and <tt class="constant">UNICODE</tt>
flags are not specified,
matches any alphanumeric character; this is equivalent to the set
<tt class="regexp">[a-zA-Z0-9_]</tt>. With <tt class="constant">LOCALE</tt>, it will match the set
<tt class="regexp">[0-9_]</tt> plus whatever characters are defined as letters for
the current locale. If <tt class="constant">UNICODE</tt> is set, this will match the
characters <tt class="regexp">[0-9_]</tt> plus whatever is classified as alphanumeric
in the Unicode character properties database.
<P>
<DT><code>\W</code>
<DD>When the <tt class="constant">LOCALE</tt> and <tt class="constant">UNICODE</tt>
flags are not specified, matches any non-alphanumeric character; this
is equivalent to the set <tt class="regexp">[^a-zA-Z0-9_]</tt>. With
<tt class="constant">LOCALE</tt>, it will match any character not in the set
<tt class="regexp">[0-9_]</tt>, and not defined as a letter for the current locale.
If <tt class="constant">UNICODE</tt> is set, this will match anything other than
<tt class="regexp">[0-9_]</tt> and characters marked as alphanumeric in the Unicode
character properties database.
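<P>
A minimal sketch of <code>\w</code> and <code>\W</code>. The first two calls use ASCII-only input, so they behave as the sets above describe. The last call passes <tt class="constant">re.U</tt> explicitly and assumes a Python 3 <code>str</code> input, where Unicode matching is already the default. All sample strings are made up.

```python
import re

sample = "foo_bar baz-42"

words = re.findall(r"\w+", sample)       # ['foo_bar', 'baz', '42']
non_words = re.findall(r"\W+", sample)   # [' ', '-']

# With UNICODE set, letters outside ASCII count as alphanumeric too.
# "\u00e9" is the letter e with an acute accent.
unicode_words = re.findall(r"\w+", "caf\u00e9 au lait", re.U)
```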
<P>
<DT><code>\Z</code>
<DD>Matches only at the end of the string.
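<P>
The string anchors <code>\A</code> and <code>\Z</code> can be contrasted with <code>^</code> and <code>$</code>, which are affected by line boundaries. This sketch assumes a made-up two-line string ending in a newline.

```python
import re

text = "one\ntwo\n"

# With re.M, ^ matches at the start of every line; \A only at the start
# of the whole string.
line_starts = re.findall(r"^\w+", text, re.M)     # ['one', 'two']
string_start = re.findall(r"\A\w+", text, re.M)   # ['one']

# $ also matches just before a trailing newline; \Z only at the very end.
dollar_hit = re.search(r"two$", text)       # matches
z_miss = re.search(r"two\Z", text)          # None, a newline follows 'two'
z_hit = re.search(r"two\n\Z", text)         # matches
```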
<P>
<DT><code>\\</code>
<DD>Matches a literal backslash.
<P>
</DD>
</DL>
<P>
<ADDRESS>
<hr>See <i><a href="about.html" tppabs="http://www.python.org/doc/current/lib/about.html">About this document...</a></i> for information on suggesting changes.
</ADDRESS>
</BODY>
</HTML>