<li><p>The set of single-character collating elements whose characters belong to the character class, as defined in the <i>LC_CTYPE</i> category in the current locale.</p></li><li><p>An unspecified set of multi-character collating elements.</p></li></ol><p>All character classes specified in the current locale shall be recognized. A character class expression is expressed as a character class name enclosed within bracket-colon ( <tt>"[:"</tt> and <tt>":]"</tt> ) delimiters.</p><p>The following character class expressions shall be supported in all locales:</p><blockquote><pre><tt>[:alnum:]   [:cntrl:]   [:lower:]   [:space:]
[:alpha:]   [:digit:]   [:print:]   [:upper:]
[:blank:]   [:graph:]   [:punct:]   [:xdigit:]</tt></pre></blockquote><p>In addition, character class expressions of the form:</p><blockquote><pre><tt>[:</tt><i>name</i><tt>:]</tt></pre></blockquote><p>are recognized in those locales where the <i>name</i> keyword has been given a <b>charclass</b> definition in the <i>LC_CTYPE</i> category.</p></li><li><p>In the POSIX locale, a range expression represents the set of collating elements that fall between two elements in the collation sequence, inclusive. In other locales, a range expression has unspecified behavior: strictly conforming applications shall not rely on whether the range expression is valid, or on the set of collating elements matched. A range expression shall be expressed as the starting point and the ending point separated by a hyphen ( <tt>'-'</tt> ).</p><p>In the following, all examples assume the POSIX locale.</p><p>The starting range point and the ending range point shall be a collating element or collating symbol. An equivalence class expression used as a starting or ending point of a range expression produces unspecified results. An equivalence class can be used portably within a bracket expression, but only outside the range. 
If the represented set of collating elements is empty, it is unspecified whether the expression matches nothing, or is treated as invalid.</p><p>The interpretation of range expressions where the ending range point is also the starting range point of a subsequent range expression (for example, <tt>"[a-m-o]"</tt> ) is undefined.</p><p>The hyphen character shall be treated as itself if it occurs first (after an initial <tt>'^'</tt> , if any) or last in the list, or as an ending range point in a range expression. As examples, the expressions <tt>"[-ac]"</tt> and <tt>"[ac-]"</tt> are equivalent and match any of the characters <tt>'a'</tt> , <tt>'c'</tt> , or <tt>'-'</tt> ; <tt>"[^-ac]"</tt> and <tt>"[^ac-]"</tt> are equivalent and match any characters except <tt>'a'</tt> , <tt>'c'</tt> , or <tt>'-'</tt> ; the expression <tt>"[%--]"</tt> matches any of the characters between <tt>'%'</tt> and <tt>'-'</tt> inclusive; the expression <tt>"[--@]"</tt> matches any of the characters between <tt>'-'</tt> and <tt>'@'</tt> inclusive; and the expression <tt>"[a--@]"</tt> is either invalid or equivalent to <tt>'@'</tt> , because the letter <tt>'a'</tt> follows the symbol <tt>'-'</tt> in the POSIX locale. 
To use a hyphen as the starting range point, it shall either come first in the bracket expression or be specified as a collating symbol; for example, <tt>"[][.-.]-0]"</tt> , which matches either a right bracket or any character or collating element that collates between hyphen and 0, inclusive.</p><p>If a bracket expression specifies both <tt>'-'</tt> and <tt>']'</tt> , the <tt>']'</tt> shall be placed first (after the <tt>'^'</tt> , if any) and the <tt>'-'</tt> last within the bracket expression.</p></li></ol><h4><a name="tag_09_03_06"></a>BREs Matching Multiple Characters</h4><p>The following rules can be used to construct BREs matching multiple characters from BREs matching a single character:</p><ol><li><p>The concatenation of BREs shall match the concatenation of the strings matched by each component of the BRE.</p></li><li><p>A subexpression can be defined within a BRE by enclosing it between the character pairs <tt>"\("</tt> and <tt>"\)"</tt> . Such a subexpression shall match whatever it would have matched without the <tt>"\("</tt> and <tt>"\)"</tt> , except that anchoring within subexpressions is optional behavior; see <a href="#tag_09_03_08">BRE Expression Anchoring</a> . Subexpressions can be arbitrarily nested.</p></li><li><p>The back-reference expression <tt>'\n'</tt> shall match the same (possibly empty) string of characters as was matched by a subexpression enclosed between <tt>"\("</tt> and <tt>"\)"</tt> preceding the <tt>'\n'</tt> . The character <tt>'n'</tt> shall be a digit from 1 through 9, specifying the <i>n</i>th subexpression (the one that begins with the <i>n</i>th <tt>"\("</tt> from the beginning of the pattern and ends with the corresponding paired <tt>"\)"</tt> ). The expression is invalid if less than <i>n</i> subexpressions precede the <tt>'\n'</tt> . For example, the expression <tt>"\(.*\)\1$"</tt> matches a line consisting of two adjacent appearances of the same string, and the expression <tt>"\(a\)*\1"</tt> fails to match <tt>'a'</tt> . 
When the referenced subexpression matched more than one string, the back-referenced expression shall refer to the last matched string. If the subexpression referenced by the back-reference matches more than one string because of an asterisk ( <tt>'*'</tt> ) or an interval expression (see item (5)), the back-reference shall match the last (rightmost) of these strings.</p></li><li><p>When a BRE matching a single character, a subexpression, or a back-reference is followed by the special character asterisk ( <tt>'*'</tt> ), together with that asterisk it shall match what zero or more consecutive occurrences of the BRE would match. For example, <tt>"[ab]*"</tt> and <tt>"[ab][ab]"</tt> are equivalent when matching the string <tt>"ab"</tt> .</p></li><li><p>When a BRE matching a single character, a subexpression, or a back-reference is followed by an interval expression of the format <tt>"\{m\}"</tt> , <tt>"\{m,\}"</tt> , or <tt>"\{m,n\}"</tt> , together with that interval expression it shall match what repeated consecutive occurrences of the BRE would match. The values of <i>m</i> and <i>n</i> are decimal integers in the range 0 &lt;= <i>m</i> &lt;= <i>n</i> &lt;= {RE_DUP_MAX}, where <i>m</i> specifies the exact or minimum number of occurrences and <i>n</i> specifies the maximum number of occurrences. 
The expression <tt>"\{m\}"</tt> shall match exactly <i>m</i> occurrences of the preceding BRE, <tt>"\{m,\}"</tt> shall match at least <i>m</i> occurrences, and <tt>"\{m,n\}"</tt> shall match any number of occurrences between <i>m</i> and <i>n</i>, inclusive.</p><p>For example, in the string <tt>"abababccccccd"</tt> the BRE <tt>"c\{3\}"</tt> is matched by characters seven to nine, the BRE <tt>"\(ab\)\{4,\}"</tt> is not matched at all, and the BRE <tt>"c\{1,3\}d"</tt> is matched by characters ten to thirteen.</p></li></ol><p>The behavior of multiple adjacent duplication symbols ( <tt>'*'</tt> and intervals) produces undefined results.</p><p>A subexpression repeated by an asterisk ( <tt>'*'</tt> ) or an interval expression shall not match a null expression unless this is the only match for the repetition or it is necessary to satisfy the exact or minimum number of occurrences for the interval expression.</p><h4><a name="tag_09_03_07"></a>BRE Precedence</h4><p>The order of precedence shall be as shown in the following table:</p><center><table border="1" cellpadding="3" align="center"><tr valign="top"><th colspan="2" align="center"><p class="tent"><b>BRE Precedence (from high to low)</b></p></th></tr><tr valign="top"><td align="left"><p class="tent">Collation-related bracket symbols</p></td><td align="left"><p class="tent">[==] [::] [..]</p></td></tr><tr valign="top"><td align="left"><p class="tent">Escaped characters</p></td><td align="left"><p class="tent">\&lt;special character&gt;</p></td></tr><tr valign="top"><td align="left"><p class="tent">Bracket expression</p></td><td align="left"><p class="tent">[]</p></td></tr><tr valign="top"><td align="left"><p class="tent">Subexpressions/back-references</p></td><td align="left"><p class="tent">\(\) \n</p></td></tr><tr valign="top"><td align="left"><p class="tent">Single-character-BRE duplication</p></td><td align="left"><p class="tent">* \{m,n\}</p></td></tr><tr valign="top"><td align="left"><p class="tent">Concatenation</p></td><td 
align="left"><p class="tent"> </p></td></tr><tr valign="top"><td align="left"><p class="tent">Anchoring</p></td><td align="left"><p class="tent">^ $</p></td></tr></table></center><h4><a name="tag_09_03_08"></a>BRE Expression Anchoring</h4><p>A BRE can be limited to matching strings that begin or end a line; this is called "anchoring". The circumflex and dollar sign special characters shall be considered BRE anchors in the following contexts:</p><ol><li><p>A circumflex ( <tt>'^'</tt> ) shall be an anchor when used as the first character of an entire BRE. The implementation may treat the circumflex as an anchor when used as the first character of a subexpression. The circumflex shall anchor the expression (or optionally subexpression) to the beginning of a string; only sequences starting at the first character of a string shall be matched by the BRE. For example, the BRE <tt>"^ab"</tt> matches <tt>"ab"</tt> in the string <tt>"abcdef"</tt> , but fails to match in the string <tt>"cdefab"</tt> . The BRE <tt>"\(^ab\)"</tt> may match the former string. A portable BRE shall escape a leading circumflex in a subexpression to match a literal circumflex.</p></li><li><p>A dollar sign ( <tt>'$'</tt> ) shall be an anchor when used as the last character of an entire BRE. The implementation may treat a dollar sign as an anchor when used as the last character of a subexpression. The dollar sign shall anchor the expression (or optionally subexpression) to the end of the string being matched; the dollar sign can be said to match the end-of-string following the last character.</p></li><li><p>A BRE anchored by both <tt>'^'</tt> and <tt>'$'</tt> shall match only an entire string. 
For example, the BRE <tt>"^abcdef$"</tt> matches strings consisting only of <tt>"abcdef"</tt> .</p></li></ol><h3><a name="tag_09_04"></a>Extended Regular Expressions</h3><p>The extended regular expression (ERE) notation and construction rules shall apply to utilities defined as using extended regular expressions; any exceptions to the following rules are noted in the descriptions of the specific utilities using EREs.</p><h4><a name="tag_09_04_01"></a>EREs Matching a Single Character or Collating Element</h4><p>An ERE ordinary character, a special character preceded by a backslash, or a period shall match a single character. A bracket expression shall match a single character or a single collating element. An ERE matching a single character enclosed in parentheses shall match the same as the ERE without parentheses would have matched.</p><h4><a name="tag_09_04_02"></a>ERE Ordinary Characters</h4><p>An ordinary character is an ERE that matches itself. An ordinary character is any character in the supported character set, except for the ERE special characters listed in <a href="#tag_09_04_03">ERE Special Characters</a> . The interpretation of an ordinary character preceded by a backslash ( <tt>'\'</tt> ) is undefined.</p><h4><a name="tag_09_04_03"></a>ERE Special Characters</h4><p>An ERE special character has special properties in certain contexts. Outside those contexts, or when preceded by a backslash, such a character shall be an ERE that matches the special character itself. The extended regular expression special characters and the contexts in which they shall have their special meaning are as follows:</p><dl compact><dt><tt>.[\(</tt></dt><dd>The period, left-bracket, backslash, and left-parenthesis shall be special except when used in a bracket expression (see <a href="#tag_09_03_05">RE Bracket Expression</a> ). 
Outside a bracket expression, a left-parenthesis immediately followed by a right-parenthesis produces undefined results.</dd><dt><tt>)</tt></dt><dd>The right-parenthesis shall be special when matched with a preceding left-parenthesis, both outside a bracket expression.</dd><dt><tt>*+?{</tt></dt><dd>The asterisk, plus-sign, question-mark, and left-brace shall be special except when used in a bracket expression (see <a href="#tag_09_03_05">RE Bracket Expression</a> ). Any of the following uses produce undefined results: <ul><li><p>If these characters appear first in an ERE, or immediately following a vertical-line, circumflex, or left-parenthesis</p></li><li><p>If a left-brace is not part of a valid interval expression (see <a href="#tag_09_04_06">EREs Matching Multiple Characters</a>)</p></li></ul></dd>