<li>as the first character of a bracket expression (see <xref href=rebrack><a href="#tag_007_003_005">RE Bracket Expression</a></xref>).</ul>
<dt>$<dd>The dollar sign is special when used as an anchor.</dl>
<h4><a name = "tag_007_003_004"> </a>Periods in BREs</h4>
A period (.), when used outside a bracket expression, is a BRE that matches any character in the supported character set except NUL.
<h4><a name = "tag_007_003_005"> </a>RE Bracket Expression</h4>
<xref type="3" name="rebrack"></xref>A bracket expression (an expression enclosed in square brackets, []) is an RE that matches a single collating element contained in the non-empty set of collating elements represented by the bracket expression.
<p>The following rules and definitions apply to bracket expressions:
<ol><p><li>A <i>bracket expression</i> is either a matching list expression or a non-matching list expression. It consists of one or more expressions: collating elements, collating symbols, equivalence classes, character classes or range expressions. Portable applications must not use range expressions, even though all implementations support them. The right-bracket (]) loses its special meaning and represents itself in a bracket expression if it occurs first in the list (after an initial circumflex (^), if any). Otherwise, it terminates the bracket expression, unless it appears in a collating symbol (such as [.].]) or is the ending right-bracket for a collating symbol, equivalence class or character class. The special characters:
<code><pre>. * [ \</code></pre>
<p>(period, asterisk, left-bracket and backslash, respectively) lose their special meaning within a bracket expression.
<p>The character sequences:
<code><pre>[. 
[= [:</code></pre>
<p>(left-bracket followed by a period, equals-sign or colon) are special inside a bracket expression and are used to delimit collating symbols, equivalence class expressions and character class expressions. These symbols must be followed by a valid expression and the matching terminating sequence .], =] or :], as described in the following items.
<p><li>A <i>matching list</i> expression specifies a list that matches any one of the expressions represented in the list. The first character in the list must not be the circumflex. For example, [abc] is an RE that matches any of the characters a, b or c.
<p><li>A <i>non-matching list</i> expression begins with a circumflex (^), and specifies a list that matches any character or collating element except for the expressions represented in the list after the leading circumflex. For example, [^abc] is an RE that matches any character or collating element except the characters a, b or c. The circumflex will have this special meaning only when it occurs first in the list, immediately following the left-bracket.
<p><li>A <i>collating symbol</i> is a collating element enclosed within bracket-period ([. 
.]) delimiters. Collating elements are defined as described in <xref href=collorder><a href="locale.html#tag_005_003_002_004">Collation Order</a></xref>. Multi-character collating elements must be represented as collating symbols when it is necessary to distinguish them from a list of the individual characters that make up the multi-character collating element. For example, if the string ch is a collating element in the current collation sequence with the associated collating symbol &lt;ch&gt;, the expression [[.ch.]] will be treated as an RE matching the character sequence ch, while [ch] will be treated as an RE matching c or h. Collating symbols will be recognised only inside bracket expressions. This implies that the RE [[.ch.]]*c matches the first to fifth character in the string chchch. If the string is not a collating element in the current collating sequence definition, or if the collating element has no characters associated with it (for example, see the symbol &lt;HIGH&gt; in the example collation definition shown in <xref href=collorder><a href="locale.html#tag_005_003_002_004">Collation Order</a></xref>), the symbol will be treated as an invalid expression.
<p><li>An <i>equivalence class expression</i> represents the set of collating elements belonging to an equivalence class, as described in <b>Collation Order</b>. Only primary equivalence classes will be recognised. The class is expressed by enclosing any one of the collating elements in the equivalence class within bracket-equal ([= =]) delimiters. For example, if a, à and â belong to the same equivalence class, then [[=a=]b], [[=à=]b] and [[=â=]b] will each be equivalent to [aàâb]. If the collating element does not belong to an equivalence class, the equivalence class expression will be treated as a <i>collating symbol</i>.
<p><li>A <i>character class expression</i> represents the set of characters belonging to a character class, as defined in the LC_CTYPE category in the current locale. All character classes specified in the current locale will be 
recognised. A character class expression is expressed as a character class name enclosed within bracket-colon ([: :]) delimiters.
<p>The following character class expressions are supported in all locales:
<code><pre>[:alnum:]   [:cntrl:]   [:lower:]   [:space:]
[:alpha:]   [:digit:]   [:print:]   [:upper:]
[:blank:]   [:graph:]   [:punct:]   [:xdigit:]</code></pre>
<p>In addition, character class expressions of the form:
<code><pre>[:<i>name</i>:]</code></pre>
<p>are recognised in those locales where the <i>name</i> keyword has been given a <b>charclass</b> definition in the LC_CTYPE category.
<p><li>A <i>range expression</i> represents the set of collating elements that fall between two elements in the current collation sequence, inclusively. It is expressed as the starting point and the ending point separated by a hyphen (-).
<p>Range expressions must not be used in portable applications because their behaviour is dependent on the collating sequence. Ranges will be treated according to the current collating sequence, and include such characters that fall within the range based on that collating sequence, regardless of character values. This, however, means that the interpretation will differ depending on collating sequence. If, for instance, one collating sequence defines ä as a variant of a, while another defines it as a letter following z, then the expression [ä-z] is valid in the first language and invalid in the second.
<p>In the following, all examples assume the collation sequence specified for the POSIX locale, unless another collation sequence is specifically defined.
<p>The starting range point and the ending range point must be a collating element or collating symbol. An equivalence class expression used as a starting or ending point of a range expression produces unspecified results. An equivalence class can be used portably within a bracket expression, but only outside the range. For example, the unspecified expression [[=e=]-f] should be given as [[=e=]e-f]. The ending range point must collate equal to or higher than 
the starting range point; otherwise, the expression will be treated as invalid. The order used is the order in which the collating elements are specified in the current collation definition. One-to-many mappings (see the description of LC_COLLATE in <xref href=locale><a href="locale.html#tag_005_003_002">Locale</a></xref>) will not be performed. For example, assuming that the character eszet (ß) is placed in the collation sequence after r and s, but before t and that it maps to the sequence ss for collation purposes, then the expression [r-s] matches only r and s, but the expression [s-t] matches s, ß or t.
<p>The interpretation of range expressions where the ending range point is also the starting range point of a subsequent range expression (for instance [a-m-o]) is undefined.
<p>The hyphen character will be treated as itself if it occurs first (after an initial ^, if any) or last in the list, or as an ending range point in a range expression. As examples, the expressions [-ac] and [ac-] are equivalent and match any of the characters a, c or -; [^-ac] and [^ac-] are equivalent and match any characters except a, c or -; the expression [%--] matches any of the characters between % and - inclusive; the expression [--@] matches any of the characters between - and @ inclusive; and the expression [a--@] is invalid, because the letter a follows the symbol - in the POSIX locale. To use a hyphen as the starting range point, it must either come first in the bracket expression or be specified as a collating symbol, for example: [][.-.]-0], which matches either a right bracket or any character or collating element that collates between hyphen and 0, inclusive.
<p>If a bracket expression must specify both - and ], the ] must be placed first (after the ^, if any) and the - last within the bracket expression.
<p></ol>
<h4><a name = "tag_007_003_006"> </a>BREs Matching Multiple Characters</h4>
<xref type="3" name="bremult"></xref>The following rules can be used to construct BREs matching multiple characters from BREs 
matching a single character:
<ol><p><li>The concatenation of BREs matches the concatenation of the strings matched by each component of the BRE.
<p><li>A <i>subexpression</i> can be defined within a BRE by enclosing it between the character pairs \( and \). Such a subexpression matches whatever it would have matched without the \( and \), except that anchoring within subexpressions is optional behaviour; see <xref href=breanc><a href="#tag_007_003_008">BRE Expression Anchoring</a></xref>. Subexpressions can be arbitrarily nested.
<p><li>The <i>back-reference</i> expression \<i>n</i> matches the same (possibly empty) string of characters as was matched by a subexpression enclosed between \( and \) preceding the \<i>n</i>. The character <i>n</i> must be a digit from 1 to 9 inclusive, specifying the <i>n</i>th subexpression (the one that begins with the <i>n</i>th \( and ends with the corresponding paired \)). The expression is invalid if fewer than <i>n</i> subexpressions precede the \<i>n</i>. For example, the expression ^\(.*\)\1$ matches a line consisting of two adjacent appearances of the same string, and the expression \(a\)*\1 fails to match a. The limit of nine back-references to subexpressions in the RE is based on the use of a single digit identifier. This does not imply that only nine subexpressions are allowed in REs. The following is a valid BRE with ten subexpressions:
<pre><code>\(\(\(ab\)*c\)*d\)\(ef\)*\(gh\)\{2\}\(ij\)*\(kl\)*\(mn\)*\(op\)*\(qr\)*</code></pre>
<p><li>When a BRE matching a single character, a subexpression or a back-reference is followed by the special character asterisk (*), together with that asterisk it matches what zero or more consecutive occurrences of the BRE would match. For example, [ab]* and [ab][ab] are equivalent when matching the string ab.
<p><li>When a BRE matching a single character, a subexpression or a back-reference is followed by an <i>interval expression</i> of the format \{<i>m</i>\}, \{<i>m</i>,\} or \{<i>m</i>,<i>n</i>\}, together with that interval expression it 
matches what repeated consecutive occurrences of the BRE would match. The values of <i>m</i> and <i>n</i> will be decimal integers in the range 0 &lt;= <i>m</i> &lt;= <i>n</i> &lt;= {RE_DUP_MAX}, where <i>m</i> specifies the exact or minimum number of occurrences and <i>n</i> specifies the maximum number of occurrences. The expression \{<i>m</i>\} matches exactly <i>m</i> occurrences of the preceding BRE, \{<i>m</i>,\} matches at least <i>m</i> occurrences and \{<i>m,n</i>\} matches any number of occurrences between <i>m</i> and <i>n</i>, inclusive.
<p>For example, in the string abababccccccd the BRE c\{3\} is matched by characters seven to nine, the BRE \(ab\)\{4,\} is not matched at all and the BRE c\{1,3\}d is matched by characters ten to thirteen.
<p></ol>
<p>The behaviour of multiple adjacent duplication symbols (* and intervals) produces undefined results.
<h4><a name = "tag_007_003_007"> </a>BRE Precedence</h4>
The order of precedence is as shown in the following table:
<pre><table bordercolor=#000000 border=1 align=center><tr valign=top><th colspan=2 align=center><b>BRE Precedence (from high to low)</b><tr valign=top><td align=left>collation-related bracket symbols<td align=left>[= =] [: :] [. .]<tr valign=top><td align=left>escaped characters<td align=left>\&lt;<i>special character</i>&gt;<tr valign=top><td align=left>bracket expression<td align=left>[ ]