@item
Otherwise, @samp{*} is ordinary.

@end enumerate

@cindex backtracking
The matcher processes a match-zero-or-more operator by first matching as
many repetitions of the smallest preceding regular expression as it can.
Then it continues to match the rest of the pattern.

If it can't match the rest of the pattern, it backtracks (as many times
as necessary), each time discarding one of the matches until it can
either match the entire pattern or be certain that it cannot get a
match.  For example, when matching @samp{ca*ar} against @samp{caaar},
the matcher first matches all three @samp{a}s of the string with the
@samp{a*} of the regular expression.  However, it cannot then match the
final @samp{ar} of the regular expression against the final @samp{r} of
the string.  So it backtracks, discarding the match of the last @samp{a}
in the string.  It can then match the remaining @samp{ar}.


@node Match-one-or-more Operator, Match-zero-or-one Operator, Match-zero-or-more Operator, Repetition Operators
@subsection The Match-one-or-more Operator (@code{+} or @code{\+})

@cindex @samp{+}

If the syntax bit @code{RE_LIMITED_OPS} is set, then Regex doesn't recognize
this operator.  Otherwise, if the syntax bit @code{RE_BK_PLUS_QM} isn't
set, then @samp{+} represents this operator; if it is, then @samp{\+}
does.

This operator is similar to the match-zero-or-more operator except that
it repeats the preceding regular expression at least once;
@pxref{Match-zero-or-more Operator}, for what it operates on, how some
syntax bits affect it, and how Regex backtracks to match it.

For example, supposing that @samp{+} represents the match-one-or-more
operator, then @samp{ca+r} matches, e.g., @samp{car} and
@samp{caaaar}, but not @samp{cr}.

@node Match-zero-or-one Operator, Interval Operators, Match-one-or-more Operator, Repetition Operators
@subsection The Match-zero-or-one Operator (@code{?} or @code{\?})

@cindex @samp{?}

If the syntax bit @code{RE_LIMITED_OPS} is set, then Regex doesn't
recognize this operator.
Otherwise, if the syntax bit
@code{RE_BK_PLUS_QM} isn't set, then @samp{?} represents this operator;
if it is, then @samp{\?} does.

This operator is similar to the match-zero-or-more operator except that
it repeats the preceding regular expression once or not at all;
@pxref{Match-zero-or-more Operator}, to see what it operates on, how
some syntax bits affect it, and how Regex backtracks to match it.

For example, supposing that @samp{?} represents the match-zero-or-one
operator, then @samp{ca?r} matches both @samp{car} and @samp{cr}, but
nothing else.

@node Interval Operators, , Match-zero-or-one Operator, Repetition Operators
@subsection Interval Operators (@code{@{} @dots{} @code{@}} or @code{\@{} @dots{} @code{\@}})

@cindex interval expression
@cindex @samp{@{}
@cindex @samp{@}}
@cindex @samp{\@{}
@cindex @samp{\@}}

If the syntax bit @code{RE_INTERVALS} is set, then Regex recognizes
@dfn{interval expressions}.  They repeat the smallest possible preceding
regular expression a specified number of times.

If the syntax bit @code{RE_NO_BK_BRACES} is set, @samp{@{} represents
the @dfn{open-interval operator} and @samp{@}} represents the
@dfn{close-interval operator}; otherwise, @samp{\@{} and @samp{\@}} do.

Specifically, supposing that @samp{@{} and @samp{@}} represent the
open-interval and close-interval operators, then:

@table @code
@item @{@var{count}@}
matches exactly @var{count} occurrences of the preceding regular
expression.

@item @{@var{min},@}
matches @var{min} or more occurrences of the preceding regular
expression.

@item @{@var{min},@var{max}@}
matches at least @var{min} but no more than @var{max} occurrences of
the preceding regular expression.

@end table

The interval expression (but not necessarily the regular expression that
contains it) is invalid if:

@itemize @bullet
@item
@var{min} is greater than @var{max}, or

@item
any of @var{count}, @var{min}, or @var{max} is outside the range
zero to @code{RE_DUP_MAX} (a symbol that @file{regex.h}
defines).

@end itemize

If the interval expression is invalid and
the syntax bit
@code{RE_NO_BK_BRACES} is set, then Regex considers all the
characters in the would-be interval to be ordinary.  If that bit
isn't set, then the regular expression is invalid.

If the interval expression is valid but there is no preceding regular
expression on which to operate, then if the syntax bit
@code{RE_CONTEXT_INVALID_OPS} is set, the regular expression is invalid.
If that bit isn't set, then Regex considers all the characters---other
than backslashes, which it ignores---in the would-be interval to be
ordinary.


@node Alternation Operator, List Operators, Repetition Operators, Common Operators
@section The Alternation Operator (@code{|} or @code{\|})

@kindex |
@kindex \|
@cindex alternation operator
@cindex or operator

If the syntax bit @code{RE_LIMITED_OPS} is set, then Regex doesn't
recognize this operator.  Otherwise, if the syntax bit
@code{RE_NO_BK_VBAR} is set, then @samp{|} represents this operator;
otherwise, @samp{\|} does.

Alternatives match one of a choice of regular expressions:
if you put the character(s) representing the alternation operator between
any two regular expressions @var{a} and @var{b}, the result matches
the union of the strings that @var{a} and @var{b} match.  For
example, supposing that @samp{|} is the alternation operator, then
@samp{foo|bar|quux} would match any of @samp{foo}, @samp{bar} or
@samp{quux}.

@ignore
@c Nobody needs to disallow empty alternatives any more.
If the syntax bit @code{RE_NO_EMPTY_ALTS} is set, then if either of the regular
expressions @var{a} or @var{b} is empty, the
regular expression is invalid.
More precisely, if this syntax bit is
set, then the alternation operator can't:

@itemize @bullet
@item
be first or last in a regular expression;

@item
follow either another alternation operator or an open-group operator
(@pxref{Grouping Operators}); or

@item
precede a close-group operator.

@end itemize

@noindent
For example, supposing @samp{(} and @samp{)} represent the open and
close-group operators, then @samp{|foo}, @samp{foo|}, @samp{foo||bar},
@samp{foo(|bar)}, and @samp{(foo|)bar} would all be invalid.
@end ignore

The alternation operator operates on the @emph{largest} possible
surrounding regular expressions.  (Put another way, it has the lowest
precedence of any regular expression operator.)
Thus, the only way you can
delimit its arguments is to use grouping.  For example, if @samp{(} and
@samp{)} are the open and close-group operators, then @samp{fo(o|b)ar}
would match either @samp{fooar} or @samp{fobar}.  (@samp{foo|bar} would
match @samp{foo} or @samp{bar}.)

@cindex backtracking
The matcher usually tries all combinations of alternatives so as to
match the longest possible string.  For example, when matching
@samp{(fooq|foo)*(qbarquux|bar)} against @samp{fooqbarquux}, it cannot
take, say, the first (``depth-first'') combination it could match, since
then it would be content to match just @samp{fooqbar}.

@comment xx something about leftmost-longest


@node List Operators, Grouping Operators, Alternation Operator, Common Operators
@section List Operators (@code{[} @dots{} @code{]} and @code{[^} @dots{} @code{]})

@cindex matching list
@cindex @samp{[}
@cindex @samp{]}
@cindex @samp{^}
@cindex @samp{-}
@cindex @samp{\}
@cindex @samp{[^}
@cindex nonmatching list
@cindex matching newline
@cindex bracket expression

@dfn{Lists}, also called @dfn{bracket expressions}, are a set of one or
more items.  An @dfn{item} is a character,
@ignore
(These get added when they get implemented.)
a collating symbol, an equivalence class expression,
@end ignore
a character class expression, or a range expression.
The syntax bits
affect which kinds of items you can put in a list.  We explain the last
two items in subsections below.  Empty lists are invalid.

A @dfn{matching list} matches a single character represented by one of
the list items.  You form a matching list by enclosing one or more items
within an @dfn{open-matching-list operator} (represented by @samp{[})
and a @dfn{close-list operator} (represented by @samp{]}).

For example, @samp{[ab]} matches either @samp{a} or @samp{b}.
@samp{[ad]*} matches the empty string and any string composed of just
@samp{a}s and @samp{d}s in any order.  Regex considers invalid a regular
expression with a @samp{[} but no matching
@samp{]}.

@dfn{Nonmatching lists} are similar to matching lists except that they
match a single character @emph{not} represented by one of the list
items.  You use an @dfn{open-nonmatching-list operator} (represented by
@samp{[^}@footnote{Regex therefore doesn't consider the @samp{^} to be
the first character in the list.  If you put a @samp{^} character first
in (what you think is) a matching list, you'll turn it into a
nonmatching list.}) instead of an open-matching-list operator to start a
nonmatching list.

For example, @samp{[^ab]} matches any character except @samp{a} or
@samp{b}.

If the @code{posix_newline} field in the pattern buffer
(@pxref{GNU Pattern Buffers}) is set, then nonmatching lists do not
match a newline.

Most characters lose any special meaning inside a list.  The special
characters inside a list follow.

@table @samp
@item ]
ends the list if it's not the first list item.
So, if you want to make
the @samp{]} character a list item, you must put it first.

@item \
quotes the next character if the syntax bit @code{RE_BACKSLASH_ESCAPE_IN_LISTS} is
set.

@ignore
Put these in if they get implemented.

@item [.
represents the open-collating-symbol operator
(@pxref{Collating Symbol Operators}).

@item .]
represents the close-collating-symbol operator.

@item [=
represents the open-equivalence-class operator
(@pxref{Equivalence Class Operators}).

@item =]
represents the close-equivalence-class operator.

@end ignore

@item [:
represents the open-character-class operator
(@pxref{Character Class Operators}) if the syntax bit
@code{RE_CHAR_CLASSES} is set and what
follows is a valid character class expression.

@item :]
represents the close-character-class operator if the syntax bit
@code{RE_CHAR_CLASSES} is set and what precedes it is an
open-character-class operator followed by a valid character class name.

@item -
represents the range operator (@pxref{Range Operator}) if it's
not first or last in a list or the ending point of a range.

@end table

@noindent
All other characters are ordinary.  For example, @samp{[.*]} matches
@samp{.} and @samp{*}.

@menu
* Character Class Operators::   [:class:]
* Range Operator::              start-end
@end menu

@ignore
(If collating symbols and equivalence class expressions get implemented,
then add this.)

node Collating Symbol Operators
subsubsection Collating Symbol Operators (@code{[.} @dots{} @code{.]})

If the syntax bit @code{XX} is set, then you can represent
collating symbols inside lists.  You form a @dfn{collating symbol} by
putting a collating element between an @dfn{open-collating-symbol
operator} and a @dfn{close-collating-symbol operator}.  @samp{[.}
represents the open-collating-symbol operator and @samp{.]} represents
the close-collating-symbol operator.
For example, if @samp{ll} is a
collating element, then @samp{[[.ll.]]} would match @samp{ll}.

node Equivalence Class Operators
subsubsection Equivalence Class Operators (@code{[=} @dots{} @code{=]})
@cindex equivalence class expression in regex
@cindex @samp{[=} in regex
@cindex @samp{=]} in regex

If the syntax bit @code{XX} is set, then Regex recognizes equivalence class
expressions inside lists.  An @dfn{equivalence class expression} is a set
of collating elements which all belong to the same equivalence class.
You form an equivalence class expression by putting a collating
element between an @dfn{open-equivalence-class operator} and a
@dfn{close-equivalence-class operator}.  @samp{[=} represents the
open-equivalence-class operator and @samp{=]} represents the
close-equivalence-class operator.  For example, if @samp{a} and @samp{A}
were an equivalence class, then both @samp{[[=a=]]} and @samp{[[=A=]]}
would match both @samp{a} and @samp{A}.  If the collating element in an
equivalence class expression isn't part of an equivalence class, then
the matcher considers the equivalence class expression to be a collating
symbol.

@end ignore

@node Character Class Operators, Range Operator, , List Operators
@subsection Character Class Operators (@code{[:} @dots{} @code{:]})

@cindex character classes
@cindex @samp{[:} in regex
@cindex @samp{:]} in regex

If the syntax bit @code{RE_CHAR_CLASSES} is set, then Regex
recognizes character class expressions inside lists.  A @dfn{character
class expression} matches one character from a given class.  You form a
character class expression by putting a character class name between an
@dfn{open-character-class operator} (represented by @samp{[:}) and a
@dfn{close-character-class operator} (represented by @samp{:]}).
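
As a rough illustration of how a character class expression sits inside
a list, the following C sketch tests a pattern against a string.  It
assumes the POSIX @code{regcomp} and @code{regexec} interface with
@code{REG_EXTENDED}, chosen here only for brevity; it does not use the
syntax bits described in this chapter, and POSIX extended syntax
recognizes character classes unconditionally.  The helper name
@code{matches} is hypothetical and not part of Regex.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper, not part of Regex.  Returns 1 if PATTERN,
   compiled as a POSIX extended regular expression, matches somewhere
   in TEXT; 0 if it does not match; -1 if PATTERN fails to compile.  */
int
matches (const char *pattern, const char *text)
{
  regex_t compiled;
  int status;

  /* REG_NOSUB: we only want a yes/no answer, not match positions.  */
  if (regcomp (&compiled, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return -1;

  status = regexec (&compiled, text, 0, NULL, 0);
  regfree (&compiled);

  return status == 0 ? 1 : 0;
}
```

Under these assumptions, @code{matches ("[[:digit:]]", "abc7")} returns
1 and @code{matches ("[[:digit:]]", "abc")} returns 0.  Note the doubled
brackets: the outer pair delimits the list, and the inner @samp{[:} and
@samp{:]} delimit the character class expression.
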
The character class names and their meanings are:

@table @code

@item alnum
letters and digits

@item alpha
letters

@item blank
system-dependent; for @sc{gnu}, a space or tab

@item cntrl
control characters (in the @sc{ascii} encoding, code 0177 and codes
less than 040)

@item digit
digits

@item graph