@subsection The Match-beginning-of-word Operator (@code{\<})

@cindex @samp{\<}

This operator (represented by @samp{\<}) matches the empty string at the
beginning of a word.

@node Match-end-of-word Operator, Match-word-constituent Operator, Match-beginning-of-word Operator, Word Operators
@subsection The Match-end-of-word Operator (@code{\>})

@cindex @samp{\>}

This operator (represented by @samp{\>}) matches the empty string at the
end of a word.

@node Match-word-constituent Operator, Match-non-word-constituent Operator, Match-end-of-word Operator, Word Operators
@subsection The Match-word-constituent Operator (@code{\w})

@cindex @samp{\w}

This operator (represented by @samp{\w}) matches any word-constituent
character.

@node Match-non-word-constituent Operator,  , Match-word-constituent Operator, Word Operators
@subsection The Match-non-word-constituent Operator (@code{\W})

@cindex @samp{\W}

This operator (represented by @samp{\W}) matches any character that is
not word-constituent.

@node Buffer Operators,  , Word Operators, GNU Operators
@section Buffer Operators

Following are operators which work on buffers.  In Emacs, a @dfn{buffer}
is, naturally, an Emacs buffer.
For other programs, Regex considers the
entire string to be matched as the buffer.

@menu
* Match-beginning-of-buffer Operator::  \`
* Match-end-of-buffer Operator::        \'
@end menu

@node Match-beginning-of-buffer Operator, Match-end-of-buffer Operator,  , Buffer Operators
@subsection The Match-beginning-of-buffer Operator (@code{\`})

@cindex @samp{\`}

This operator (represented by @samp{\`}) matches the empty string at the
beginning of the buffer.

@node Match-end-of-buffer Operator,  , Match-beginning-of-buffer Operator, Buffer Operators
@subsection The Match-end-of-buffer Operator (@code{\'})

@cindex @samp{\'}

This operator (represented by @samp{\'}) matches the empty string at the
end of the buffer.

@node GNU Emacs Operators, What Gets Matched?, GNU Operators, Top
@chapter GNU Emacs Operators

Following are operators that @sc{gnu} defines (and @sc{posix} doesn't)
that you can use only when Regex is compiled with the preprocessor
symbol @code{emacs} defined.

@menu
* Syntactic Class Operators::
@end menu

@node Syntactic Class Operators,  ,  , GNU Emacs Operators
@section Syntactic Class Operators

The operators in this section require Regex to recognize the syntactic
classes of characters.  Regex uses a syntax table to determine this.

@menu
* Emacs Syntax Tables::
* Match-syntactic-class Operator::      \sCLASS
* Match-not-syntactic-class Operator::  \SCLASS
@end menu

@node Emacs Syntax Tables, Match-syntactic-class Operator,  , Syntactic Class Operators
@subsection Emacs Syntax Tables

A @dfn{syntax table} is an array indexed by the characters in your
character set.  In the @sc{ascii} encoding, therefore, a syntax table
has 256 elements.

If Regex is compiled with the preprocessor symbol @code{emacs} defined,
then Regex expects you to define and initialize the variable
@code{re_syntax_table} to be an Emacs syntax table.  Emacs' syntax
tables are more complicated than Regex's own (@pxref{Non-Emacs Syntax
Tables}).
@xref{Syntax, , Syntax, emacs, The GNU Emacs User's Manual},
for a description of Emacs' syntax tables.

@node Match-syntactic-class Operator, Match-not-syntactic-class Operator, Emacs Syntax Tables, Syntactic Class Operators
@subsection The Match-syntactic-class Operator (@code{\s}@var{class})

@cindex @samp{\s}

This operator matches any character whose syntactic class is represented
by a specified character.  @samp{\s@var{class}} represents this operator
where @var{class} is the character representing the syntactic class you
want.  For example, @samp{w} represents the syntactic
class of word-constituent characters, so @samp{\sw} matches any
word-constituent character.

@node Match-not-syntactic-class Operator,  , Match-syntactic-class Operator, Syntactic Class Operators
@subsection The Match-not-syntactic-class Operator (@code{\S}@var{class})

@cindex @samp{\S}

This operator is similar to the match-syntactic-class operator except
that it matches any character whose syntactic class is @emph{not}
represented by the specified character.  @samp{\S@var{class}} represents
this operator.  For example, @samp{w} represents the syntactic class of
word-constituent characters, so @samp{\Sw} matches any character that is
not word-constituent.

@node What Gets Matched?, Programming with Regex, GNU Emacs Operators, Top
@chapter What Gets Matched?

Regex usually matches strings according to the ``leftmost longest''
rule; that is, it chooses the longest of the leftmost matches.  This
does not mean that, for a regular expression containing subexpressions,
it simply chooses the longest match for each subexpression, left to
right; the overall match must also be the longest possible one.

For example, @samp{(ac*)(c*d[ac]*)\1} matches @samp{acdacaaa}, not
@samp{acdac}, as it would if it were to choose the longest match for the
first subexpression.

@node Programming with Regex, Copying, What Gets Matched?, Top
@chapter Programming with Regex

Here we describe how you use the Regex data structures and functions in
C programs.
Regex has three interfaces: one designed for @sc{gnu}, one
compatible with @sc{posix} and one compatible with Berkeley @sc{unix}.

@menu
* GNU Regex Functions::
* POSIX Regex Functions::
* BSD Regex Functions::
@end menu

@node GNU Regex Functions, POSIX Regex Functions,  , Programming with Regex
@section GNU Regex Functions

If you're writing code that doesn't need to be compatible with either
@sc{posix} or Berkeley @sc{unix}, you can use these functions.  They
provide more options than the other interfaces.

@menu
* GNU Pattern Buffers::                 The re_pattern_buffer type.
* GNU Regular Expression Compiling::    re_compile_pattern ()
* GNU Matching::                        re_match ()
* GNU Searching::                       re_search ()
* Matching/Searching with Split Data::  re_match_2 (), re_search_2 ()
* Searching with Fastmaps::             re_compile_fastmap ()
* GNU Translate Tables::                The `translate' field.
* Using Registers::                     The re_registers type and related fns.
* Freeing GNU Pattern Buffers::         regfree ()
@end menu

@node GNU Pattern Buffers, GNU Regular Expression Compiling,  , GNU Regex Functions
@subsection GNU Pattern Buffers

@cindex pattern buffer, definition of
@tindex re_pattern_buffer @r{definition}
@tindex struct re_pattern_buffer @r{definition}

To compile, match, or search for a given regular expression, you must
supply a pattern buffer.  A @dfn{pattern buffer} holds one compiled
regular expression.@footnote{Regular expressions are also referred to as
``patterns,'' hence the name ``pattern buffer.''}

You can have several different pattern buffers simultaneously, each
holding a compiled pattern for a different regular expression.

@file{regex.h} defines the pattern buffer @code{struct} as follows:

@example
[[[ pattern_buffer ]]]
@end example

@node GNU Regular Expression Compiling, GNU Matching, GNU Pattern Buffers, GNU Regex Functions
@subsection GNU Regular Expression Compiling

In @sc{gnu}, you can both match and search for a given regular
expression.
To do either, you must first compile it in a pattern buffer
(@pxref{GNU Pattern Buffers}).

@cindex syntax initialization
@vindex re_syntax_options @r{initialization}
Regular expressions match according to the syntax with which they were
compiled; with @sc{gnu}, you indicate what syntax you want by setting
the variable @code{re_syntax_options} (declared in @file{regex.h} and
defined in @file{regex.c}) before calling the compiling function,
@code{re_compile_pattern} (see below).  @xref{Syntax Bits}, and
@ref{Predefined Syntaxes}.

You can change the value of @code{re_syntax_options} at any time.
Usually, however, you set its value once and then never change it.

@cindex pattern buffer initialization
@code{re_compile_pattern} takes a pattern buffer as an argument.  You
must initialize the following fields:

@table @code

@item translate
@vindex translate @r{initialization}
Initialize this to point to a translate table if you want one, or to
zero if you don't.  We explain translate tables in @ref{GNU Translate
Tables}.

@item fastmap
@vindex fastmap @r{initialization}
Initialize this to nonzero if you want a fastmap, or to zero if you
don't.

@item buffer
@itemx allocated
@vindex buffer @r{initialization}
@vindex allocated @r{initialization}
@findex malloc
If you want @code{re_compile_pattern} to allocate memory for the
compiled pattern, set both of these to zero.
If you have an existing
block of memory (allocated with @code{malloc}) you want Regex to use,
set @code{buffer} to its address and @code{allocated} to its size (in
bytes).

@code{re_compile_pattern} uses @code{realloc} to extend the space for
the compiled pattern as necessary.

@end table

To compile a pattern buffer, use:

@findex re_compile_pattern
@example
char *
re_compile_pattern (const char *@var{regex}, const int @var{regex_size},
                    struct re_pattern_buffer *@var{pattern_buffer})
@end example

@noindent
@var{regex} is the regular expression's address, @var{regex_size} is its
length, and @var{pattern_buffer} is the pattern buffer's address.

If @code{re_compile_pattern} successfully compiles the regular
expression, it returns zero and sets @code{*@var{pattern_buffer}} to the
compiled pattern.  It sets the pattern buffer's fields as follows:

@table @code
@item buffer
@vindex buffer @r{field, set by @code{re_compile_pattern}}
to the compiled pattern.

@item used
@vindex used @r{field, set by @code{re_compile_pattern}}
to the number of bytes the compiled pattern in @code{buffer} occupies.

@item syntax
@vindex syntax @r{field, set by @code{re_compile_pattern}}
to the current value of @code{re_syntax_options}.

@item re_nsub
@vindex re_nsub @r{field, set by @code{re_compile_pattern}}
to the number of subexpressions in @var{regex}.

@item fastmap_accurate
@vindex fastmap_accurate @r{field, set by @code{re_compile_pattern}}
to zero on the theory that the pattern you're compiling is different
than the one previously compiled into @code{buffer}; in that case (since
you can't make a fastmap without a compiled pattern),
@code{fastmap} would either contain an incompatible fastmap, or nothing
at all.

@c xx what else?
@end table

If @code{re_compile_pattern} can't compile @var{regex}, it returns an
error string corresponding to one of the errors listed in @ref{POSIX
Regular Expression Compiling}.

@node GNU Matching, GNU Searching, GNU Regular Expression Compiling, GNU Regex Functions
@subsection GNU Matching

@cindex matching with GNU functions

Matching the @sc{gnu} way means trying to match as much of a string as
possible starting at a position within it you specify.  Once you've compiled
a pattern into a pattern buffer (@pxref{GNU Regular Expression
Compiling}), you can ask the matcher to match that pattern against a
string using:

@findex re_match
@example
int
re_match (struct re_pattern_buffer *@var{pattern_buffer},
          const char *@var{string}, const int @var{size},
          const int @var{start}, struct re_registers *@var{regs})
@end example

@noindent
@var{pattern_buffer} is the address of a pattern buffer containing a
compiled pattern.  @var{string} is the string you want to match; it can
contain newline and null characters.  @var{size} is the length of that
string.  @var{start} is the string index at which you want to
begin matching; the first character of @var{string} is at index zero.
@xref{Using Registers}, for an explanation of @var{regs}; you can safely
pass zero.

@code{re_match} matches the regular expression in @var{pattern_buffer}
against the string @var{string} according to the syntax in
@var{pattern_buffer}'s @code{syntax} field.  (@xref{GNU Regular
Expression Compiling}, for how to set it.)  The function returns
@math{-1} if the compiled pattern does not match any part of
@var{string} and @math{-2} if an internal error happens; otherwise, it
returns how many (possibly zero) characters of @var{string} the pattern
matched.

An example: suppose @var{pattern_buffer} points to a pattern buffer
containing the compiled pattern for @samp{a*}, and @var{string} points
to @samp{aaaaab} (whereupon @var{size} should be 6).  Then if @var{start}
is 2, @code{re_match} returns 3, i.e., @samp{a*} would have matched the
last three @samp{a}s in @var{string}.  If @var{start} is 0,
@code{re_match} returns 5, i.e., @samp{a*} would have matched all the
@samp{a}s in @var{string}.
If @var{start} is either 5 or 6, it returns
zero.

If @var{start} is not between zero and @var{size}, then
@code{re_match} returns @math{-1}.

@node GNU Searching, Matching/Searching with Split Data, GNU Matching, GNU Regex Functions
@subsection GNU Searching

@cindex searching with GNU functions

@dfn{Searching} means trying to match starting at successive positions
within a string.  The function @code{re_search} does this.

Before calling @code{re_search}, you must compile your regular
expression.  @xref{GNU Regular Expression Compiling}.
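
The compile-then-match sequence described in the preceding sections can be
sketched in C roughly as follows.  This is only an illustration, under
the assumption that you are building against a C library (such as the
@sc{gnu} C library) whose @file{regex.h} exposes the @sc{gnu} interface
once @code{_GNU_SOURCE} is defined.  The helper name @code{match_from}
and its @math{-3} return value for a failed compile are hypothetical
conventions of this sketch, not part of Regex.

```c
#define _GNU_SOURCE
#include <regex.h>
#include <string.h>

/* Hypothetical helper (not a Regex function): compile PATTERN with the
   GNU interface and return whatever re_match reports when matching TEXT
   starting at index START.  Returns -3 (this sketch's own convention)
   if the pattern cannot be compiled.  */
int
match_from (const char *pattern, const char *text, int start)
{
  struct re_pattern_buffer pattern_buffer;
  const char *error;
  int matched;

  /* Zeroing the whole struct sets translate, fastmap, buffer and
     allocated to zero: no translate table, no fastmap, and Regex
     allocates the space for the compiled pattern itself.  */
  memset (&pattern_buffer, 0, sizeof pattern_buffer);

  /* Choose the syntax before calling the compiling function.  */
  re_syntax_options = RE_SYNTAX_EMACS;

  /* A zero return means the pattern compiled successfully.  */
  error = re_compile_pattern (pattern, strlen (pattern), &pattern_buffer);
  if (error != 0)
    return -3;

  /* Pass zero for regs, since only the match length is wanted.  */
  matched = (int) re_match (&pattern_buffer, text, (int) strlen (text),
                            start, 0);

  /* Release the memory Regex allocated for the compiled pattern.  */
  regfree (&pattern_buffer);
  return matched;
}
```

With the pattern @samp{a*} and the string @samp{aaaaab} from the
matching example, @code{match_from ("a*", "aaaaab", 2)} should return 3
and a @var{start} of 0 should give 5, in line with the values stated
there.  A real program would normally compile the pattern once and reuse
the pattern buffer for many matches rather than recompiling on every
call, as this sketch does for brevity.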