@c xregex.texi

@subsection The Match-beginning-of-word Operator (@code{\<})

@cindex @samp{\<}

This operator (represented by @samp{\<}) matches the empty string at the
beginning of a word.


@node Match-end-of-word Operator, Match-word-constituent Operator, Match-beginning-of-word Operator, Word Operators
@subsection The Match-end-of-word Operator (@code{\>})

@cindex @samp{\>}

This operator (represented by @samp{\>}) matches the empty string at the
end of a word.


@node Match-word-constituent Operator, Match-non-word-constituent Operator, Match-end-of-word Operator, Word Operators
@subsection The Match-word-constituent Operator (@code{\w})

@cindex @samp{\w}

This operator (represented by @samp{\w}) matches any word-constituent
character.


@node Match-non-word-constituent Operator,  , Match-word-constituent Operator, Word Operators
@subsection The Match-non-word-constituent Operator (@code{\W})

@cindex @samp{\W}

This operator (represented by @samp{\W}) matches any character that is
not word-constituent.


@node Buffer Operators,  , Word Operators, GNU Operators
@section Buffer Operators

Following are operators which work on buffers.  In Emacs, a @dfn{buffer}
is, naturally, an Emacs buffer.
For other programs, Regex considers the
entire string to be matched as the buffer.

@menu
* Match-beginning-of-buffer Operator::  \`
* Match-end-of-buffer Operator::        \'
@end menu


@node Match-beginning-of-buffer Operator, Match-end-of-buffer Operator,  , Buffer Operators
@subsection The Match-beginning-of-buffer Operator (@code{\`})

@cindex @samp{\`}

This operator (represented by @samp{\`}) matches the empty string at the
beginning of the buffer.


@node Match-end-of-buffer Operator,  , Match-beginning-of-buffer Operator, Buffer Operators
@subsection The Match-end-of-buffer Operator (@code{\'})

@cindex @samp{\'}

This operator (represented by @samp{\'}) matches the empty string at the
end of the buffer.


@node GNU Emacs Operators, What Gets Matched?, GNU Operators, Top
@chapter GNU Emacs Operators

Following are operators that @sc{gnu} defines (and @sc{posix} doesn't)
that you can use only when Regex is compiled with the preprocessor
symbol @code{emacs} defined.

@menu
* Syntactic Class Operators::
@end menu


@node Syntactic Class Operators,  ,  , GNU Emacs Operators
@section Syntactic Class Operators

The operators in this section require Regex to recognize the syntactic
classes of characters.  Regex uses a syntax table to determine this.

@menu
* Emacs Syntax Tables::
* Match-syntactic-class Operator::      \sCLASS
* Match-not-syntactic-class Operator::  \SCLASS
@end menu


@node Emacs Syntax Tables, Match-syntactic-class Operator,  , Syntactic Class Operators
@subsection Emacs Syntax Tables

A @dfn{syntax table} is an array indexed by the characters in your
character set.  In the @sc{ascii} encoding, therefore, a syntax table
has 256 elements.

If Regex is compiled with the preprocessor symbol @code{emacs} defined,
then Regex expects you to define and initialize the variable
@code{re_syntax_table} to be an Emacs syntax table.  Emacs' syntax
tables are more complicated than Regex's own (@pxref{Non-Emacs Syntax
Tables}).
@xref{Syntax, , Syntax, emacs, The GNU Emacs User's Manual},
for a description of Emacs' syntax tables.


@node Match-syntactic-class Operator, Match-not-syntactic-class Operator, Emacs Syntax Tables, Syntactic Class Operators
@subsection The Match-syntactic-class Operator (@code{\s}@var{class})

@cindex @samp{\s}

This operator matches any character whose syntactic class is represented
by a specified character.  @samp{\s@var{class}} represents this operator
where @var{class} is the character representing the syntactic class you
want.  For example, @samp{w} represents the syntactic
class of word-constituent characters, so @samp{\sw} matches any
word-constituent character.


@node Match-not-syntactic-class Operator,  , Match-syntactic-class Operator, Syntactic Class Operators
@subsection The Match-not-syntactic-class Operator (@code{\S}@var{class})

@cindex @samp{\S}

This operator is similar to the match-syntactic-class operator except
that it matches any character whose syntactic class is @emph{not}
represented by the specified character.  @samp{\S@var{class}} represents
this operator.  For example, @samp{w} represents the syntactic class of
word-constituent characters, so @samp{\Sw} matches any character that is
not word-constituent.


@node What Gets Matched?, Programming with Regex, GNU Emacs Operators, Top
@chapter What Gets Matched?

Regex usually matches strings according to the ``leftmost longest''
rule; that is, it chooses the longest of the leftmost matches.
This does not mean that, for a regular expression containing
subexpressions, it simply chooses the longest match for each
subexpression, left to right; the overall match must also be the
longest possible one.

For example, @samp{(ac*)(c*d[ac]*)\1} matches @samp{acdacaaa}, not
@samp{acdac}, as it would if it were to choose the longest match for the
first subexpression.


@node Programming with Regex, Copying, What Gets Matched?, Top
@chapter Programming with Regex

Here we describe how you use the Regex data structures and functions in
C programs.  Regex has three interfaces: one designed for @sc{gnu}, one
compatible with @sc{posix} and one compatible with Berkeley @sc{unix}.

@menu
* GNU Regex Functions::
* POSIX Regex Functions::
* BSD Regex Functions::
@end menu


@node GNU Regex Functions, POSIX Regex Functions,  , Programming with Regex
@section GNU Regex Functions

If you're writing code that doesn't need to be compatible with either
@sc{posix} or Berkeley @sc{unix}, you can use these functions.  They
provide more options than the other interfaces.

@menu
* GNU Pattern Buffers::         The re_pattern_buffer type.
* GNU Regular Expression Compiling::  re_compile_pattern ()
* GNU Matching::                re_match ()
* GNU Searching::               re_search ()
* Matching/Searching with Split Data::  re_match_2 (), re_search_2 ()
* Searching with Fastmaps::     re_compile_fastmap ()
* GNU Translate Tables::        The `translate' field.
* Using Registers::             The re_registers type and related fns.
* Freeing GNU Pattern Buffers::  regfree ()
@end menu


@node GNU Pattern Buffers, GNU Regular Expression Compiling,  , GNU Regex Functions
@subsection GNU Pattern Buffers

@cindex pattern buffer, definition of
@tindex re_pattern_buffer @r{definition}
@tindex struct re_pattern_buffer @r{definition}

To compile, match, or search for a given regular expression, you must
supply a pattern buffer.
A @dfn{pattern buffer} holds one compiled
regular expression.@footnote{Regular expressions are also referred to as
``patterns,'' hence the name ``pattern buffer.''}

You can have several different pattern buffers simultaneously, each
holding a compiled pattern for a different regular expression.

@file{regex.h} defines the pattern buffer @code{struct} as follows:

@example
[[[ pattern_buffer ]]]
@end example


@node GNU Regular Expression Compiling, GNU Matching, GNU Pattern Buffers, GNU Regex Functions
@subsection GNU Regular Expression Compiling

In @sc{gnu}, you can both match and search for a given regular
expression.  To do either, you must first compile it in a pattern buffer
(@pxref{GNU Pattern Buffers}).

@cindex syntax initialization
@vindex re_syntax_options @r{initialization}
Regular expressions match according to the syntax with which they were
compiled; with @sc{gnu}, you indicate what syntax you want by setting
the variable @code{re_syntax_options} (declared in @file{regex.h} and
defined in @file{regex.c}) before calling the compiling function,
@code{re_compile_pattern} (see below).  @xref{Syntax Bits}, and
@ref{Predefined Syntaxes}.

You can change the value of @code{re_syntax_options} at any time.
Usually, however, you set its value once and then never change it.

@cindex pattern buffer initialization
@code{re_compile_pattern} takes a pattern buffer as an argument.  You
must initialize the following fields:

@table @code

@item translate
@vindex translate @r{initialization}
Initialize this to point to a translate table if you want one, or to
zero if you don't.  We explain translate tables in @ref{GNU Translate
Tables}.

@item fastmap
@vindex fastmap @r{initialization}
Initialize this to nonzero if you want a fastmap, or to zero if you
don't.

@item buffer
@itemx allocated
@vindex buffer @r{initialization}
@vindex allocated @r{initialization}
@findex malloc
If you want @code{re_compile_pattern} to allocate memory for the
compiled pattern, set both of these to zero.
If you have an existing
block of memory (allocated with @code{malloc}) you want Regex to use,
set @code{buffer} to its address and @code{allocated} to its size (in
bytes).

@code{re_compile_pattern} uses @code{realloc} to extend the space for
the compiled pattern as necessary.

@end table

To compile a pattern buffer, use:

@findex re_compile_pattern
@example
char *
re_compile_pattern (const char *@var{regex}, const int @var{regex_size},
                    struct re_pattern_buffer *@var{pattern_buffer})
@end example

@noindent
@var{regex} is the regular expression's address, @var{regex_size} is its
length, and @var{pattern_buffer} is the pattern buffer's address.

If @code{re_compile_pattern} successfully compiles the regular
expression, it returns zero and sets @code{*@var{pattern_buffer}} to the
compiled pattern.  It sets the pattern buffer's fields as follows:

@table @code
@item buffer
@vindex buffer @r{field, set by @code{re_compile_pattern}}
to the compiled pattern.

@item used
@vindex used @r{field, set by @code{re_compile_pattern}}
to the number of bytes the compiled pattern in @code{buffer} occupies.

@item syntax
@vindex syntax @r{field, set by @code{re_compile_pattern}}
to the current value of @code{re_syntax_options}.

@item re_nsub
@vindex re_nsub @r{field, set by @code{re_compile_pattern}}
to the number of subexpressions in @var{regex}.

@item fastmap_accurate
@vindex fastmap_accurate @r{field, set by @code{re_compile_pattern}}
to zero on the theory that the pattern you're compiling is different
from the one previously compiled into @code{buffer}; in that case (since
you can't make a fastmap without a compiled pattern),
@code{fastmap} would either contain an incompatible fastmap, or nothing
at all.

@c xx what else?
@end table

If @code{re_compile_pattern} can't compile @var{regex}, it returns an
error string corresponding to one of the errors listed in @ref{POSIX
Regular Expression Compiling}.


@node GNU Matching, GNU Searching, GNU Regular Expression Compiling, GNU Regex Functions
@subsection GNU Matching

@cindex matching with GNU functions

Matching the @sc{gnu} way means trying to match as much of a string as
possible starting at a position within it you specify.  Once you've compiled
a pattern into a pattern buffer (@pxref{GNU Regular Expression
Compiling}), you can ask the matcher to match that pattern against a
string using:

@findex re_match
@example
int
re_match (struct re_pattern_buffer *@var{pattern_buffer},
          const char *@var{string}, const int @var{size},
          const int @var{start}, struct re_registers *@var{regs})
@end example

@noindent
@var{pattern_buffer} is the address of a pattern buffer containing a
compiled pattern.  @var{string} is the string you want to match; it can
contain newline and null characters.  @var{size} is the length of that
string.  @var{start} is the string index at which you want to
begin matching; the first character of @var{string} is at index zero.
@xref{Using Registers}, for an explanation of @var{regs}; you can safely
pass zero.

@code{re_match} matches the regular expression in @var{pattern_buffer}
against the string @var{string} according to the syntax in
@var{pattern_buffer}'s @code{syntax} field.  (@xref{GNU Regular
Expression Compiling}, for how to set it.)  The function returns
@math{-1} if the compiled pattern does not match any part of
@var{string} and @math{-2} if an internal error happens; otherwise, it
returns how many (possibly zero) characters of @var{string} the pattern
matched.

An example: suppose @var{pattern_buffer} points to a pattern buffer
containing the compiled pattern for @samp{a*}, and @var{string} points
to @samp{aaaaab} (whereupon @var{size} should be 6).  Then if @var{start}
is 2, @code{re_match} returns 3, i.e., @samp{a*} would have matched the
last three @samp{a}s in @var{string}.  If @var{start} is 0,
@code{re_match} returns 5, i.e., @samp{a*} would have matched all the
@samp{a}s in @var{string}.
If @var{start} is either 5 or 6, it returns
zero.

If @var{start} is not between zero and @var{size}, then
@code{re_match} returns @math{-1}.


@node GNU Searching, Matching/Searching with Split Data, GNU Matching, GNU Regex Functions
@subsection GNU Searching

@cindex searching with GNU functions

@dfn{Searching} means trying to match starting at successive positions
within a string.  The function @code{re_search} does this.

Before calling @code{re_search}, you must compile your regular
expression.  @xref{GNU Regular Expression Compiling}.
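
The compile-then-match sequence described in the preceding sections can
be sketched in C roughly as follows.  This is only an illustration, not
code from the Regex distribution.  It assumes the GNU C Library's
@file{regex.h}, which declares the @code{re_*} functions only when
@code{_GNU_SOURCE} is defined; the helper names @code{compile_gnu},
@code{match_at} and @code{first_match_pos} are hypothetical, and the
choice of @code{RE_SYNTAX_POSIX_EXTENDED} is arbitrary.  The last helper
calls @code{re_match} at successive positions purely to illustrate the
definition of searching; it makes no claim about how @code{re_search}
itself is implemented.

```c
#define _GNU_SOURCE            /* assumption: glibc, which hides re_* otherwise */
#include <assert.h>
#include <regex.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: compile PATTERN into BUF.
   Returns 0 on success, -1 if re_compile_pattern reports an error. */
int compile_gnu(struct re_pattern_buffer *buf, const char *pattern)
{
    assert(buf != NULL && pattern != NULL);

    /* Zeroing the struct sets buffer, allocated, fastmap and translate
       to zero, so the library allocates the compiled pattern itself and
       uses neither a fastmap nor a translate table. */
    memset(buf, 0, sizeof *buf);

    /* The syntax must be chosen before compiling.  This is a global
       setting; the particular syntax here is an arbitrary choice. */
    re_syntax_options = RE_SYNTAX_POSIX_EXTENDED;

    const char *err = re_compile_pattern(pattern, strlen(pattern), buf);
    return err == NULL ? 0 : -1;
}

/* Hypothetical helper: compile PATTERN and match it against STRING
   beginning at index START.  Returns whatever re_match returns, or -2
   if the pattern does not compile. */
int match_at(const char *pattern, const char *string, int start)
{
    struct re_pattern_buffer buf;
    if (compile_gnu(&buf, pattern) != 0)
        return -2;

    /* Passing a null pointer for the registers is allowed. */
    int matched = re_match(&buf, string, (int) strlen(string), start, NULL);

    regfree(&buf);             /* release the compiled pattern */
    return matched;
}

/* Hypothetical helper: naive search that tries re_match at each
   successive index.  Returns the first index at which the pattern
   matches, -1 if there is none, or -2 on a compile error. */
int first_match_pos(const char *pattern, const char *string)
{
    struct re_pattern_buffer buf;
    int size = (int) strlen(string);
    int found = -1;

    if (compile_gnu(&buf, pattern) != 0)
        return -2;

    for (int start = 0; start <= size; start++) {
        if (re_match(&buf, string, size, start, NULL) >= 0) {
            found = start;
            break;
        }
    }

    regfree(&buf);
    return found;
}
```

Under those assumptions, @code{match_at ("a*", "aaaaab", 2)} should
yield 3 and @code{match_at ("a*", "aaaaab", 0)} should yield 5, in line
with the @code{re_match} example given earlier, while
@code{first_match_pos ("b", "aaaaab")} should report index 5.  Compiling
once and reusing the pattern buffer across calls would be the more
economical design; the helpers recompile each time only to stay short.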
