📄 regex.3

📁 Relational database: PostgreSQL 6.5.2
An empty substring is denoted by equal offsets,
both indicating the character following the empty substring.
.PP
The 0th member of the
.I pmatch
array is filled in to indicate what substring of
.I string
was matched by the entire RE.
Remaining members report what substring was matched by parenthesized
subexpressions within the RE;
member
.I i
reports subexpression
.IR i ,
with subexpressions counted (starting at 1) by the order of their opening
parentheses in the RE, left to right.
Unused entries in the array\(emcorresponding either to subexpressions that
did not participate in the match at all, or to subexpressions that do not
exist in the RE (that is, \fIi\fR\ > \fIpreg\fR\->\fIre_nsub\fR)\(emhave both
.I rm_so
and
.I rm_eo
set to \-1.
If a subexpression participated in the match several times,
the reported substring is the last one it matched.
(Note, as an example in particular, that when the RE `(b*)+' matches `bbb',
the parenthesized subexpression matches each of the three `b's and then
an infinite number of empty strings following the last `b',
so the reported substring is one of the empties.)
.PP
If REG_STARTEND is specified,
.I pmatch
must point to at least one
.I regmatch_t
(even if
.I nmatch
is 0 or REG_NOSUB was specified),
to hold the input offsets for REG_STARTEND.
Use for output is still entirely controlled by
.IR nmatch ;
if
.I nmatch
is 0 or REG_NOSUB was specified,
the value of
.IR pmatch [0]
will not be changed by a successful
.IR regexec .
.PP
.I Regerror
maps a non-zero
.I errcode
from either
.I regcomp
or
.I regexec
to a human-readable, printable message.
If
.I preg
is non-NULL,
the error code should have arisen from use of
the
.I regex_t
pointed to by
.IR preg ,
and if the error code came from
.IR regcomp ,
it should have been the result from the most recent
.I regcomp
using that
.IR regex_t .
.RI ( Regerror
may be able to supply a more detailed message using information
from the
.IR regex_t .)
.I Regerror
places the NUL-terminated message into the buffer pointed to by
.IR errbuf ,
limiting the length (including the NUL) to at most
.I errbuf_size
bytes.
If the whole message won't fit,
as much of it as will fit before the terminating NUL is supplied.
In any case,
the returned value is the size of buffer needed to hold the whole
message (including terminating NUL).
If
.I errbuf_size
is 0,
.I errbuf
is ignored but the return value is still correct.
.PP
If the
.I errcode
given to
.I regerror
is first ORed with REG_ITOA,
the ``message'' that results is the printable name of the error code,
e.g. ``REG_NOMATCH'',
rather than an explanation thereof.
If
.I errcode
is REG_ATOI,
then
.I preg
shall be non-NULL and the
.I re_endp
member of the structure it points to
must point to the printable name of an error code;
in this case, the result in
.I errbuf
is the decimal digits of
the numeric value of the error code
(0 if the name is not recognized).
REG_ITOA and REG_ATOI are intended primarily as debugging facilities;
they are extensions,
compatible with but not specified by POSIX 1003.2,
and should be used with
caution in software intended to be portable to other systems.
Be warned also that they are considered experimental and changes are possible.
.PP
.I Regfree
frees any dynamically-allocated storage associated with the compiled RE
pointed to by
.IR preg .
The remaining
.I regex_t
is no longer a valid compiled RE
and the effect of supplying it to
.I regexec
or
.I regerror
is undefined.
.PP
None of these functions references global variables except for tables
of constants;
all are safe for use from multiple threads if the arguments are safe.
.SH IMPLEMENTATION CHOICES
There are a number of decisions that 1003.2 leaves up to the implementor,
either by explicitly saying ``undefined'' or by virtue of them being
forbidden by the RE grammar.
This implementation treats them as follows.
.PP
See
.ZR
for a discussion of the definition of case-independent matching.
.PP
There is no particular limit on the length of REs,
except insofar as memory is limited.
Memory usage is approximately linear in RE size, and largely insensitive
to RE complexity, except for bounded repetitions.
See BUGS for one short RE using them
that will run almost any system out of memory.
.PP
A backslashed character other than one specifically given a magic meaning
by 1003.2 (such magic meanings occur only in obsolete [``basic''] REs)
is taken as an ordinary character.
.PP
Any unmatched [ is a REG_EBRACK error.
.PP
Equivalence classes cannot begin or end bracket-expression ranges.
The endpoint of one range cannot begin another.
.PP
RE_DUP_MAX, the limit on repetition counts in bounded repetitions, is 255.
.PP
A repetition operator (?, *, +, or bounds) cannot follow another
repetition operator.
A repetition operator cannot begin an expression or subexpression
or follow `^' or `|'.
.PP
`|' cannot appear first or last in a (sub)expression or after another `|',
i.e. an operand of `|' cannot be an empty subexpression.
An empty parenthesized subexpression, `()', is legal and matches an
empty (sub)string.
An empty string is not a legal RE.
.PP
A `{' followed by a digit is considered the beginning of bounds for a
bounded repetition, which must then follow the syntax for bounds.
A `{' \fInot\fR followed by a digit is considered an ordinary character.
.PP
`^' and `$' beginning and ending subexpressions in obsolete (``basic'')
REs are anchors, not ordinary characters.
.SH SEE ALSO
grep(1), re_format(7)
.PP
POSIX 1003.2, sections 2.8 (Regular Expression Notation)
and
B.5 (C Binding for Regular Expression Matching).
.SH DIAGNOSTICS
Non-zero error codes from
.I regcomp
and
.I regexec
include the following:
.PP
.nf
.ta \w'REG_ECOLLATE'u+3n
REG_NOMATCH	regexec() failed to match
REG_BADPAT	invalid regular expression
REG_ECOLLATE	invalid collating element
REG_ECTYPE	invalid character class
REG_EESCAPE	\e applied to unescapable character
REG_ESUBREG	invalid backreference number
REG_EBRACK	brackets [ ] not balanced
REG_EPAREN	parentheses ( ) not balanced
REG_EBRACE	braces { } not balanced
REG_BADBR	invalid repetition count(s) in { }
REG_ERANGE	invalid character range in [ ]
REG_ESPACE	ran out of memory
REG_BADRPT	?, *, or + operand invalid
REG_EMPTY	empty (sub)expression
REG_ASSERT	``can't happen''\(emyou found a bug
REG_INVARG	invalid argument, e.g. negative-length string
.fi
.SH HISTORY
Originally written by Henry Spencer.
Altered for inclusion in the 4.4BSD distribution.
.SH BUGS
This is an alpha release with known defects.
Please report problems.
.PP
There is one known functionality bug.
The implementation of internationalization is incomplete:
the locale is always assumed to be the default one of 1003.2,
and only the collating elements etc. of that locale are available.
.PP
The back-reference code is subtle and doubts linger about its correctness
in complex cases.
.PP
.I Regexec
performance is poor.
This will improve with later releases.
.I Nmatch
exceeding 0 is expensive;
.I nmatch
exceeding 1 is worse.
.I Regexec
is largely insensitive to RE complexity \fIexcept\fR that back
references are massively expensive.
RE length does matter; in particular, there is a strong speed bonus
for keeping RE length under about 30 characters,
with most special characters counting roughly double.
.PP
.I Regcomp
implements bounded repetitions by macro expansion,
which is costly in time and space if counts are large
or bounded repetitions are nested.
An RE like, say,
`((((a{1,100}){1,100}){1,100}){1,100}){1,100}'
will (eventually) run almost any existing machine out of swap space.
.PP
There are suspected problems with response to obscure error conditions.
Notably,
certain kinds of internal overflow,
produced only by truly enormous REs or by multiply nested bounded repetitions,
are probably not handled well.
.PP
Due to a mistake in 1003.2, things like `a)b' are legal REs because `)' is
a special character only in the presence of a previous unmatched `('.
This can't be fixed until the spec is fixed.
.PP
The standard's definition of back references is vague.
For example, does
`a\e(\e(b\e)*\e2\e)*d' match `abbbd'?
Until the standard is clarified,
behavior in such cases should not be relied on.
.PP
The implementation of word-boundary matching is a bit of a kludge,
and bugs may lurk in combinations of word-boundary matching and anchoring.
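.SH EXAMPLE
The following is a minimal sketch, not part of the original page,
of how the offset reporting and the
.I regerror
sizing rule described above can be used together.
It relies only on the POSIX calls and avoids the REG_STARTEND,
REG_ITOA and REG_ATOI extensions.
The helper names
.I re_message
and
.I re_offsets
are hypothetical and were chosen for this illustration.

```c
#include <assert.h>
#include <regex.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: return a malloc'd copy of the regerror() message.
 * The first call passes errbuf_size 0, so errbuf is ignored and only the
 * needed buffer size (terminating NUL included) is returned. */
char *re_message(int errcode, const regex_t *preg)
{
    size_t need = regerror(errcode, preg, NULL, 0);
    char *buf = malloc(need);

    if (buf != NULL)
        regerror(errcode, preg, buf, need);
    return buf;              /* caller frees; NULL if malloc failed */
}

/* Hypothetical helper: compile `re` as an extended RE, match it against
 * `string`, and fill pmatch[0..nmatch-1] with the reported offsets.
 * Returns 0 on a match, otherwise the regcomp()/regexec() error code. */
int re_offsets(const char *re, const char *string,
               size_t nmatch, regmatch_t pmatch[])
{
    regex_t preg;
    int rc;

    assert(re != NULL && string != NULL);

    rc = regcomp(&preg, re, REG_EXTENDED);
    if (rc != 0) {
        /* Compilation failed: report why; there is nothing to regfree(). */
        char *msg = re_message(rc, &preg);
        fprintf(stderr, "regcomp: %s\n", msg != NULL ? msg : "(no memory)");
        free(msg);
        return rc;
    }

    rc = regexec(&preg, string, nmatch, pmatch, 0);
    regfree(&preg);          /* preg is no longer a valid compiled RE */
    return rc;
}
```

Matching `(a)|(b)' against `xb' with
.I nmatch
set to 3 should report the whole match in
.IR pmatch [0]
and leave the entry for the first subexpression,
which did not participate in the match,
with both offsets set to \-1.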