
📄 pcre.3

📁 Apache V2.0.15 Alpha For Linuxhttpd-2_0_15-alpha.tar.Z
the following negative numbers:

  PCRE_ERROR_NULL       the argument \fIcode\fR was NULL
                        the argument \fIwhere\fR was NULL
  PCRE_ERROR_BADMAGIC   the "magic number" was not found
  PCRE_ERROR_BADOPTION  the value of \fIwhat\fR was invalid

The possible values for the third argument are defined in \fBpcre.h\fR, and are
as follows:

  PCRE_INFO_OPTIONS

Return a copy of the options with which the pattern was compiled. The fourth
argument should point to an \fBunsigned long int\fR variable. These option bits
are those specified in the call to \fBpcre_compile()\fR, modified by any
top-level option settings within the pattern itself, and with the PCRE_ANCHORED
bit forcibly set if the form of the pattern implies that it can match only at
the start of a subject string.

  PCRE_INFO_SIZE

Return the size of the compiled pattern, that is, the value that was passed as
the argument to \fBpcre_malloc()\fR when PCRE was getting memory in which to
place the compiled data. The fourth argument should point to a \fBsize_t\fR
variable.

  PCRE_INFO_CAPTURECOUNT

Return the number of capturing subpatterns in the pattern. The fourth argument
should point to an \fBint\fR variable.

  PCRE_INFO_BACKREFMAX

Return the number of the highest back reference in the pattern. The fourth
argument should point to an \fBint\fR variable. Zero is returned if there are
no back references.

  PCRE_INFO_FIRSTCHAR

Return information about the first character of any matched string, for a
non-anchored pattern. If there is a fixed first character, e.g. from a pattern
such as (cat|cow|coyote), it is returned in the integer pointed to by
\fIwhere\fR. Otherwise, if either

(a) the pattern was compiled with the PCRE_MULTILINE option, and every branch
starts with "^", or

(b) every branch of the pattern starts with ".*" and PCRE_DOTALL is not set
(if it were set, the pattern would be anchored),

-1 is returned, indicating that the pattern matches only at the start of a
subject string or after any "\\n" within the string.
Otherwise -2 is returned.
For anchored patterns, -2 is returned.

  PCRE_INFO_FIRSTTABLE

If the pattern was studied, and this resulted in the construction of a 256-bit
table indicating a fixed set of characters for the first character in any
matching string, a pointer to the table is returned. Otherwise NULL is
returned. The fourth argument should point to an \fBunsigned char *\fR
variable.

  PCRE_INFO_LASTLITERAL

For a non-anchored pattern, return the value of the rightmost literal character
which must exist in any matched string, other than at its start. The fourth
argument should point to an \fBint\fR variable. If there is no such character,
or if the pattern is anchored, -1 is returned. For example, for the pattern
/a\\d+z\\d+/ the returned value is 'z'.

The \fBpcre_info()\fR function is now obsolete because its interface is too
restrictive to return all the available data about a compiled pattern. New
programs should use \fBpcre_fullinfo()\fR instead. The yield of
\fBpcre_info()\fR is the number of capturing subpatterns, or one of the
following negative numbers:

  PCRE_ERROR_NULL       the argument \fIcode\fR was NULL
  PCRE_ERROR_BADMAGIC   the "magic number" was not found

If the \fIoptptr\fR argument is not NULL, a copy of the options with which the
pattern was compiled is placed in the integer it points to (see
PCRE_INFO_OPTIONS above).

If the pattern is not anchored and the \fIfirstcharptr\fR argument is not NULL,
it is used to pass back information about the first character of any matched
string (see PCRE_INFO_FIRSTCHAR above).

.SH MATCHING A PATTERN
The function \fBpcre_exec()\fR is called to match a subject string against a
pre-compiled pattern, which is passed in the \fIcode\fR argument. If the
pattern has been studied, the result of the study should be passed in the
\fIextra\fR argument. Otherwise this must be NULL.

The PCRE_ANCHORED option can be passed in the \fIoptions\fR argument, whose
unused bits must be zero.
However, if a pattern was compiled with
PCRE_ANCHORED, or turned out to be anchored by virtue of its contents, it
cannot be made unanchored at matching time.

There are also three further options that can be set only at matching time:

  PCRE_NOTBOL

The first character of the string is not the beginning of a line, so the
circumflex metacharacter should not match before it. Setting this without
PCRE_MULTILINE (at compile time) causes circumflex never to match.

  PCRE_NOTEOL

The end of the string is not the end of a line, so the dollar metacharacter
should not match it nor (except in multiline mode) a newline immediately before
it. Setting this without PCRE_MULTILINE (at compile time) causes dollar never
to match.

  PCRE_NOTEMPTY

An empty string is not considered to be a valid match if this option is set. If
there are alternatives in the pattern, they are tried. If all the alternatives
match the empty string, the entire match fails. For example, if the pattern

  a?b?

is applied to a string not beginning with "a" or "b", it matches the empty
string at the start of the subject. With PCRE_NOTEMPTY set, this match is not
valid, so PCRE searches further into the string for occurrences of "a" or "b".

Perl has no direct equivalent of PCRE_NOTEMPTY, but it does make a special case
of a pattern match of the empty string within its \fBsplit()\fR function, and
when using the /g modifier. It is possible to emulate Perl's behaviour after
matching a null string by first trying the match again at the same offset with
PCRE_NOTEMPTY set, and then if that fails by advancing the starting offset (see
below) and trying an ordinary match again.

The subject string is passed as a pointer in \fIsubject\fR, a length in
\fIlength\fR, and a starting offset in \fIstartoffset\fR. Unlike the pattern
string, it may contain binary zero characters.
When the starting offset is
zero, the search for a match starts at the beginning of the subject, and this
is by far the most common case.

A non-zero starting offset is useful when searching for another match in the
same subject by calling \fBpcre_exec()\fR again after a previous success.
Setting \fIstartoffset\fR differs from just passing over a shortened string and
setting PCRE_NOTBOL in the case of a pattern that begins with any kind of
lookbehind. For example, consider the pattern

  \\Biss\\B

which finds occurrences of "iss" in the middle of words. (\\B matches only if
the current position in the subject is not a word boundary.) When applied to
the string "Mississippi" the first call to \fBpcre_exec()\fR finds the first
occurrence. If \fBpcre_exec()\fR is called again with just the remainder of the
subject, namely "issippi", it does not match, because \\B is always false at the
start of the subject, which is deemed to be a word boundary. However, if
\fBpcre_exec()\fR is passed the entire string again, but with \fIstartoffset\fR
set to 4, it finds the second occurrence of "iss" because it is able to look
behind the starting point to discover that it is preceded by a letter.

If a non-zero starting offset is passed when the pattern is anchored, one
attempt to match at the given offset is tried. This can only succeed if the
pattern does not require the match to be at the start of the subject.

In general, a pattern matches a certain portion of the subject, and in
addition, further substrings from the subject may be picked out by parts of the
pattern. Following the usage in Jeffrey Friedl's book, this is called
"capturing" in what follows, and the phrase "capturing subpattern" is used for
a fragment of a pattern that picks out a substring. PCRE supports several other
kinds of parenthesized subpattern that do not cause substrings to be captured.

Captured substrings are returned to the caller via a vector of integer offsets
whose address is passed in \fIovector\fR.
The number of elements in the vector
is passed in \fIovecsize\fR. The first two-thirds of the vector is used to pass
back captured substrings, each substring using a pair of integers. The
remaining third of the vector is used as workspace by \fBpcre_exec()\fR while
matching capturing subpatterns, and is not available for passing back
information. The length passed in \fIovecsize\fR should always be a multiple of
three. If it is not, it is rounded down.

When a match has been successful, information about captured substrings is
returned in pairs of integers, starting at the beginning of \fIovector\fR, and
continuing up to two-thirds of its length at the most. The first element of a
pair is set to the offset of the first character in a substring, and the second
is set to the offset of the first character after the end of a substring. The
first pair, \fIovector[0]\fR and \fIovector[1]\fR, identify the portion of the
subject string matched by the entire pattern. The next pair is used for the
first capturing subpattern, and so on. The value returned by \fBpcre_exec()\fR
is the number of pairs that have been set. If there are no capturing
subpatterns, the return value from a successful match is 1, indicating that
just the first pair of offsets has been set.

Some convenience functions are provided for extracting the captured substrings
as separate strings. These are described in the following section.

It is possible for a capturing subpattern number \fIn+1\fR to match some
part of the subject when subpattern \fIn\fR has not been used at all. For
example, if the string "abc" is matched against the pattern (a|(z))(bc)
subpatterns 1 and 3 are matched, but 2 is not.
When this happens, both offset
values corresponding to the unused subpattern are set to -1.

If a capturing subpattern is matched repeatedly, it is the last portion of the
string that it matched that gets returned.

If the vector is too small to hold all the captured substrings, it is used as
far as possible (up to two-thirds of its length), and the function returns a
value of zero. In particular, if the substring offsets are not of interest,
\fBpcre_exec()\fR may be called with \fIovector\fR passed as NULL and
\fIovecsize\fR as zero. However, if the pattern contains back references and
the \fIovector\fR isn't big enough to remember the related substrings, PCRE has
to get additional memory for use during matching. Thus it is usually advisable
to supply an \fIovector\fR.

Note that \fBpcre_info()\fR can be used to find out how many capturing
subpatterns there are in a compiled pattern. The smallest size for
\fIovector\fR that will allow for \fIn\fR captured substrings in addition to
the offsets of the substring matched by the whole pattern is (\fIn\fR+1)*3.

If \fBpcre_exec()\fR fails, it returns a negative number. The following are
defined in the header file:

  PCRE_ERROR_NOMATCH        (-1)

The subject string did not match the pattern.

  PCRE_ERROR_NULL           (-2)

Either \fIcode\fR or \fIsubject\fR was passed as NULL, or \fIovector\fR was
NULL and \fIovecsize\fR was not zero.

  PCRE_ERROR_BADOPTION      (-3)

An unrecognized bit was set in the \fIoptions\fR argument.

  PCRE_ERROR_BADMAGIC       (-4)

PCRE stores a 4-byte "magic number" at the start of the compiled code, to catch
the case when it is passed a junk pointer. This is the error it gives when the
magic number isn't present.

  PCRE_ERROR_UNKNOWN_NODE   (-5)

While running the pattern match, an unknown item was encountered in the
compiled pattern. This error could be caused by a bug in PCRE or by overwriting
of the compiled pattern.

  PCRE_ERROR_NOMEMORY       (-6)

If a pattern contains back references, but the \fIovector\fR that is passed to
\fBpcre_exec()\fR is not big enough to remember the referenced substrings, PCRE
gets a block of memory at the start of matching to use for this purpose. If the
call via \fBpcre_malloc()\fR fails, this error is given. The memory is freed at
the end of matching.

.SH EXTRACTING CAPTURED SUBSTRINGS
Captured substrings can be accessed directly by using the offsets returned by
\fBpcre_exec()\fR in \fIovector\fR. For convenience, the functions
\fBpcre_copy_substring()\fR, \fBpcre_get_substring()\fR, and
\fBpcre_get_substring_list()\fR are provided for extracting captured substrings
as new, separate, zero-terminated strings. A substring that contains a binary
zero is correctly extracted and has a further zero added on the end, but the
result does not, of course, function as a C string.

The first three arguments are the same for all three functions: \fIsubject\fR
is the subject string which has just been successfully matched, \fIovector\fR
is a pointer to the vector of integer offsets that was passed to
\fBpcre_exec()\fR, and \fIstringcount\fR is the number of substrings that
were captured by the match, including the substring that matched the entire
regular expression. This is the value returned by \fBpcre_exec\fR if it
is greater than zero. If \fBpcre_exec()\fR returned zero, indicating that it
ran out of space in \fIovector\fR, the value passed as \fIstringcount\fR should
be the size of the vector divided by three.

The functions \fBpcre_copy_substring()\fR and \fBpcre_get_substring()\fR
extract a single substring, whose number is given as \fIstringnumber\fR. A
value of zero extracts the substring that matched the entire pattern, while
higher values extract the captured substrings.
For \fBpcre_copy_substring()\fR,
the string is placed in \fIbuffer\fR, whose length is given by
\fIbuffersize\fR, while for \fBpcre_get_substring()\fR a new block of store is
obtained via \fBpcre_malloc\fR, and its address is returned via
\fIstringptr\fR. The yield of the function is the length of the string, not
including the terminating zero, or one of

  PCRE_ERROR_NOMEMORY       (-6)

The buffer was too small for \fBpcre_copy_substring()\fR, or the attempt to get
