📄 pcre.txt

     information  about  the  pattern;  this  can  be  passed  to
     pcre_exec(). If no additional information is available, NULL
     is returned.

     The second argument contains option  bits.  At  present,  no
     options  are  defined  for  pcre_study(),  and this argument
     should always be zero.

     The third argument for pcre_study() is a pointer to an error
     message. If studying succeeds (even if no data is returned),
     the variable it points to  is  set  to  NULL.  Otherwise  it
     points to a textual error message.

     This is a typical call to pcre_study():

       pcre_extra *pe;
       pe = pcre_study(
         re,             /* result of pcre_compile() */
         0,              /* no options exist */
         &error);        /* set to NULL or points to a message */

     At present, studying a  pattern  is  useful  only  for  non-
     anchored  patterns  that do not have a single fixed starting
     character. A  bitmap  of  possible  starting  characters  is
     created.
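
     The idea of a bitmap of possible starting characters can  be
     sketched  in  plain C. This is an illustration only, not the
     code PCRE uses internally: the type start_bitmap, the  func-
     tions  start_bits_clear(),  start_bits_set(),  start_bits_test()
     and first_candidate(), and the bit layout are  all  assump-
     tions made for the example.

```c
#include <stddef.h>
#include <string.h>

/* 256 bits, one per possible byte value (hypothetical layout). */
typedef struct { unsigned char bits[32]; } start_bitmap;

/* Empty the bitmap: no byte is a possible first character. */
void start_bits_clear(start_bitmap *m)
{
    memset(m->bits, 0, sizeof m->bits);
}

/* Mark byte value c as a possible first character of a match. */
void start_bits_set(start_bitmap *m, unsigned char c)
{
    m->bits[c / 8] |= (unsigned char)(1u << (c % 8));
}

/* Non-zero if c was marked as a possible first character. */
int start_bits_test(const start_bitmap *m, unsigned char c)
{
    return (m->bits[c / 8] >> (c % 8)) & 1;
}

/* Return the first offset at or after start whose byte is in the
   bitmap, or len if there is none. A matcher that has such a bitmap
   need only attempt a match at these offsets. */
size_t first_candidate(const start_bitmap *m,
                       const char *subject, size_t len, size_t start)
{
    size_t i;
    for (i = start; i < len; i++)
        if (start_bits_test(m, (unsigned char)subject[i]))
            return i;
    return len;
}
```

     For a pattern such as (dog|fish), which has no single  fixed
     starting  character,  the  bitmap would hold 'd' and 'f', so
     every other position in the subject can be  skipped  without
     running the matcher.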



LOCALE SUPPORT
     PCRE handles caseless matching, and determines whether char-
     acters  are  letters, digits, or whatever, by reference to a
     set of tables. The library contains a default set of  tables
     which  is  created in the default C locale when PCRE is com-
     piled.  This  is   used   when   the   final   argument   of
     pcre_compile()  is NULL, and is sufficient for many applica-
     tions.

     An alternative set of tables can, however, be supplied. Such
     tables  are built by calling the pcre_maketables() function,
     which has no arguments, in the relevant locale.  The  result
     can  then be passed to pcre_compile() as often as necessary.
     For example, to build and use tables  that  are  appropriate
     for  the French locale (where accented characters with codes
     greater than 128 are treated as letters), the following code
     could be used:

       setlocale(LC_CTYPE, "fr");
       tables = pcre_maketables();
       re = pcre_compile(..., tables);

     The  tables  are  built  in  memory  that  is  obtained  via
     pcre_malloc.  The  pointer that is passed to pcre_compile is
     saved with the compiled pattern, and  the  same  tables  are
     used  via this pointer by pcre_study() and pcre_exec(). Thus
     for any single pattern, compilation, studying  and  matching
     all happen in the same locale, but different patterns can be
     compiled in different locales. It is the caller's  responsi-
     bility  to  ensure  that  the  memory  containing the tables
     remains available for as long as it is needed.
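
     Why the tables depend on the locale in force when  they  are
     built  can be seen from a small sketch that uses only stan-
     dard C. The  structure  char_tables  and  the  function
     build_char_tables()  are  hypothetical  names  used for this
     illustration; the real layout of PCRE's tables is  different
     and is not shown here.

```c
#include <ctype.h>
#include <locale.h>
#include <stdlib.h>

/* Hypothetical table layout for illustration only. */
typedef struct {
    unsigned char lower[256];     /* lower-case equivalent of each byte */
    unsigned char is_letter[256]; /* 1 if the byte is a letter, else 0 */
    unsigned char is_digit[256];  /* 1 if the byte is a digit, else 0 */
} char_tables;

/* Build tables that reflect whatever LC_CTYPE locale is in force at
   the time of the call. Returns NULL if memory cannot be obtained.
   The caller must free() the result, but only after every pattern
   that refers to it is no longer in use. */
char_tables *build_char_tables(void)
{
    char_tables *t = malloc(sizeof *t);
    int c;
    if (t == NULL)
        return NULL;
    for (c = 0; c < 256; c++) {
        t->lower[c] = (unsigned char)tolower(c);
        t->is_letter[c] = isalpha(c) != 0;
        t->is_digit[c] = isdigit(c) != 0;
    }
    return t;
}
```

     In the default C locale, byte 0xE9 is not a letter  in  such
     a  table.  If  a  French  locale  with  an 8-bit character set
     happens to be installed and is selected with setlocale() be-
     fore  the  call,  the same byte may be classified as a letter
     instead.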



INFORMATION ABOUT A PATTERN
     The pcre_fullinfo() function  returns  information  about  a
     compiled pattern. It replaces the obsolete pcre_info()
     function, which is nevertheless retained for backwards
     compatibility (and is documented below).

     The first argument for pcre_fullinfo() is a pointer  to  the
     compiled  pattern.  The  second  argument  is  the result of
     pcre_study(), or NULL if the pattern was  not  studied.  The
     third  argument  specifies  which  piece  of  information is
     required, while the fourth argument is a pointer to a  vari-
     able  to receive the data. The yield of the function is zero
     for success, or one of the following negative numbers:

       PCRE_ERROR_NULL       the argument code was NULL
                             the argument where was NULL
       PCRE_ERROR_BADMAGIC   the "magic number" was not found
       PCRE_ERROR_BADOPTION  the value of what was invalid

     Here is a typical call of  pcre_fullinfo(),  to  obtain  the
     length of the compiled pattern:

       int rc;
       size_t length;
       rc = pcre_fullinfo(
         re,               /* result of pcre_compile() */
         pe,               /* result of pcre_study(), or NULL */
         PCRE_INFO_SIZE,   /* what is required */
         &length);         /* where to put the data */

     The possible values for the third argument  are  defined  in
     pcre.h, and are as follows:

       PCRE_INFO_OPTIONS

     Return a copy of the options with which the pattern was com-
     piled.  The fourth argument should point to an unsigned long
     int variable. These option bits are those specified  in  the
     call  to  pcre_compile(),  modified  by any top-level option
     settings  within  the   pattern   itself,   and   with   the
     PCRE_ANCHORED  bit  forcibly  set if the form of the pattern
     implies that it can match only at the  start  of  a  subject
     string.

       PCRE_INFO_SIZE

     Return the size of the compiled pattern, that is, the  value
     that  was  passed as the argument to pcre_malloc() when PCRE
     was getting memory in which to place the compiled data.  The
     fourth argument should point to a size_t variable.

       PCRE_INFO_CAPTURECOUNT

     Return the number of capturing subpatterns in  the  pattern.
     The fourth argument should point to an int variable.

       PCRE_INFO_BACKREFMAX

     Return the number of the highest back reference in the  pat-
     tern.  The  fourth argument should point to an int variable.
     Zero is returned if there are no back references.

       PCRE_INFO_FIRSTCHAR

     Return information about the first character of any  matched
     string,  for  a  non-anchored  pattern.  If there is a fixed
     first   character,   e.g.   from   a   pattern    such    as
     (cat|cow|coyote),  it  is returned in the integer pointed to
     by where. Otherwise, if either

     (a) the pattern was compiled with the PCRE_MULTILINE option,
     and every branch starts with "^", or

     (b) every  branch  of  the  pattern  starts  with  ".*"  and
     PCRE_DOTALL is not set (if it were set, the pattern would be
     anchored),

     -1 is returned, indicating that the pattern matches only  at
     the  start  of a subject string or after any "\n" within the
     string. Otherwise -2 is returned.  For anchored patterns, -2
     is returned.
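
     The three kinds  of  value  described  above  can  be  told
     apart  with a small helper. This is a sketch that follows the
     text of this section; the enumeration  firstchar_kind  and
     the  function  classify_firstchar()  are hypothetical names
     and are not part of the PCRE interface.

```c
/* Meaning of the integer stored for PCRE_INFO_FIRSTCHAR, following
   the description in the text. All names here are hypothetical. */
typedef enum {
    FIRST_FIXED_CHAR,    /* value >= 0: every match starts with it   */
    FIRST_AT_LINE_START, /* -1: matches only at start or after "\n"  */
    FIRST_UNKNOWN        /* -2: no information, or pattern anchored   */
} firstchar_kind;

firstchar_kind classify_firstchar(int value)
{
    if (value >= 0)
        return FIRST_FIXED_CHAR;
    if (value == -1)
        return FIRST_AT_LINE_START;
    return FIRST_UNKNOWN;
}
```

     For the pattern (cat|cow|coyote) the stored value  would  be
     'c', which this helper reports as FIRST_FIXED_CHAR.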

       PCRE_INFO_FIRSTTABLE

     If the pattern was studied, and this resulted  in  the  con-
     struction of a 256-bit table indicating a fixed set of char-
     acters for the first character in  any  matching  string,  a
     pointer   to  the  table  is  returned.  Otherwise  NULL  is
     returned. The fourth argument should point  to  an  unsigned
     char * variable.

       PCRE_INFO_LASTLITERAL

     For a non-anchored pattern, return the value of  the  right-
     most  literal  character  which  must  exist  in any matched
     string, other than at its start. The fourth argument  should
     point  to an int variable. If there is no such character, or
     if the pattern is anchored, -1 is returned. For example, for
     the pattern /a\d+z\d+/ the returned value is 'z'.
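
     One possible use of this value is a cheap test that rejects
     a  subject  before  any matching is attempted. The following
     is a sketch under stated assumptions, not a  description  of
     what  PCRE  itself  does: the function may_match() is a hy-
     pothetical name, and the test assumes case-sensitive  match-
     ing.

```c
#include <stddef.h>
#include <string.h>

/* Return 0 if the subject certainly cannot match, 1 if it might.
   last_literal is the value obtained for PCRE_INFO_LASTLITERAL,
   or -1 when no such character is known. Hypothetical helper. */
int may_match(const char *subject, size_t len, int last_literal)
{
    if (last_literal < 0)
        return 1;   /* no information: nothing can be ruled out */
    if (len < 2)
        return 0;   /* the literal cannot be at the start of a match,
                       so a match needs at least two characters */
    /* The literal must appear somewhere after the first byte. */
    return memchr(subject + 1, last_literal, len - 1) != NULL;
}
```

     With the pattern /a\d+z\d+/ and the value 'z',  the  subject
     "a12"  is rejected at once, while "a1z2" is passed on to the
     matcher.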

     The pcre_info() function is now obsolete because its  inter-
     face  is  too  restrictive  to return all the available data
     about  a  compiled  pattern.   New   programs   should   use
     pcre_fullinfo()  instead.  The  yield  of pcre_info() is the
     number of capturing subpatterns, or  one  of  the  following
     negative numbers:

       PCRE_ERROR_NULL       the argument code was NULL
       PCRE_ERROR_BADMAGIC   the "magic number" was not found

     If the optptr argument is not NULL, a copy  of  the  options
     with which the pattern was compiled is placed in the integer
     it points to (see PCRE_INFO_OPTIONS above).

     If the pattern is not anchored and the firstcharptr argument
     is  not  NULL, it is used to pass back information about the
     first    character    of    any    matched    string    (see
     PCRE_INFO_FIRSTCHAR above).



MATCHING A PATTERN
     The function pcre_exec() is called to match a subject string
     against  a pre-compiled pattern, which is passed in the code
     argument. If the pattern has been studied, the result of the
     study should be passed in the extra argument. Otherwise this
     must be NULL.

     Here is an example of a simple call to pcre_exec():

       int rc;
       int ovector[30];
       rc = pcre_exec(
         re,             /* result of pcre_compile() */
         NULL,           /* we didn't study the pattern */
         "some string",  /* the subject string */
         11,             /* the length of the subject string */
         0,              /* start at offset 0 in the subject */
         0,              /* default options */
         ovector,        /* vector for substring information */
         30);            /* number of elements in the vector */

     The PCRE_ANCHORED option can be passed in the options  argu-
     ment,  whose unused bits must be zero. However, if a pattern
     was  compiled  with  PCRE_ANCHORED,  or  turned  out  to  be
     anchored  by  virtue  of  its  contents,  it  cannot be made
     unanchored at matching time.

     There are also three further options that can be set only at
     matching time:

       PCRE_NOTBOL

     The first character of the string is not the beginning of  a
     line,  so  the  circumflex  metacharacter  should  not match
     before it. Setting this without PCRE_MULTILINE  (at  compile
     time) causes circumflex never to match.

       PCRE_NOTEOL

     The end of the string is not the end of a line, so the  dol-
     lar  metacharacter should not match it nor (except in multi-
     line mode) a newline immediately  before  it.  Setting  this
     without PCRE_MULTILINE (at compile time) causes dollar never
     to match.

       PCRE_NOTEMPTY

     An empty string is not considered to be  a  valid  match  if
     this  option  is  set. If there are alternatives in the pat-
     tern, they are tried. If  all  the  alternatives  match  the
     empty  string,  the  entire match fails. For example, if the
     pattern

       a?b?

     is applied to a string not beginning with  "a"  or  "b",  it
     matches  the  empty string at the start of the subject. With
     PCRE_NOTEMPTY set, this match is not valid, so PCRE searches
     further into the string for occurrences of "a" or "b".
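
     The effect of the option on this pattern can  be  reproduced
     with  a  toy  matcher  written  in  plain  C.  The function
     match_a_opt_b_opt() below is a hypothetical stand-in  that
     handles  only the pattern a?b?; it is not PCRE code, and its
     notempty argument merely imitates the  behaviour  described
     above.

```c
#include <stddef.h>

/* Toy matcher for the single pattern a?b? (not PCRE code).
   Searches subject[start..len) from left to right, as an unanchored
   match would. If notempty is non-zero, an empty match is rejected
   and the search moves on to the next position. Returns 1 and sets
   *mstart and *mend on success, or returns 0 if nothing matches. */
int match_a_opt_b_opt(const char *subject, size_t len, size_t start,
                      int notempty, size_t *mstart, size_t *mend)
{
    size_t pos;
    for (pos = start; pos <= len; pos++) {
        size_t end = pos;
        if (end < len && subject[end] == 'a') end++;   /* a? */
        if (end < len && subject[end] == 'b') end++;   /* b? */
        if (end == pos && notempty)
            continue;   /* empty match not allowed: try further on */
        *mstart = pos;
        *mend = end;
        return 1;
    }
    return 0;
}
```

     Applied to the subject "xyzab", the  sketch  finds  an  empty
     match  at offset 0 when notempty is zero, and the match "ab"
     at offsets 3 to 5 when it is set. Applied to "xyz" with  no-
     tempty set, it finds no match at all.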

     Perl has no direct equivalent of PCRE_NOTEMPTY, but it  does
     make  a  special case of a pattern match of the empty string
     within its split() function, and when using the /g modifier.
     It  is possible to emulate Perl's behaviour after matching a
     null string by first trying the  match  again  at  the  same
     offset  with  PCRE_NOTEMPTY  set,  and then if that fails by
     advancing the starting offset  (see  below)  and  trying  an
     ordinary match again.
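
     The retry-then-advance procedure just described can be writ-
     ten  as  a  loop. The sketch below uses a toy matcher for the
     pattern x* in place of pcre_exec(), so that it is  self-con-
     tained;  match_x_star()  and find_all() are hypothetical
     names, and the notempty  argument  only  imitates  the  option
     described above.

```c
#include <stddef.h>

/* Toy unanchored matcher for x* (not PCRE code). Rejects empty
   matches when notempty is non-zero. Returns 1 and sets *mstart
   and *mend on success, or 0 if there is no match. */
int match_x_star(const char *s, size_t len, size_t start, int notempty,
                 size_t *mstart, size_t *mend)
{
    size_t pos;
    for (pos = start; pos <= len; pos++) {
        size_t end = pos;
        while (end < len && s[end] == 'x') end++;
        if (end == pos && notempty)
            continue;
        *mstart = pos;
        *mend = end;
        return 1;
    }
    return 0;
}

/* Find successive matches in s. Stores up to max (start, end) pairs
   in out and returns the number stored. */
size_t find_all(const char *s, size_t len, size_t out[][2], size_t max)
{
    size_t count = 0, offset = 0, ms, me;
    int notempty = 0;
    while (offset <= len && count < max) {
        if (!match_x_star(s, len, offset, notempty, &ms, &me)) {
            if (!notempty)
                break;          /* an ordinary match failed: finished */
            notempty = 0;       /* the non-empty retry failed, so     */
            offset++;           /* advance and match normally again   */
            continue;
        }
        out[count][0] = ms;
        out[count][1] = me;
        count++;
        offset = me;
        notempty = (ms == me);  /* after an empty match, retry at the
                                   same offset with empty disallowed */
    }
    return count;
}
```

     For the subject "axxb" this sketch reports four matches:  an
     empty  match  at offset 0, "xx" at offsets 1 to 3, and empty
     matches at offsets 3 and 4.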

     The subject string is passed as  a  pointer  in  subject,  a
     length  in  length,  and  a  starting offset in startoffset.
     Unlike the pattern string, the subject  may  contain  binary
     zero  characters.  When  the  starting  offset  is zero, the
     search for a match starts at the beginning of  the  subject,
     and this is by far the most common case.

     A non-zero starting offset  is  useful  when  searching  for
     another  match  in  the  same subject by calling pcre_exec()
     again after a previous success.  Setting startoffset differs
     from  just  passing  over  a  shortened  string  and setting
     PCRE_NOTBOL in the case of a pattern that  begins  with  any
     kind of lookbehind. For example, consider the pattern

       \Biss\B

     which finds occurrences of "iss" in the middle of words. (\B
     matches only if the current position in the subject is not a
     word boundary.) When applied to the string "Mississippi" the
     first  call  to  pcre_exec()  finds the first occurrence. If
     pcre_exec() is called again with just the remainder  of  the
     subject,  namely  "issippi", it does not match, because \B is
     always false at the start of the subject, which is deemed to
     be  a  word  boundary. However, if pcre_exec() is passed the
     entire string again, but with startoffset set to 4, it finds
     the  second  occurrence  of "iss" because it is able to look
     behind the starting point to discover that it is preceded by
