pcre.txt

         cat(er(pillar)?)

       is matched against the string "the caterpillar catchment", the result
       will be the three strings "cat", "cater", and "caterpillar" that start
       at the fifth character of the subject. The algorithm does not
       automatically move on to find matches that start at later positions.

       There are a number of features of PCRE regular expressions that are not
       supported by the alternative matching algorithm. They are as follows:

       1. Because the algorithm finds all possible matches, the greedy or
       ungreedy nature of repetition quantifiers is not relevant. Greedy and
       ungreedy quantifiers are treated in exactly the same way. However,
       possessive quantifiers can make a difference when what follows could
       also match what is quantified, for example in a pattern like this:

         ^a++\w!

       This pattern matches "aaab!" but not "aaa!", which would be matched by
       a non-possessive quantifier. Similarly, if an atomic group is present,
       it is matched as if it were a standalone pattern at the current point,
       and the longest match is then "locked in" for the rest of the overall
       pattern.

       2. When dealing with multiple paths through the tree simultaneously, it
       is not straightforward to keep track of captured substrings for the
       different matching possibilities, and PCRE's implementation of this
       algorithm does not attempt to do this. This means that no captured
       substrings are available.

       3. Because no substrings are captured, back references within the
       pattern are not supported, and cause errors if encountered.

       4. For the same reason, conditional expressions that use a
       backreference as the condition or test for a specific group recursion
       are not supported.

       5. Callouts are supported, but the value of the capture_top field is
       always 1, and the value of the capture_last field is always -1.

       6. The \C escape sequence, which (in the standard algorithm) matches a
       single byte, even in UTF-8 mode, is not supported because the
       alternative algorithm moves through the subject string one character at
       a time, for all active paths through the tree.


ADVANTAGES OF THE ALTERNATIVE ALGORITHM

       Using the alternative matching algorithm provides the following
       advantages:

       1. All possible matches (at a single point in the subject) are
       automatically found, and in particular, the longest match is found. To
       find more than one match using the standard algorithm, you have to do
       kludgy things with callouts.

       2. There is much better support for partial matching. The restrictions
       on the content of the pattern that apply when using the standard
       algorithm for partial matching do not apply to the alternative
       algorithm. For non-anchored patterns, the starting position of a
       partial match is available.

       3. Because the alternative algorithm scans the subject string just
       once, and never needs to backtrack, it is possible to pass very long
       subject strings to the matching function in several pieces, checking
       for partial matching each time.


DISADVANTAGES OF THE ALTERNATIVE ALGORITHM

       The alternative algorithm suffers from a number of disadvantages:

       1. It is substantially slower than the standard algorithm. This is
       partly because it has to search for all possible matches, but is also
       because it is less susceptible to optimization.

       2. Capturing parentheses and back references are not supported.

       3. Although atomic groups are supported, their use does not provide the
       performance advantage that it does for the standard algorithm.

Last updated: 24 November 2006
Copyright (c) 1997-2006 University of Cambridge.
------------------------------------------------------------------------------


PCREAPI(3)                                                          PCREAPI(3)


NAME
       PCRE - Perl-compatible regular expressions


PCRE NATIVE API

       #include <pcre.h>

       pcre *pcre_compile(const char *pattern, int options,
            const char **errptr, int *erroffset,
            const unsigned char *tableptr);

       pcre *pcre_compile2(const char *pattern, int options,
            int *errorcodeptr,
            const char **errptr, int *erroffset,
            const unsigned char *tableptr);

       pcre_extra *pcre_study(const pcre *code, int options,
            const char **errptr);

       int pcre_exec(const pcre *code, const pcre_extra *extra,
            const char *subject, int length, int startoffset,
            int options, int *ovector, int ovecsize);

       int pcre_dfa_exec(const pcre *code, const pcre_extra *extra,
            const char *subject, int length, int startoffset,
            int options, int *ovector, int ovecsize,
            int *workspace, int wscount);

       int pcre_copy_named_substring(const pcre *code,
            const char *subject, int *ovector,
            int stringcount, const char *stringname,
            char *buffer, int buffersize);

       int pcre_copy_substring(const char *subject, int *ovector,
            int stringcount, int stringnumber, char *buffer,
            int buffersize);

       int pcre_get_named_substring(const pcre *code,
            const char *subject, int *ovector,
            int stringcount, const char *stringname,
            const char **stringptr);

       int pcre_get_stringnumber(const pcre *code,
            const char *name);

       int pcre_get_stringtable_entries(const pcre *code,
            const char *name, char **first, char **last);

       int pcre_get_substring(const char *subject, int *ovector,
            int stringcount, int stringnumber,
            const char **stringptr);

       int pcre_get_substring_list(const char *subject,
            int *ovector, int stringcount, const char ***listptr);

       void pcre_free_substring(const char *stringptr);

       void pcre_free_substring_list(const char **stringptr);

       const unsigned char *pcre_maketables(void);

       int pcre_fullinfo(const pcre *code, const pcre_extra *extra,
            int what, void *where);

       int pcre_info(const pcre *code, int *optptr, int *firstcharptr);

       int pcre_refcount(pcre *code, int adjust);

       int pcre_config(int what, void *where);

       char *pcre_version(void);

       void *(*pcre_malloc)(size_t);

       void (*pcre_free)(void *);

       void *(*pcre_stack_malloc)(size_t);

       void (*pcre_stack_free)(void *);

       int (*pcre_callout)(pcre_callout_block *);


PCRE API OVERVIEW

       PCRE has its own native API, which is described in this document. There
       are also some wrapper functions that correspond to the POSIX regular
       expression API. These are described in the pcreposix documentation.
       Both of these APIs define a set of C function calls. A C++ wrapper is
       distributed with PCRE. It is documented in the pcrecpp page.

       The native API C function prototypes are defined in the header file
       pcre.h, and on Unix systems the library itself is called libpcre. It
       can normally be accessed by adding -lpcre to the command for linking an
       application that uses PCRE. The header file defines the macros
       PCRE_MAJOR and PCRE_MINOR to contain the major and minor release
       numbers for the library. Applications can use these to include support
       for different releases of PCRE.
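
       As a rough illustration of how the release macros might be used, the
       sketch below gates a code path on the release number. PCRE_AT_LEAST,
       USE_DFA_MATCHER, and format_pcre_release are hypothetical names that
       are not part of pcre.h. The fallback values 7 and 0, and the 6.0
       threshold, are assumptions made only so that the fragment compiles
       without the PCRE headers; a real program would include pcre.h and
       check the ChangeLog for the release that introduced a feature.

```c
#include <stdio.h>

/* In a real program these two macros come from #include <pcre.h>.
   The fallback values are assumed, so that this sketch builds on a
   machine that does not have the PCRE headers installed. */
#ifndef PCRE_MAJOR
#define PCRE_MAJOR 7
#define PCRE_MINOR 0
#endif

/* Hypothetical helper: true when the library is release maj.min or later. */
#define PCRE_AT_LEAST(maj, min) \
    (PCRE_MAJOR > (maj) || (PCRE_MAJOR == (maj) && PCRE_MINOR >= (min)))

/* Gate a code path at compile time. The 6.0 threshold is an assumption
   used for illustration only. */
#if PCRE_AT_LEAST(6, 0)
#define USE_DFA_MATCHER 1
#else
#define USE_DFA_MATCHER 0
#endif

/* Writes the release as "major.minor" into buf and returns the number of
   characters that snprintf reports for the formatted string. */
int format_pcre_release(char *buf, size_t size)
{
    return snprintf(buf, size, "%d.%d", PCRE_MAJOR, PCRE_MINOR);
}
```

       Because the comparison is an ordinary preprocessor expression, the
       same macro can be used both in #if directives and in normal C
       conditions.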

       The functions pcre_compile(), pcre_compile2(), pcre_study(), and
       pcre_exec() are used for compiling and matching regular expressions in
       a Perl-compatible manner. A sample program that demonstrates the
       simplest way of using them is provided in the file called pcredemo.c in
       the source distribution. The pcresample documentation describes how to
       run it.

       A second matching function, pcre_dfa_exec(), which is not
       Perl-compatible, is also provided. This uses a different algorithm for
       the matching. The alternative algorithm finds all possible matches (at
       a given point in the subject), and scans the subject just once.
       However, this algorithm does not return captured substrings. A
       description of the two matching algorithms and their advantages and
       disadvantages is given in the pcrematching documentation.

       In addition to the main compiling and matching functions, there are
       convenience functions for extracting captured substrings from a subject
       string that is matched by pcre_exec(). They are:

         pcre_copy_substring()
         pcre_copy_named_substring()
         pcre_get_substring()
         pcre_get_named_substring()
         pcre_get_substring_list()
         pcre_get_stringnumber()
         pcre_get_stringtable_entries()

       pcre_free_substring() and pcre_free_substring_list() are also provided,
       to free the memory used for extracted strings.

       The function pcre_maketables() is used to build a set of character
       tables in the current locale for passing to pcre_compile(),
       pcre_exec(), or pcre_dfa_exec(). This is an optional facility that is
       provided for specialist use. Most commonly, no special tables are
       passed, in which case internal tables that are generated when PCRE is
       built are used.
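
       To make the role of the substring convenience functions more concrete,
       here is a minimal sketch of the kind of work pcre_copy_substring()
       does. It is not PCRE's own code. copy_substring_sketch and the two
       SKETCH_ error values are hypothetical names, and the sketch assumes
       the usual layout of the offsets vector, in which substring n occupies
       the pair ovector[2*n] (offset of its first character) and
       ovector[2*n+1] (offset just past its end), with n equal to 0 standing
       for the whole match.

```c
#include <string.h>

/* Error values local to this sketch; PCRE defines its own error codes. */
enum { SKETCH_NOMEMORY = -1, SKETCH_NOSUBSTRING = -2 };

/* Hypothetical stand-in for pcre_copy_substring(), taking its arguments in
   the same order as the prototype in the synopsis. Copies substring number
   stringnumber into buffer as a zero-terminated string and returns its
   length, or a negative SKETCH_ value on failure. */
int copy_substring_sketch(const char *subject, const int *ovector,
                          int stringcount, int stringnumber,
                          char *buffer, int buffersize)
{
    int start, length;

    if (stringnumber < 0 || stringnumber >= stringcount)
        return SKETCH_NOSUBSTRING;

    start = ovector[2 * stringnumber];
    length = ovector[2 * stringnumber + 1] - start;
    if (start < 0) {              /* group did not take part in the match */
        start = 0;
        length = 0;
    }
    if (length + 1 > buffersize)  /* no room for the text plus the '\0' */
        return SKETCH_NOMEMORY;

    memcpy(buffer, subject + start, (size_t)length);
    buffer[length] = '\0';
    return length;
}
```

       For the subject "the caterpillar catchment" and the pattern
       cat(er(pillar)?), a hand-written offsets vector of the kind
       pcre_exec() would be expected to fill in is {4, 15, 7, 15, 9, 15}.
       With a stringcount of 3, asking the sketch for substring 0 copies
       "caterpillar", and asking for substring 2 copies "pillar".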

       The function pcre_fullinfo() is used to find out information about a
       compiled pattern; pcre_info() is an obsolete version that returns only
       some of the available information, but is retained for backwards
       compatibility. The function pcre_version() returns a pointer to a
       string containing the version of PCRE and its date of release.

       The function pcre_refcount() maintains a reference count in a data
       block containing a compiled pattern. This is provided for the benefit
       of object-oriented applications.

       The global variables pcre_malloc and pcre_free initially contain the
       entry points of the standard malloc() and free() functions,
       respectively. PCRE calls the memory management functions via these
       variables, so a calling program can replace them if it wishes to
       intercept the calls. This should be done before calling any PCRE
       functions.

       The global variables pcre_stack_malloc and pcre_stack_free are also
       indirections to memory management functions. These special functions
       are used only when PCRE is compiled to use the heap for remembering
       data, instead of recursive function calls, when running the pcre_exec()
       function. See the pcrebuild documentation for details of how to do
       this. It is a non-standard way of building PCRE, for use in
       environments that have limited stacks. Because of the greater use of
       memory management, it runs more slowly. Separate functions are provided
       so that special-purpose external code can be used for this case. When
       used, these functions are always called in a stack-like manner (last
       obtained, first freed), and always for memory blocks of the same size.
       There is a discussion about PCRE's stack usage in the pcrestack
       documentation.

       The global variable pcre_callout initially contains NULL. It can be set
       by the caller to a "callout" function, which PCRE will then call at
       specified points during a matching operation. Details are given in the
       pcrecallout documentation.


NEWLINES

       PCRE supports four different conventions for indicating line breaks in
       strings: a single CR (carriage return) character, a single LF
       (linefeed) character, the two-character sequence CRLF, or any Unicode
       newline sequence. The Unicode newline sequences are the three just
       mentioned, plus the single characters VT (vertical tab, U+000B), FF
       (formfeed, U+000C), NEL (next line, U+0085), LS (line separator,
       U+2028),