pcre.txt
From: Apache V2.0.15 Alpha For Linux (httpd-2_0_15-alpha.tar.Z)

VERTICAL BAR
     Vertical bar characters are used to separate alternative
     patterns. For example, the pattern

       gilbert|sullivan

     matches either "gilbert" or "sullivan". Any number of
     alternatives may appear, and an empty alternative is permitted
     (matching the empty string). The matching process tries each
     alternative in turn, from left to right, and the first one
     that succeeds is used. If the alternatives are within a
     subpattern (defined below), "succeeds" means matching the
     rest of the main pattern as well as the alternative in the
     subpattern.

INTERNAL OPTION SETTING
     The settings of PCRE_CASELESS, PCRE_MULTILINE, PCRE_DOTALL,
     and PCRE_EXTENDED can be changed from within the pattern by
     a sequence of Perl option letters enclosed between "(?" and
     ")". The option letters are

       i  for PCRE_CASELESS
       m  for PCRE_MULTILINE
       s  for PCRE_DOTALL
       x  for PCRE_EXTENDED

     For example, (?im) sets caseless, multiline matching. It is
     also possible to unset these options by preceding the letter
     with a hyphen, and a combined setting and unsetting such as
     (?im-sx), which sets PCRE_CASELESS and PCRE_MULTILINE while
     unsetting PCRE_DOTALL and PCRE_EXTENDED, is also permitted.
     If a letter appears both before and after the hyphen, the
     option is unset.

     The scope of these option changes depends on where in the
     pattern the setting occurs. For settings that are outside
     any subpattern (defined below), the effect is the same as if
     the options were set or unset at the start of matching. The
     following patterns all behave in exactly the same way:

       (?i)abc
       a(?i)bc
       ab(?i)c
       abc(?i)

     which in turn is the same as compiling the pattern abc with
     PCRE_CASELESS set. In other words, such "top level" settings
     apply to the whole pattern (unless there are other changes
     inside subpatterns).
     If there is more than one setting of the same option at top
     level, the rightmost setting is used.

     If an option change occurs inside a subpattern, the effect
     is different. This is a change of behaviour in Perl 5.005.
     An option change inside a subpattern affects only that part
     of the subpattern that follows it, so

       (a(?i)b)c

     matches abc and aBc and no other strings (assuming
     PCRE_CASELESS is not used). By this means, options can be
     made to have different settings in different parts of the
     pattern. Any changes made in one alternative do carry on
     into subsequent branches within the same subpattern. For
     example,

       (a(?i)b|c)

     matches "ab", "aB", "c", and "C", even though when matching
     "C" the first branch is abandoned before the option setting.
     This is because the effects of option settings happen at
     compile time. There would be some very weird behaviour
     otherwise.

     The PCRE-specific options PCRE_UNGREEDY and PCRE_EXTRA can
     be changed in the same way as the Perl-compatible options by
     using the characters U and X respectively. The (?X) flag
     setting is special in that it must always occur earlier in
     the pattern than any of the additional features it turns on,
     even when it is at top level. It is best put at the start.

SUBPATTERNS
     Subpatterns are delimited by parentheses (round brackets),
     which can be nested. Marking part of a pattern as a
     subpattern does two things:

     1. It localizes a set of alternatives. For example, the
     pattern

       cat(aract|erpillar|)

     matches one of the words "cat", "cataract", or
     "caterpillar". Without the parentheses, it would match
     "cataract", "erpillar" or the empty string.

     2. It sets up the subpattern as a capturing subpattern (as
     defined above).
     When the whole pattern matches, that portion of the subject
     string that matched the subpattern is passed back to the
     caller via the ovector argument of pcre_exec(). Opening
     parentheses are counted from left to right (starting from 1)
     to obtain the numbers of the capturing subpatterns.

     For example, if the string "the red king" is matched against
     the pattern

       the ((red|white) (king|queen))

     the captured substrings are "red king", "red", and "king",
     and are numbered 1, 2, and 3.

     The fact that plain parentheses fulfil two functions is not
     always helpful. There are often times when a grouping
     subpattern is required without a capturing requirement. If an
     opening parenthesis is followed by "?:", the subpattern does
     not do any capturing, and is not counted when computing the
     number of any subsequent capturing subpatterns. For example,
     if the string "the white queen" is matched against the
     pattern

       the ((?:red|white) (king|queen))

     the captured substrings are "white queen" and "queen", and
     are numbered 1 and 2. The maximum number of captured
     substrings is 99, and the maximum number of all subpatterns,
     both capturing and non-capturing, is 200.

     As a convenient shorthand, if any option settings are
     required at the start of a non-capturing subpattern, the
     option letters may appear between the "?" and the ":". Thus
     the two patterns

       (?i:saturday|sunday)
       (?:(?i)saturday|sunday)

     match exactly the same set of strings.
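
     The capture numbering and the "(?i:" shorthand described above
     can be tried out with a short sketch. It uses Python's re module
     as a stand-in, which is an assumption: it is not PCRE and does
     not go through pcre_exec() or the ovector, but for these
     particular patterns it numbers groups and scopes the option
     letter in the same way. Only the (?i:...) form is shown, because
     recent Python versions reject an option setting that is not at
     the start of the pattern.

```python
import re

# Assumption: Python's re stands in for PCRE; it is a different engine,
# but counts opening parentheses from 1 and supports (?:...) and (?i:...).
capturing = re.match(r"the ((red|white) (king|queen))", "the red king")
print(capturing.groups())        # ('red king', 'red', 'king')

# "?:" groups without capturing, so the inner alternation gets no number.
non_capturing = re.match(r"the ((?:red|white) (king|queen))", "the white queen")
print(non_capturing.groups())    # ('white queen', 'queen')

# Option letters between "?" and ":" apply to the whole non-capturing group.
scoped = re.compile(r"(?i:saturday|sunday)")
weekend_hits = [s for s in ("Saturday", "SUNDAY", "monday") if scoped.fullmatch(s)]
print(weekend_hits)              # ['Saturday', 'SUNDAY']
```
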
     Because alternative branches are tried from left to right,
     and options are not reset until the end of the subpattern is
     reached, an option setting in one branch does affect
     subsequent branches, so the above patterns match "SUNDAY" as
     well as "Saturday".

REPETITION
     Repetition is specified by quantifiers, which can follow any
     of the following items:

       a single character, possibly escaped
       the . metacharacter
       a character class
       a back reference (see next section)
       a parenthesized subpattern (unless it is an assertion -
       see below)

     The general repetition quantifier specifies a minimum and
     maximum number of permitted matches, by giving the two
     numbers in curly brackets (braces), separated by a comma.
     The numbers must be less than 65536, and the first must be
     less than or equal to the second. For example:

       z{2,4}

     matches "zz", "zzz", or "zzzz". A closing brace on its own
     is not a special character. If the second number is omitted,
     but the comma is present, there is no upper limit; if the
     second number and the comma are both omitted, the quantifier
     specifies an exact number of required matches. Thus

       [aeiou]{3,}

     matches at least 3 successive vowels, but may match many
     more, while

       \d{8}

     matches exactly 8 digits. An opening curly bracket that
     appears in a position where a quantifier is not allowed, or
     one that does not match the syntax of a quantifier, is taken
     as a literal character. For example, {,6} is not a
     quantifier, but a literal string of four characters.

     The quantifier {0} is permitted, causing the expression to
     behave as if the previous item and the quantifier were not
     present.
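
     The quantifier forms above can be exercised with a minimal
     sketch. It assumes Python's re module as a stand-in for PCRE;
     the {min,max}, {min,}, {n} and {0} forms behave alike in both for
     these patterns. One difference is deliberately avoided: Python
     reads {,6} as {0,6}, whereas the text above says PCRE takes it
     as four literal characters.

```python
import re

# Assumption: Python's re is used in place of PCRE for illustration only.
bounded = re.compile(r"z{2,4}")
at_least_three_vowels = re.compile(r"[aeiou]{3,}")
exactly_eight_digits = re.compile(r"\d{8}")

# Only runs of two to four "z" characters match the whole string.
bounded_hits = [s for s in ("z", "zz", "zzz", "zzzz", "zzzzz") if bounded.fullmatch(s)]
print(bounded_hits)                                   # ['zz', 'zzz', 'zzzz']

# No upper limit: the greedy match takes every successive vowel.
vowel_run = at_least_three_vowels.search("queueing").group(0)
print(vowel_run)                                      # ueuei

print(bool(exactly_eight_digits.fullmatch("20010315")))   # True
print(bool(exactly_eight_digits.fullmatch("2001031")))    # False

# {0} makes the preceding item behave as if it were absent.
zero_repeat = re.fullmatch(r"ab{0}c", "ac")
print(zero_repeat is not None)                        # True
```
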
     For convenience (and historical compatibility) the three
     most common quantifiers have single-character abbreviations:

       *    is equivalent to {0,}
       +    is equivalent to {1,}
       ?    is equivalent to {0,1}

     It is possible to construct infinite loops by following a
     subpattern that can match no characters with a quantifier
     that has no upper limit, for example:

       (a?)*

     Earlier versions of Perl and PCRE used to give an error at
     compile time for such patterns. However, because there are
     cases where this can be useful, such patterns are now
     accepted, but if any repetition of the subpattern does in
     fact match no characters, the loop is forcibly broken.

     By default, the quantifiers are "greedy", that is, they
     match as much as possible (up to the maximum number of
     permitted times), without causing the rest of the pattern to
     fail. The classic example of where this gives problems is in
     trying to match comments in C programs. These appear between
     the sequences /* and */ and within the sequence, individual
     * and / characters may appear. An attempt to match C
     comments by applying the pattern

       /\*.*\*/

     to the string

       /* first comment */  not comment  /* second comment */

     fails, because it matches the entire string due to the
     greediness of the .* item.

     However, if a quantifier is followed by a question mark, it
     ceases to be greedy, and instead matches the minimum number
     of times possible, so the pattern

       /\*.*?\*/

     does the right thing with the C comments. The meaning of the
     various quantifiers is not otherwise changed, just the
     preferred number of matches. Do not confuse this use of
     question mark with its use as a quantifier in its own right.
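
     The difference between the greedy and the minimizing form can
     be seen in a short sketch. Python's re module is assumed here as
     a stand-in for PCRE (it is a separate engine, but .* and .*?
     behave the same way on this subject string); the variable names
     are illustrative only.

```python
import re

# Assumption: Python's re stands in for PCRE for this illustration.
subject = "/* first comment */  not comment  /* second comment */"

# Greedy: .* runs to the last "*/", so the match spans the whole string.
greedy = re.search(r"/\*.*\*/", subject)
print(greedy.group(0) == subject)    # True

# Minimizing: .*? stops at the first "*/" that lets the pattern succeed.
lazy = re.search(r"/\*.*?\*/", subject)
print(lazy.group(0))                 # /* first comment */

# Applied repeatedly, the minimizing form finds each comment separately.
comments = re.findall(r"/\*.*?\*/", subject)
print(comments)
```
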
     Because it has two uses, it can sometimes appear doubled, as
     in

       \d??\d

     which matches one digit by preference, but can match two if
     that is the only way the rest of the pattern matches.

     If the PCRE_UNGREEDY option is set (an option which is not
     available in Perl), the quantifiers are not greedy by
     default, but individual ones can be made greedy by following
     them with a question mark. In other words, it inverts the
     default behaviour.

     When a parenthesized subpattern is quantified with a minimum
     repeat count that is greater than 1 or with a limited
     maximum, more store is required for the compiled pattern, in
     proportion to the size of the minimum or maximum.

     If a pattern starts with .* or .{0,} and the PCRE_DOTALL
     option (equivalent to Perl's /s) is set, thus allowing the .
     to match newlines, the pattern is implicitly anchored,
     because whatever follows will be tried against every
     character position in the subject string, so there is no point
     in retrying the overall match at any position after the first.
     PCRE treats such a pattern as though it were preceded by \A.
     In cases where it is known that the subject string contains
     no newlines, it is worth setting PCRE_DOTALL when the
     pattern begins with .* in order to obtain this optimization, or
     alternatively using ^ to indicate anchoring explicitly.

     When a capturing subpattern is repeated, the value captured
     is the substring that matched the final iteration. For
     example, after

       (tweedle[dume]{3}\s*)+

     has matched "tweedledum tweedledee" the value of the
     captured substring is "tweedledee". However, if there are
     nested capturing subpatterns, the corresponding captured
     values may have been set in previous iterations.
     For example, after

       /(a|(b))+/

     matches "aba" the value of the second captured substring is
     "b".

BACK REFERENCES
     Outside a character class, a backslash followed by a digit
     greater than 0 (and possibly further digits) is a back
     reference to a capturing subpattern earl
