📄 pcre_exec.c

📁 php-4.4.7 (source code downloaded while learning Linux)
💻 C
📖 Page 1 of 5
/*************************************************
*      Perl-Compatible Regular Expressions       *
*************************************************/

/* PCRE is a library of functions to support regular expressions whose syntax
and semantics are as close as possible to those of the Perl 5 language.

                       Written by Philip Hazel
           Copyright (c) 1997-2006 University of Cambridge

-----------------------------------------------------------------------------
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

    * Redistributions of source code must retain the above copyright notice,
      this list of conditions and the following disclaimer.

    * Redistributions in binary form must reproduce the above copyright
      notice, this list of conditions and the following disclaimer in the
      documentation and/or other materials provided with the distribution.

    * Neither the name of the University of Cambridge nor the names of its
      contributors may be used to endorse or promote products derived from
      this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
ARE DISCLAIMED.
IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
POSSIBILITY OF SUCH DAMAGE.
-----------------------------------------------------------------------------
*/


/* This module contains pcre_exec(), the externally visible function that does
pattern matching using an NFA algorithm, trying to mimic Perl as closely as
possible. There are also some static supporting functions. */

#define NLBLOCK md             /* Block containing newline information */
#define PSSTART start_subject  /* Field containing processed string start */
#define PSEND   end_subject    /* Field containing processed string end */

#include "pcre_internal.h"

/* The chain of eptrblocks for tail recursions uses memory in stack workspace,
obtained at top level, the size of which is defined by EPTR_WORK_SIZE. */

#define EPTR_WORK_SIZE (1000)

/* Flag bits for the match() function */

#define match_condassert     0x01  /* Called to check a condition assertion */
#define match_cbegroup       0x02  /* Could-be-empty unlimited repeat group */
#define match_tail_recursed  0x04  /* Tail recursive call */

/* Non-error returns from the match() function. Error returns are externally
defined PCRE_ERROR_xxx codes, which are all negative. */

#define MATCH_MATCH        1
#define MATCH_NOMATCH      0

/* Maximum number of ints of offset to save on the stack for recursive calls.
If the offset vector is bigger, malloc is used. This should be a multiple of 3,
because the offset vector is always a multiple of 3 long.
*/

#define REC_STACK_SAVE_MAX 30

/* Min and max values for the common repeats; for the maxima, 0 => infinity */

static const char rep_min[] = { 0, 0, 1, 1, 0, 0 };
static const char rep_max[] = { 0, 0, 0, 0, 1, 1 };



#ifdef DEBUG
/*************************************************
*        Debugging function to print chars       *
*************************************************/

/* Print a sequence of chars in printable format, stopping at the end of the
subject if requested.

Arguments:
  p           points to characters
  length      number to print
  is_subject  TRUE if printing from within md->start_subject
  md          pointer to matching data block, if is_subject is TRUE

Returns:     nothing
*/

static void
pchars(const uschar *p, int length, BOOL is_subject, match_data *md)
{
unsigned int c;
if (is_subject && length > md->end_subject - p) length = md->end_subject - p;
while (length-- > 0)
  if (isprint(c = *(p++))) printf("%c", c); else printf("\\x%02x", c);
}
#endif



/*************************************************
*          Match a back-reference                *
*************************************************/

/* If a back reference hasn't been set, the length that is passed is greater
than the number of characters left in the string, so the match fails.

Arguments:
  offset      index into the offset vector
  eptr        points into the subject
  length      length to be matched
  md          points to match data block
  ims         the ims flags

Returns:      TRUE if matched
*/

static BOOL
match_ref(int offset, register USPTR eptr, int length, match_data *md,
  unsigned long int ims)
{
USPTR p = md->start_subject + md->offset_vector[offset];

#ifdef DEBUG
if (eptr >= md->end_subject)
  printf("matching subject <null>");
else
  {
  printf("matching subject ");
  pchars(eptr, length, TRUE, md);
  }
printf(" against backref ");
pchars(p, length, FALSE, md);
printf("\n");
#endif

/* Always fail if not enough characters left */

if (length > md->end_subject - eptr) return FALSE;

/* Separate the caseless case for
speed */

if ((ims & PCRE_CASELESS) != 0)
  {
  while (length-- > 0)
    if (md->lcc[*p++] != md->lcc[*eptr++]) return FALSE;
  }
else
  { while (length-- > 0) if (*p++ != *eptr++) return FALSE; }

return TRUE;
}



/***************************************************************************
****************************************************************************
                   RECURSION IN THE match() FUNCTION

The match() function is highly recursive, though not every recursive call
increases the recursive depth. Nevertheless, some regular expressions can cause
it to recurse to a great depth. I was writing for Unix, so I just let it call
itself recursively. This uses the stack for saving everything that has to be
saved for a recursive call. On Unix, the stack can be large, and this works
fine.

It turns out that on some non-Unix-like systems there are problems with
programs that use a lot of stack. (This despite the fact that every last chip
has oodles of memory these days, and techniques for extending the stack have
been known for decades.) So....

There is a fudge, triggered by defining NO_RECURSE, which avoids recursive
calls by keeping local variables that need to be preserved in blocks of memory
obtained from malloc() instead of on the stack. Macros are used to
achieve this so that the actual code doesn't look very different to what it
always used to.
****************************************************************************
***************************************************************************/


/* These versions of the macros use the stack, as normal. There are debugging
versions and production versions.
*/

#ifndef NO_RECURSE
#define REGISTER register
#ifdef DEBUG
#define RMATCH(rx,ra,rb,rc,rd,re,rf,rg) \
  { \
  printf("match() called in line %d\n", __LINE__); \
  rx = match(ra,rb,rc,rd,re,rf,rg,rdepth+1); \
  printf("to line %d\n", __LINE__); \
  }
#define RRETURN(ra) \
  { \
  printf("match() returned %d from line %d ", ra, __LINE__); \
  return ra; \
  }
#else
#define RMATCH(rx,ra,rb,rc,rd,re,rf,rg) \
  rx = match(ra,rb,rc,rd,re,rf,rg,rdepth+1)
#define RRETURN(ra) return ra
#endif

#else


/* These versions of the macros manage a private stack on the heap. Note
that the rd argument of RMATCH isn't actually used. It's the md argument of
match(), which never changes. */

#define REGISTER

#define RMATCH(rx,ra,rb,rc,rd,re,rf,rg)\
  {\
  heapframe *newframe = (pcre_stack_malloc)(sizeof(heapframe));\
  if (setjmp(frame->Xwhere) == 0)\
    {\
    newframe->Xeptr = ra;\
    newframe->Xecode = rb;\
    newframe->Xoffset_top = rc;\
    newframe->Xims = re;\
    newframe->Xeptrb = rf;\
    newframe->Xflags = rg;\
    newframe->Xrdepth = frame->Xrdepth + 1;\
    newframe->Xprevframe = frame;\
    frame = newframe;\
    DPRINTF(("restarting from line %d\n", __LINE__));\
    goto HEAP_RECURSE;\
    }\
  else\
    {\
    DPRINTF(("longjumped back to line %d\n", __LINE__));\
    frame = md->thisframe;\
    rx = frame->Xresult;\
    }\
  }

#define RRETURN(ra)\
  {\
  heapframe *newframe = frame;\
  frame = newframe->Xprevframe;\
  (pcre_stack_free)(newframe);\
  if (frame != NULL)\
    {\
    frame->Xresult = ra;\
    md->thisframe = frame;\
    longjmp(frame->Xwhere, 1);\
    }\
  return ra;\
  }


/* Structure for remembering the local variables in a private frame */

typedef struct heapframe {
  struct heapframe *Xprevframe;

  /* Function arguments that may change */

  const uschar *Xeptr;
  const uschar *Xecode;
  int Xoffset_top;
  long int Xims;
  eptrblock *Xeptrb;
  int Xflags;
  unsigned int Xrdepth;

  /* Function local variables */

  const uschar *Xcallpat;
  const uschar *Xcharptr;
  const uschar *Xdata;
  const uschar *Xnext;
  const uschar *Xpp;
  const uschar *Xprev;
  const uschar *Xsaved_eptr;

  recursion_info Xnew_recursive;

  BOOL Xcur_is_word;
  BOOL Xcondition;
  BOOL Xprev_is_word;

  unsigned long int Xoriginal_ims;

#ifdef SUPPORT_UCP
  int Xprop_type;
  int Xprop_value;
  int Xprop_fail_result;
  int Xprop_category;
  int Xprop_chartype;
  int Xprop_script;
#endif

  int Xctype;
  unsigned int Xfc;
  int Xfi;
  int Xlength;
  int Xmax;
  int Xmin;
  int Xnumber;
  int Xoffset;
  int Xop;
  int Xsave_capture_last;
  int Xsave_offset1, Xsave_offset2, Xsave_offset3;
  int Xstacksave[REC_STACK_SAVE_MAX];

  eptrblock Xnewptrb;

  /* Place to pass back result, and where to jump back to */

  int  Xresult;
  jmp_buf Xwhere;

} heapframe;

#endif


/***************************************************************************
***************************************************************************/



/*************************************************
*         Match from current position            *
*************************************************/

/* This function is called recursively in many circumstances. Whenever it
returns a negative (error) response, the outer incarnation must also return the
same response.

Performance note: It might be tempting to extract commonly used fields from the
md structure (e.g. utf8, end_subject) into individual variables to improve
performance.
Tests using gcc on a SPARC disproved this; in the first case, it
made performance worse.

Arguments:
   eptr        pointer to current character in subject
   ecode       pointer to current position in compiled code
   offset_top  current top pointer
   md          pointer to "static" info for the match
   ims         current /i, /m, and /s options
   eptrb       pointer to chain of blocks containing eptr at start of
                 brackets - for testing for empty matches
   flags       can contain
                 match_condassert - this is an assertion condition
                 match_cbegroup - this is the start of an unlimited repeat
                   group that can match an empty string
                 match_tail_recursed - this is a tail_recursed group
   rdepth      the recursion depth

Returns:       MATCH_MATCH if matched            )  these values are >= 0
               MATCH_NOMATCH if failed to match  )
               a negative PCRE_ERROR_xxx value if aborted by an error condition
                 (e.g. stopped by repeated call or recursion limit)
*/

static int
match(REGISTER USPTR eptr, REGISTER const uschar *ecode,
  int offset_top, match_data *md, unsigned long int ims, eptrblock *eptrb,
  int flags, unsigned int rdepth)
{
/* These variables do not need to be preserved over recursion in this function,
so they can be ordinary variables in all cases. Mark some of them with
"register" because they are used a lot in loops. */

register int  rrc;         /* Returns from recursive calls */
register int  i;           /* Used for loops not involving calls to RMATCH() */
register unsigned int c;   /* Character values not kept over RMATCH() calls */
register BOOL utf8;        /* Local copy of UTF-8 flag for speed */

BOOL minimize, possessive; /* Quantifier options */

/* When recursion is not being used, all "local" variables that have to be
preserved over calls to RMATCH() are part of a "frame" which is obtained from
heap storage.
Set up the top-level frame here; others are obtained from the
heap whenever RMATCH() does a "recursion". See the macro definitions above. */

#ifdef NO_RECURSE
heapframe *frame = (pcre_stack_malloc)(sizeof(heapframe));
frame->Xprevframe = NULL;            /* Marks the top level */

/* Copy in the original argument variables */

frame->Xeptr = eptr;
frame->Xecode = ecode;
frame->Xoffset_top = offset_top;
frame->Xims = ims;
frame->Xeptrb = eptrb;
frame->Xflags = flags;
frame->Xrdepth = rdepth;

/* This is where control jumps back to in order to effect "recursion" */

HEAP_RECURSE:

/* Macros make the argument variables come from the current frame */

#define eptr               frame->Xeptr
#define ecode              frame->Xecode
#define offset_top         frame->Xoffset_top
#define ims                frame->Xims
#define eptrb              frame->Xeptrb
#define flags              frame->Xflags
#define rdepth             frame->Xrdepth

/* Ditto for the local variables */

#ifdef SUPPORT_UTF8
#define charptr            frame->Xcharptr
#endif
#define callpat            frame->Xcallpat
#define data               frame->Xdata
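
The back-reference check in match_ref() earlier in this listing compares bytes through a lower-casing table when the caseless option is set, after first rejecting any length that runs past the end of the subject. The following is a minimal standalone sketch of that idea, not the PCRE code itself. The names lower_table, init_lower_table and ref_matches are hypothetical, and the table is filled with tolower() as an assumption; PCRE builds its own lcc table elsewhere.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical stand-in for the lcc table: maps each byte to lower case. */
static unsigned char lower_table[256];

/* Fill the table once before any caseless comparison (assumed setup step). */
static void init_lower_table(void)
{
  int i;
  for (i = 0; i < 256; i++) lower_table[i] = (unsigned char)tolower(i);
}

/* Returns 1 if the length bytes at ref equal the length bytes at pos, else 0.
end points one past the last byte of the subject. */
static int ref_matches(const unsigned char *ref, const unsigned char *pos,
  const unsigned char *end, ptrdiff_t length, int caseless)
{
  assert(length >= 0);

  /* Always fail if not enough bytes are left in the subject */
  if (length > end - pos) return 0;

  /* Keep the caseless loop separate so the plain loop stays cheap */
  if (caseless)
    {
    while (length-- > 0)
      if (lower_table[*ref++] != lower_table[*pos++]) return 0;
    }
  else
    {
    while (length-- > 0)
      if (*ref++ != *pos++) return 0;
    }

  return 1;
}
```

Call init_lower_table() once, then pass the captured text as ref and the current subject position as pos. Checking the length first is what makes an oversized length fail cleanly instead of reading past the subject.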

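The NO_RECURSE comments in the listing describe keeping the variables that must survive a recursive call in malloc()ed frames rather than on the C stack. The sketch below illustrates that general idea on a toy function, under stated assumptions: it is not PCRE's implementation, the names frame, push_frame and sum_to are hypothetical, and it uses a per-frame resumed flag in a loop where the listing uses setjmp()/longjmp() and a goto.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical frame: everything one "recursive call" has to preserve. */
typedef struct frame {
  struct frame *prev;        /* caller's frame; NULL marks the top level */
  unsigned int n;            /* the "argument" of this incarnation */
  unsigned long long result; /* value handed back by the callee */
  int resumed;               /* 0 = fresh call, 1 = continuing after a call */
} frame;

/* Allocate a frame for a call with argument n; returns NULL on failure. */
static frame *push_frame(frame *prev, unsigned int n)
{
  frame *f = malloc(sizeof *f);
  if (f != NULL) { f->prev = prev; f->n = n; f->result = 0; f->resumed = 0; }
  return f;
}

/* Computes 0 + 1 + ... + n as sum(n) = n + sum(n - 1) without recursing on
the C stack. Returns 0 on success, -1 if malloc() fails. */
static int sum_to(unsigned int n, unsigned long long *out)
{
  frame *f;
  assert(out != NULL);
  f = push_frame(NULL, n);
  if (f == NULL) return -1;

  for (;;)
    {
    unsigned long long ret;
    frame *done;

    if (!f->resumed && f->n > 0)
      {
      /* "Call" sum(n - 1): note where to continue, then switch frames */
      frame *callee = push_frame(f, f->n - 1);
      if (callee == NULL)
        {
        while (f != NULL) { done = f; f = f->prev; free(done); }
        return -1;
        }
      f->resumed = 1;
      f = callee;
      continue;
      }

    /* Either the base case (n == 0) or the code after the "call" */
    ret = f->resumed ? f->n + f->result : 0;

    /* "Return": free this frame and pass the value back to the caller */
    done = f;
    f = done->prev;
    free(done);
    if (f == NULL) { *out = ret; return 0; }
    f->result = ret;
    }
}
```

With this layout the depth is bounded by available heap rather than by stack size, which is the trade-off the comments in the listing describe. Unlike the RMATCH macro shown above, the sketch checks the allocation result before using the new frame.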