\&    # [Special] alternatives
\&    ANY        no      Match any one character (except newline).
\&    SANY       no      Match any one character.
\&    ANYOF      sv      Match character in (or not in) this class.
\&    ALNUM      no      Match any alphanumeric character
\&    ALNUML     no      Match any alphanumeric char in locale
\&    NALNUM     no      Match any non\-alphanumeric character
\&    NALNUML    no      Match any non\-alphanumeric char in locale
\&    SPACE      no      Match any whitespace character
\&    SPACEL     no      Match any whitespace char in locale
\&    NSPACE     no      Match any non\-whitespace character
\&    NSPACEL    no      Match any non\-whitespace char in locale
\&    DIGIT      no      Match any numeric character
\&    NDIGIT     no      Match any non\-numeric character
\&
\&    # BRANCH          The set of branches constituting a single choice are hooked
\&    #                 together with their "next" pointers, since precedence prevents
\&    #                 anything being concatenated to any individual branch.  The
\&    #                 "next" pointer of the last BRANCH in a choice points to the
\&    #                 thing following the whole choice.  This is also where the
\&    #                 final "next" pointer of each individual branch points; each
\&    #                 branch starts with the operand node of a BRANCH node.
\&    #
\&    BRANCH     node    Match this alternative, or the next...
\&
\&    # BACK            Normal "next" pointers all implicitly point forward; BACK
\&    #                 exists to make loop structures possible.
\&    # not used
\&    BACK       no      Match "", "next" ptr points backward.
\&
\&    # Literals
\&    EXACT      sv      Match this string (preceded by length).
\&    EXACTF     sv      Match this string, folded (prec. by length).
\&    EXACTFL    sv      Match this string, folded in locale (w/len).
\&
\&    # Do nothing
\&    NOTHING    no      Match empty string.
\&    # A variant of above which delimits a group, thus stops optimizations
\&    TAIL       no      Match empty string. Can jump here from outside.
\&
\&    # STAR,PLUS       \*(Aq?\*(Aq, and complex \*(Aq*\*(Aq and \*(Aq+\*(Aq, are implemented as circular
\&    #                 BRANCH structures using BACK.
\&    #                 Simple cases (one character
\&    #                 per match) are implemented with STAR and PLUS for speed
\&    #                 and to minimize recursive plunges.
\&    #
\&    STAR       node    Match this (simple) thing 0 or more times.
\&    PLUS       node    Match this (simple) thing 1 or more times.
\&
\&    CURLY      sv 2    Match this simple thing {n,m} times.
\&    CURLYN     no 2    Match next\-after\-this simple thing
\&    #                  {n,m} times, set parens.
\&    CURLYM     no 2    Match this medium\-complex thing {n,m} times.
\&    CURLYX     sv 2    Match this complex thing {n,m} times.
\&
\&    # This terminator creates a loop structure for CURLYX
\&    WHILEM     no      Do curly processing and see if rest matches.
\&
\&    # OPEN,CLOSE,GROUPP       ...are numbered at compile time.
\&    OPEN       num 1   Mark this point in input as start of #n.
\&    CLOSE      num 1   Analogous to OPEN.
\&
\&    REF        num 1   Match some already matched string
\&    REFF       num 1   Match already matched string, folded
\&    REFFL      num 1   Match already matched string, folded in loc.
\&
\&    # grouping assertions
\&    IFMATCH    off 1 2 Succeeds if the following matches.
\&    UNLESSM    off 1 2 Fails if the following matches.
\&    SUSPEND    off 1 1 "Independent" sub\-regex.
\&    IFTHEN     off 1 1 Switch, should be preceded by switcher.
\&    GROUPP     num 1   Whether the group matched.
\&
\&    # Support for long regex
\&    LONGJMP    off 1 1 Jump far away.
\&    BRANCHJ    off 1 1 BRANCH with long offset.
\&
\&    # The heavy worker
\&    EVAL       evl 1   Execute some Perl code.
\&
\&    # Modifiers
\&    MINMOD     no      Next operator is not greedy.
\&    LOGICAL    no      Next opcode should set the flag only.
\&
\&    # This is not used yet
\&    RENUM      off 1 1 Group with independently numbered parens.
\&
\&    # This is not really a node, but an optimized away piece of a "long" node.
\&    # To simplify debugging output, we mark it as if it were a node
\&    OPTIMIZED  off     Placeholder for dump.
.Ve
.PP
Following the optimizer information is a dump of the offset/length
table, here split across several lines:
.PP
.Vb 5
\&    Offsets: [45]
\&    1[4] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 5[1]
\&    0[0] 12[1] 0[0] 6[1] 0[0] 7[1] 0[0] 9[1] 8[1] 0[0] 10[1] 0[0]
\&    11[1] 0[0] 12[0]
\&    12[0] 13[1] 0[0] 14[4] 0[0] 0[0] 0[0] 0[0]
\&    0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 18[1] 0[0] 19[1] 20[0]
.Ve
.PP
The first line here indicates that the offset/length table contains 45
entries.  Each entry is a pair of integers, denoted by \f(CW\*(C`offset[length]\*(C'\fR.
Entries are numbered starting with 1, so entry #1 here is \f(CW\*(C`1[4]\*(C'\fR and
entry #12 is \f(CW\*(C`5[1]\*(C'\fR.  \f(CW\*(C`1[4]\*(C'\fR indicates that the node labeled \f(CW\*(C`1:\*(C'\fR
(the \f(CW\*(C`1: ANYOF[bc]\*(C'\fR) begins at character position 1 in the
pre-compiled form of the regex, and has a length of 4 characters.
\&\f(CW\*(C`5[1]\*(C'\fR in position 12 indicates that the node labeled \f(CW\*(C`12:\*(C'\fR
(the \f(CW\*(C`12: EXACT <d>\*(C'\fR) begins at character position 5 in the
pre-compiled form of the regex, and has a length of 1 character.
\&\f(CW\*(C`12[1]\*(C'\fR in position 14 indicates that the node labeled \f(CW\*(C`14:\*(C'\fR
(the \f(CW\*(C`14: CURLYX[0] {1,32767}\*(C'\fR) begins at character position 12 in the
pre-compiled form of the regex, and has a length of 1 character\-\-\-that
is, it corresponds to the \f(CW\*(C`+\*(C'\fR symbol in the precompiled regex.
.PP
\&\f(CW\*(C`0[0]\*(C'\fR items indicate that there is no corresponding node.
.Sh "Run-time output"
.IX Subsection "Run-time output"
First of all, when doing a match, one may get no run-time output even
if debugging is enabled.
This means that the regex engine was never
entered and that all of the job was therefore done by the optimizer.
.PP
If the regex engine was entered, the output may look like this:
.PP
.Vb 10
\&  Matching \`[bc]d(ef*g)+h[ij]k$\*(Aq against \`abcdefg_\|_gh_\|_\*(Aq
\&    Setting an EVAL scope, savestack=3
\&     2 <ab> <cdefg_\|_gh_>    |  1: ANYOF
\&     3 <abc> <defg_\|_gh_>    | 11: EXACT <d>
\&     4 <abcd> <efg_\|_gh_>    | 13: CURLYX {1,32767}
\&     4 <abcd> <efg_\|_gh_>    | 26:   WHILEM
\&                                0 out of 1..32767  cc=effff31c
\&     4 <abcd> <efg_\|_gh_>    | 15:     OPEN1
\&     4 <abcd> <efg_\|_gh_>    | 17:     EXACT <e>
\&     5 <abcde> <fg_\|_gh_>    | 19:     STAR
\&                             EXACT <f> can match 1 times out of 32767...
\&    Setting an EVAL scope, savestack=3
\&     6 <bcdef> <g_\|_gh_\|_>    | 22:       EXACT <g>
\&     7 <bcdefg> <_\|_gh_\|_>    | 24:       CLOSE1
\&     7 <bcdefg> <_\|_gh_\|_>    | 26:       WHILEM
\&                                    1 out of 1..32767  cc=effff31c
\&    Setting an EVAL scope, savestack=12
\&     7 <bcdefg> <_\|_gh_\|_>    | 15:         OPEN1
\&     7 <bcdefg> <_\|_gh_\|_>    | 17:         EXACT <e>
\&       restoring \e1 to 4(4)..7
\&                                    failed, try continuation...
\&     7 <bcdefg> <_\|_gh_\|_>    | 27:         NOTHING
\&     7 <bcdefg> <_\|_gh_\|_>    | 28:         EXACT <h>
\&                                    failed...
\&                                failed...
.Ve
.PP
The most significant information in the output is about the particular \fInode\fR
of the compiled regex that is currently being tested against the target string.
The format of these lines is
.PP
\&\f(CW\*(C`STRING\-OFFSET <PRE\-STRING> <POST\-STRING> |ID: TYPE\*(C'\fR
.PP
The \fI\s-1TYPE\s0\fR info is indented with respect to the backtracking level.
Other incidental information appears interspersed within.
.SH "Debugging Perl memory usage"
.IX Header "Debugging Perl memory usage"
Perl is a profligate wastrel when it comes to memory use.  There
is a saying that to estimate memory usage of Perl, assume a reasonable
algorithm for memory allocation, multiply that estimate by 10, and
while you still may miss the mark, at least you won't be quite so
astonished.
This is not absolutely true, but may provide a good
grasp of what happens.
.PP
Assume that an integer cannot take less than 20 bytes of memory, a
float cannot take less than 24 bytes, a string cannot take less
than 32 bytes (all these examples assume 32\-bit architectures; the
results are quite a bit worse on 64\-bit architectures).  If a variable
is accessed in two of three different ways (which require an integer,
a float, or a string), the memory footprint may increase yet another
20 bytes.  A sloppy \fImalloc\fR\|(3) implementation can inflate these
numbers dramatically.
.PP
On the opposite end of the scale, a declaration like
.PP
.Vb 1
\&  sub foo;
.Ve
.PP
may take up to 500 bytes of memory, depending on which release of Perl
you're running.
.PP
Anecdotal estimates of source-to-compiled code bloat suggest an
eightfold increase.  This means that the compiled form of reasonable
(normally commented, properly indented, etc.) code will take
about eight times more space in memory than the code took
on disk.
.PP
The \fB\-DL\fR command-line switch has been obsolete since circa Perl 5.6.0
(it was available only if Perl was built with \f(CW\*(C`\-DDEBUGGING\*(C'\fR).
The switch was used to track Perl's memory allocations and possible
memory leaks.  These days the use of malloc debugging tools like
\&\fIPurify\fR or \fIvalgrind\fR is suggested instead.  See also
\&\*(L"\s-1PERL_MEM_LOG\s0\*(R" in perlhack.
.PP
One way to find out how much memory is being used by Perl data
structures is to install the Devel::Size module from \s-1CPAN:\s0 it gives
you the minimum number of bytes required to store a particular data
structure.
Please be mindful of the difference between the \fIsize()\fR
and \fItotal_size()\fR.
.PP
If Perl has been compiled using Perl's malloc you can analyze Perl
memory usage by setting the \f(CW$ENV\fR{\s-1PERL_DEBUG_MSTATS\s0}.
.ie n .Sh "Using $ENV{PERL_DEBUG_MSTATS}"
.el .Sh "Using \f(CW$ENV{PERL_DEBUG_MSTATS}\fP"
.IX Subsection "Using $ENV{PERL_DEBUG_MSTATS}"
If your perl is using Perl's \fImalloc()\fR and was compiled with the
necessary switches (this is the default), then it will print memory
usage statistics after compiling your code when
\&\f(CW\*(C`$ENV{PERL_DEBUG_MSTATS} > 1\*(C'\fR, and before termination of the
program when \f(CW\*(C`$ENV{PERL_DEBUG_MSTATS} >= 1\*(C'\fR.  The report format is similar to
the following example:
.PP
.Vb 10
\&  $ PERL_DEBUG_MSTATS=2 perl \-e "require Carp"
\&  Memory allocation statistics after compilation: (buckets 4(4)..8188(8192)
\&     14216 free:   130   117    28     7     9   0   2     2   1 0 0
\&                437    61    36     0     5
\&     60924 used:   125   137   161    55     7   8   6    16   2 0 1
\&                 74   109   304    84    20
\&  Total sbrk(): 77824/21:119. Odd ends: pad+heads+chain+tail: 0+636+0+2048.
\&  Memory allocation statistics after execution:   (buckets 4(4)..8188(8192)
\&     30888 free:   245    78    85    13     6   2   1     3   2 0 1
\&                315   162    39    42    11
\&    175816 used:   265   176  1112   111    26  22  11    27   2 1 1
\&                196   178  1066   798    39
\&  Total sbrk(): 215040/47:145. Odd ends: pad+heads+chain+tail: 0+2192+0+6144.
.Ve
.PP
It is possible to ask for such a statistic at arbitrary points in
your execution using the \fImstat()\fR function out of the standard
Devel::Peek module.
.PP
Here is some explanation of that format:
.ie n .IP """buckets SMALLEST(APPROX)..GREATEST(APPROX)""" 4
.el .IP "\f(CWbuckets SMALLEST(APPROX)..GREATEST(APPROX)\fR" 4
.IX Item "buckets SMALLEST(APPROX)..GREATEST(APPROX)"
Perl's \fImalloc()\fR uses bucketed allocations.
Every request is rounded
up to the closest bucket size available, and a bucket is taken from
the pool of buckets of that size.
.Sp
The line above describes the limits of buckets currently in use.
Each bucket has two sizes: memory footprint and the maximal size
of user data that can fit into this bucket.  Suppose in the above
example that the smallest bucket were size 4.  The biggest bucket
would have usable size 8188, and the memory footprint would be 8192.
.Sp
In a Perl built for debugging, some buckets may have negative usable
size.  This means that these buckets cannot (and will not) be used.
For larger buckets, the memory footprint may be one page greater
than a power of 2.  If so, the corresponding power of two is
printed in the \f(CW\*(C`APPROX\*(C'\fR field above.
.IP "Free/Used" 4
.IX Item "Free/Used"
The 1 or 2 rows of numbers following that correspond to the number
of buckets of each size between \f(CW\*(C`SMALLEST\*(C'\fR and \f(CW\*(C`GREATEST\*(C'\fR.  In
the first row, the sizes (memory footprints) of buckets are powers
of two\*(--or possibly one page greater.  In the second row, if present,
the memory footprints of the buckets are between the memory footprints
of two buckets \*(L"above\*(R".
.Sp
For example, suppose under the previous example, the memory footprints
were
.Sp
.Vb 2
\&     free:    8     16    32    64    128  256 512 1024 2048 4096 8192
\&           4     12    24    48    80
.Ve
.Sp
With non\-\f(CW\*(C`DEBUGGING\*(C'\fR perl, the buckets starting from \f(CW128\fR have
a 4\-byte overhead, and thus an 8192\-long bucket may take up to
8188\-byte allocations.
.ie n .IP """Total sbrk(): SBRKed/SBRKs:CONTINUOUS""" 4
.el .IP "\f(CWTotal sbrk(): SBRKed/SBRKs:CONTINUOUS\fR" 4
.IX Item "Total sbrk(): SBRKed/SBRKs:CONTINUOUS"
The first two fields give the total amount of memory perl \fIsbrk\fR\|(2)ed
(ess-broken? :\-) and number of \fIsbrk\fR\|(2)s used.  The third number is
what perl thinks about continuity of returned chunks.
So long as
this number is positive, \fImalloc()\fR will assume that it is probable
that \fIsbrk\fR\|(2) will provide continuous memory.
.Sp
Memory allocated by external libraries is not counted.
.ie n .IP """pad: 0""" 4
.el .IP "\f(CWpad: 0\fR" 4
.IX Item "pad: 0"
The amount of \fIsbrk\fR\|(2)ed memory needed to keep buckets aligned.
.ie n .IP """heads: 2192""" 4
.el .IP "\f(CWheads: 2192\fR" 4
.IX Item "heads: 2192"
Although memory overhead of bigger buckets is kept inside the bucket, for
smaller buckets, it is kept in separate areas.  This field gives the
total size of these areas.
.ie n .IP """chain: 0""" 4
.el .IP "\f(CWchain: 0\fR" 4
.IX Item "chain: 0"
\&\fImalloc()\fR may want to subdivide a bigger bucket into smaller buckets.
If only a part of the deceased bucket is left unsubdivided, the rest
is kept as an element of a linked list.  This field gives the total
size of these chunks.
.ie n .IP """tail: 6144""" 4
.el .IP "\f(CWtail: 6144\fR" 4
.IX Item "tail: 6144"
To minimize the number of \fIsbrk\fR\|(2)s, \fImalloc()\fR asks for more memory.  This
field gives the size of the yet unused part, which is \fIsbrk\fR\|(2)ed, but
never touched.
.SH "SEE ALSO"
.IX Header "SEE ALSO"
perldebug,
perlguts,
perlrun,
re,
and
Devel::DProf.
.SH "POD ERRORS"
.IX Header "POD ERRORS"
Hey! \fBThe above document had some coding errors, which are explained below:\fR
.IP "Around line 531:" 4
.IX Item "Around line 531:"
Unterminated C<...> sequence
.IP "Around line 715:" 4
.IX Item "Around line 715:"
Unterminated C<...> sequence