global variables in Perl start with \f(CW\*(C`PL_\*(C'\fR. This tells you whether the
current running program was created with the \f(CW\*(C`\-u\*(C'\fR flag to perl and then
\&\fIundump\fR, which means it's going to be false in any sane context.
.Sp
Line 4 calls a function in \fIperl.c\fR to allocate memory for a Perl
interpreter. It's quite a simple function, and the guts of it look like
this:
.Sp
.Vb 1
\&    my_perl = (PerlInterpreter*)PerlMem_malloc(sizeof(PerlInterpreter));
.Ve
.Sp
Here you see an example of Perl's system abstraction, which we'll see
later: \f(CW\*(C`PerlMem_malloc\*(C'\fR is either your system's \f(CW\*(C`malloc\*(C'\fR, or Perl's
own \f(CW\*(C`malloc\*(C'\fR as defined in \fImalloc.c\fR if you selected that option at
configure time.
.Sp
Next, in line 7, we construct the interpreter; this sets up all the
special variables that Perl needs, the stacks, and so on.
.Sp
Now we pass Perl the command line options, and tell it to go:
.Sp
.Vb 4
\&    exitstatus = perl_parse(my_perl, xs_init, argc, argv, (char **)NULL);
\&    if (!exitstatus) {
\&        exitstatus = perl_run(my_perl);
\&    }
.Ve
.Sp
\&\f(CW\*(C`perl_parse\*(C'\fR is actually a wrapper around \f(CW\*(C`S_parse_body\*(C'\fR, as defined
in \fIperl.c\fR, which processes the command line options, sets up any
statically linked \s-1XS\s0 modules, opens the program and calls \f(CW\*(C`yyparse\*(C'\fR to
parse it.
.IP "Parsing" 3
.IX Item "Parsing"
The aim of this stage is to take the Perl source, and turn it into an op
tree. We'll see what one of those looks like later. Strictly speaking,
there are three things going on here.
.Sp
\&\f(CW\*(C`yyparse\*(C'\fR, the parser, lives in \fIperly.c\fR, although you're better off
reading the original \s-1YACC\s0 input in \fIperly.y\fR. (Yes, Virginia, there
\&\fBis\fR a \s-1YACC\s0 grammar for Perl!)
The job of the parser is to take your
code and \*(L"understand\*(R" it, splitting it into sentences, deciding which
operands go with which operators and so on.
.Sp
The parser is nobly assisted by the lexer, which chunks up your input
into tokens, and decides what type of thing each token is: a variable
name, an operator, a bareword, a subroutine, a core function, and so on.
The main point of entry to the lexer is \f(CW\*(C`yylex\*(C'\fR, and that and its
associated routines can be found in \fItoke.c\fR. Perl isn't much like
other computer languages; it's highly context sensitive at times, it can
be tricky to work out what sort of token something is, or where a token
ends. As such, there's a lot of interplay between the tokeniser and the
parser, which can get pretty frightening if you're not used to it.
.Sp
As the parser understands a Perl program, it builds up a tree of
operations for the interpreter to perform during execution. The routines
which construct and link together the various operations are to be found
in \fIop.c\fR, and will be examined later.
.IP "Optimization" 3
.IX Item "Optimization"
Now the parsing stage is complete, and the finished tree represents
the operations that the Perl interpreter needs to perform to execute our
program. Next, Perl does a dry run over the tree looking for
optimisations: constant expressions such as \f(CW\*(C`3 + 4\*(C'\fR will be computed
now, and the optimizer will also see if any multiple operations can be
replaced with a single one. For instance, to fetch the variable \f(CW$foo\fR,
instead of grabbing the glob \f(CW*foo\fR and looking at the scalar
component, the optimizer fiddles the op tree to use a function which
directly looks up the scalar in question. The main optimizer is \f(CW\*(C`peep\*(C'\fR
in \fIop.c\fR, and many ops have their own optimizing functions.
.IP "Running" 3
.IX Item "Running"
Now we're finally ready to go: we have compiled Perl byte code, and all
that's left to do is run it.
The actual execution is done by the
\&\f(CW\*(C`runops_standard\*(C'\fR function in \fIrun.c\fR; more specifically, it's done by
these three innocent looking lines:
.Sp
.Vb 3
\&    while ((PL_op = CALL_FPTR(PL_op\->op_ppaddr)(aTHX))) {
\&        PERL_ASYNC_CHECK();
\&    }
.Ve
.Sp
You may be more comfortable with the Perl version of that:
.Sp
.Vb 1
\&    PERL_ASYNC_CHECK() while $Perl::op = &{$Perl::op\->{function}};
.Ve
.Sp
Well, maybe not. Anyway, each op contains a function pointer, which
stipulates the function which will actually carry out the operation.
This function will return the next op in the sequence \- this allows for
things like \f(CW\*(C`if\*(C'\fR which choose the next op dynamically at run time.
The \f(CW\*(C`PERL_ASYNC_CHECK\*(C'\fR makes sure that things like signals interrupt
execution if required.
.Sp
The actual functions called are known as \s-1PP\s0 code, and they're spread
between four files: \fIpp_hot.c\fR contains the \*(L"hot\*(R" code, which is most
often used and highly optimized, \fIpp_sys.c\fR contains all the
system-specific functions, \fIpp_ctl.c\fR contains the functions which
implement control structures (\f(CW\*(C`if\*(C'\fR, \f(CW\*(C`while\*(C'\fR and the like) and \fIpp.c\fR
contains everything else. These are, if you like, the C code for Perl's
built-in functions and operators.
.Sp
Note that each \f(CW\*(C`pp_\*(C'\fR function is expected to return a pointer to the next
op. Calls to perl subs (and eval blocks) are handled within the same
runops loop, and do not consume extra space on the C stack. For example,
\&\f(CW\*(C`pp_entersub\*(C'\fR and \f(CW\*(C`pp_entertry\*(C'\fR just push a \f(CW\*(C`CxSUB\*(C'\fR or \f(CW\*(C`CxEVAL\*(C'\fR block
struct onto the context stack which contains the address of the op
following the sub call or eval. They then return the first op of that sub
or eval block, and so execution continues in that sub or block.
Later, a
\&\f(CW\*(C`pp_leavesub\*(C'\fR or \f(CW\*(C`pp_leavetry\*(C'\fR op pops the \f(CW\*(C`CxSUB\*(C'\fR or \f(CW\*(C`CxEVAL\*(C'\fR,
retrieves the return op from it, and returns it.
.IP "Exception handling" 3
.IX Item "Exception handling"
Perl's exception handling (i.e. \f(CW\*(C`die\*(C'\fR etc.) is built on top of the low-level
\&\f(CW\*(C`setjmp()\*(C'\fR/\f(CW\*(C`longjmp()\*(C'\fR C\-library functions. These basically provide a
way to capture the current \s-1PC\s0 and \s-1SP\s0 registers and later restore them; i.e.
a \f(CW\*(C`longjmp()\*(C'\fR continues at the point in code where a previous \f(CW\*(C`setjmp()\*(C'\fR
was done, with anything further up on the C stack being lost. This is why
code should always save values using \f(CW\*(C`SAVE_FOO\*(C'\fR rather than in auto
variables.
.Sp
The perl core wraps \f(CW\*(C`setjmp()\*(C'\fR etc. in the macros \f(CW\*(C`JMPENV_PUSH\*(C'\fR and
\&\f(CW\*(C`JMPENV_JUMP\*(C'\fR. The basic rule of perl exceptions is that \f(CW\*(C`exit\*(C'\fR, and
\&\f(CW\*(C`die\*(C'\fR (in the absence of \f(CW\*(C`eval\*(C'\fR) perform a \f(CWJMPENV_JUMP(2)\fR, while
\&\f(CW\*(C`die\*(C'\fR within \f(CW\*(C`eval\*(C'\fR does a \f(CWJMPENV_JUMP(3)\fR.
.Sp
Entry points to perl, such as \f(CW\*(C`perl_parse()\*(C'\fR, \f(CW\*(C`perl_run()\*(C'\fR and
\&\f(CW\*(C`call_sv(cv, G_EVAL)\*(C'\fR, each do a \f(CW\*(C`JMPENV_PUSH\*(C'\fR, then enter a runops
loop or whatever, and handle possible exception returns. For a 2 return,
final cleanup is performed, such as popping stacks and calling \f(CW\*(C`CHECK\*(C'\fR or
\&\f(CW\*(C`END\*(C'\fR blocks. Amongst other things, this is how scope cleanup still
occurs during an \f(CW\*(C`exit\*(C'\fR.
.Sp
If a \f(CW\*(C`die\*(C'\fR can find a \f(CW\*(C`CxEVAL\*(C'\fR block on the context stack, then the
stack is popped to that level and the return op in that block is assigned
to \f(CW\*(C`PL_restartop\*(C'\fR; then a \f(CWJMPENV_JUMP(3)\fR is performed. This normally
passes control back to the guard.
In the case of \f(CW\*(C`perl_run\*(C'\fR and
\&\f(CW\*(C`call_sv\*(C'\fR, a non-null \f(CW\*(C`PL_restartop\*(C'\fR triggers re-entry to the runops
loop. This is the normal way that \f(CW\*(C`die\*(C'\fR or \f(CW\*(C`croak\*(C'\fR is handled within an
\&\f(CW\*(C`eval\*(C'\fR.
.Sp
Sometimes ops are executed within an inner runops loop, such as tie, sort
or overload code. In this case, something like
.Sp
.Vb 1
\&    sub FETCH { eval { die } }
.Ve
.Sp
would cause a longjmp right back to the guard in \f(CW\*(C`perl_run\*(C'\fR, popping both
runops loops, which is clearly incorrect. One way to avoid this is for the
tie code to do a \f(CW\*(C`JMPENV_PUSH\*(C'\fR before executing \f(CW\*(C`FETCH\*(C'\fR in the inner
runops loop, but for efficiency reasons, perl in fact just sets a flag,
using \f(CW\*(C`CATCH_SET(TRUE)\*(C'\fR. The \f(CW\*(C`pp_require\*(C'\fR, \f(CW\*(C`pp_entereval\*(C'\fR and
\&\f(CW\*(C`pp_entertry\*(C'\fR ops check this flag, and if true, they call \f(CW\*(C`docatch\*(C'\fR,
which does a \f(CW\*(C`JMPENV_PUSH\*(C'\fR and starts a new runops level to execute the
code, rather than doing it on the current loop.
.Sp
As a further optimisation, on exit from the eval block in the \f(CW\*(C`FETCH\*(C'\fR,
execution of the code following the block is still carried on in the inner
loop. When an exception is raised, \f(CW\*(C`docatch\*(C'\fR compares the \f(CW\*(C`JMPENV\*(C'\fR
level of the \f(CW\*(C`CxEVAL\*(C'\fR with \f(CW\*(C`PL_top_env\*(C'\fR and if they differ, just
re-throws the exception. In this way any inner loops get popped.
.Sp
Here's an example.
.Sp
.Vb 5
\&    1:   eval { tie @a, \*(AqA\*(Aq };
\&    2:   sub A::TIEARRAY {
\&    3:       eval { die };
\&    4:       die;
\&    5:   }
.Ve
.Sp
To run this code, \f(CW\*(C`perl_run\*(C'\fR is called, which does a \f(CW\*(C`JMPENV_PUSH\*(C'\fR then
enters a runops loop.
This loop executes the eval and tie ops on line 1,
with the eval pushing a \f(CW\*(C`CxEVAL\*(C'\fR onto the context stack.
.Sp
The \f(CW\*(C`pp_tie\*(C'\fR does a \f(CW\*(C`CATCH_SET(TRUE)\*(C'\fR, then starts a second runops loop
to execute the body of \f(CW\*(C`TIEARRAY\*(C'\fR. When it executes the entertry op on
line 3, \f(CW\*(C`CATCH_GET\*(C'\fR is true, so \f(CW\*(C`pp_entertry\*(C'\fR calls \f(CW\*(C`docatch\*(C'\fR which
does a \f(CW\*(C`JMPENV_PUSH\*(C'\fR and starts a third runops loop, which then executes
the die op. At this point the C call stack looks like this:
.Sp
.Vb 10
\&    Perl_pp_die
\&    Perl_runops # third loop
\&    S_docatch_body
\&    S_docatch
\&    Perl_pp_entertry
\&    Perl_runops # second loop
\&    S_call_body
\&    Perl_call_sv
\&    Perl_pp_tie
\&    Perl_runops # first loop
\&    S_run_body
\&    perl_run
\&    main
.Ve
.Sp
and the context and data stacks, as shown by \f(CW\*(C`\-Dstv\*(C'\fR, look like:
.Sp
.Vb 9
\&    STACK 0: MAIN
\&      CX 0: BLOCK =>
\&      CX 1: EVAL => AV() PV("A"\e0)
\&      retop=leave
\&    STACK 1: MAGIC
\&      CX 0: SUB =>
\&      retop=(null)
\&      CX 1: EVAL => *
\&      retop=nextstate
.Ve
.Sp
The die pops the first \f(CW\*(C`CxEVAL\*(C'\fR off the context stack, sets
\&\f(CW\*(C`PL_restartop\*(C'\fR from it, does a \f(CWJMPENV_JUMP(3)\fR, and control returns to
the top \f(CW\*(C`docatch\*(C'\fR. This then starts another third-level runops level,
which executes the nextstate, pushmark and die ops on line 4. At the point
that the second \f(CW\*(C`pp_die\*(C'\fR is called, the C call stack looks exactly like
that above, even though we are no longer within an inner eval; this is
because of the optimization mentioned earlier.
However, the context stack
now looks like this, i.e. with the top CxEVAL popped:
.Sp
.Vb 7
\&    STACK 0: MAIN
\&      CX 0: BLOCK =>
\&      CX 1: EVAL => AV() PV("A"\e0)
\&      retop=leave
\&    STACK 1: MAGIC
\&      CX 0: SUB =>
\&      retop=(null)
.Ve
.Sp
The die on line 4 pops the context stack back down to the CxEVAL, leaving
it as:
.Sp
.Vb 2
\&    STACK 0: MAIN
\&      CX 0: BLOCK =>
.Ve
.Sp
As usual, \f(CW\*(C`PL_restartop\*(C'\fR is extracted from the \f(CW\*(C`CxEVAL\*(C'\fR, and a
\&\f(CWJMPENV_JUMP(3)\fR is done, which pops the C stack back to the docatch:
.Sp
.Vb 10
\&    S_docatch
\&    Perl_pp_entertry
\&    Perl_runops # second loop
\&    S_call_body
\&    Perl_call_sv
\&    Perl_pp_tie
\&    Perl_runops # first loop
\&    S_run_body
\&    perl_run
\&    main
.Ve
.Sp
In this case, because the \f(CW\*(C`JMPENV\*(C'\fR level recorded in the \f(CW\*(C`CxEVAL\*(C'\fR
differs from the current one, \f(CW\*(C`docatch\*(C'\fR just does a \f(CWJMPENV_JUMP(3)\fR
and the C stack unwinds to:
.Sp
.Vb 2
\&    perl_run
\&    main
.Ve
.Sp
Because \f(CW\*(C`PL_restartop\*(C'\fR is non-null, \f(CW\*(C`run_body\*(C'\fR starts a new runops loop
and execution continues.
.Sh "Internal Variable Types"
.IX Subsection "Internal Variable Types"
You should by now have had a look at perlguts, which tells you about
Perl's internal variable types: SVs, HVs, AVs and the rest. If not, do
that now.
.PP
These variables are used not only to represent Perl-space variables, but
also any constants in the code, as well as some structures completely
internal to Perl. The symbol table, for instance, is an ordinary Perl
hash. Your code is represented by an \s-1SV\s0 as it's read into the parser;
any program files you call are opened via ordinary Perl filehandles, and
so on.
.PP
The core Devel::Peek module lets us examine SVs from a
Perl program. Let's see, for instance, how Perl treats the constant
\&\f(CW"hello"\fR.
.PP
.Vb 7
\&    % perl \-MDevel::Peek \-e \*(AqDump("hello")\*(Aq
\&    1 SV = PV(0xa041450) at 0xa04ecbc
\&    2 REFCNT = 1
\&    3 FLAGS = (POK,READONLY,pPOK)