📄 perlhack.pod
      SETi(-TOPi);

Just set the integer value of the top stack entry to its negation.

Argument stack manipulation in the core is exactly the same as it is in
XSUBs - see L<perlxstut>, L<perlxs> and L<perlguts> for a longer
description of the macros used in stack manipulation.

=item Mark stack

I say `your portion of the stack' above because PP code doesn't
necessarily get the whole stack to itself: if your function calls
another function, you'll only want to expose the arguments aimed for the
called function, and not (necessarily) let it get at your own data. The
way we do this is to have a `virtual' bottom-of-stack, exposed to each
function. The mark stack keeps bookmarks to locations in the argument
stack usable by each function. For instance, when dealing with a tied
variable (internally, something with `P' magic), Perl has to call
methods for accesses to the tied variable. However, we need to separate
the arguments exposed to the method from the arguments exposed to the
original function - the store or fetch or whatever it may be. Here's how
the tied C<push> is implemented; see C<av_push> in F<av.c>:

     1	PUSHMARK(SP);
     2	EXTEND(SP,2);
     3	PUSHs(SvTIED_obj((SV*)av, mg));
     4	PUSHs(val);
     5	PUTBACK;
     6	ENTER;
     7	call_method("PUSH", G_SCALAR|G_DISCARD);
     8	LEAVE;
     9	POPSTACK;

The lines which concern the mark stack are the first, fifth and last
lines: they save away, restore and remove the current position of the
argument stack.

Let's examine the whole implementation, for practice:

     1	PUSHMARK(SP);

Push the current state of the stack pointer onto the mark stack. This is
so that when we've finished adding items to the argument stack, Perl
knows how many things we've added recently.

     2	EXTEND(SP,2);
     3	PUSHs(SvTIED_obj((SV*)av, mg));
     4	PUSHs(val);

We're going to add two more items onto the argument stack: when you have
a tied array, the C<PUSH> subroutine receives the object and the value
to be pushed, and that's exactly what we have here - the tied object,
retrieved with C<SvTIED_obj>, and the value, the SV C<val>.


     5	PUTBACK;

Next we tell Perl to make the change to the global stack pointer: C<dSP>
only gave us a local copy, not a reference to the global.

     6	ENTER;
     7	call_method("PUSH", G_SCALAR|G_DISCARD);
     8	LEAVE;

C<ENTER> and C<LEAVE> localise a block of code - they make sure that all
variables are tidied up, everything that has been localised gets
its previous value returned, and so on. Think of them as the C<{> and
C<}> of a Perl block.

To actually do the magic method call, we have to call a subroutine in
Perl space: C<call_method> takes care of that, and it's described in
L<perlcall>. We call the C<PUSH> method in scalar context, and we're
going to discard its return value.

     9	POPSTACK;

Finally, we remove the value we placed on the mark stack, since we
don't need it any more.

=item Save stack

C has no equivalent of Perl's C<local> (dynamic scoping), so perl
provides one. We've seen that C<ENTER> and C<LEAVE> are used as scoping
braces; the save stack implements the C equivalent of, for example:

    {
        local $foo = 42;
        ...
    }

See L<perlguts/Localising Changes> for how to use the save stack.

=back

=head2 Millions of Macros

One thing you'll notice about the Perl source is that it's full of
macros. Some have called the pervasive use of macros the hardest thing
to understand; others find it adds to clarity. Let's take an example,
the code which implements the addition operator:

   1  PP(pp_add)
   2  {
   3      dSP; dATARGET; tryAMAGICbin(add,opASSIGN);
   4      {
   5        dPOPTOPnnrl_ul;
   6        SETn( left + right );
   7        RETURN;
   8      }
   9  }

Every line here (apart from the braces, of course) contains a macro. The
first line sets up the function declaration as Perl expects for PP code;
line 3 sets up variable declarations for the argument stack and the
target, the return value of the operation.
Finally, it tries to see if
the addition operation is overloaded; if so, the appropriate subroutine
is called.

Line 5 is another variable declaration - all variable declarations start
with C<d> - which pops from the top of the argument stack two NVs (hence
C<nn>) and puts them into the variables C<right> and C<left>, hence the
C<rl>. These are the two operands to the addition operator. Next, we
call C<SETn> to set the NV of the return value to the result of adding
the two values. This done, we return - the C<RETURN> macro makes sure
that our return value is properly handled, and we pass the next operator
to run back to the main run loop.

Most of these macros are explained in L<perlapi>, and some of the more
important ones are explained in L<perlxs> as well. Pay special attention
to L<perlguts/Background and PERL_IMPLICIT_CONTEXT> for information on
the C<[pad]THX_?> macros.

=head2 Poking at Perl

To really poke around with Perl, you'll probably want to build Perl for
debugging, like this:

    ./Configure -d -D optimize=-g
    make

C<-g> is a flag to the C compiler to have it produce debugging
information which will allow us to step through a running program.
F<Configure> will also turn on the C<DEBUGGING> compilation symbol which
enables all the internal debugging code in Perl. There are a whole bunch
of things you can debug with this: L<perlrun> lists them all, and the
best way to find out about them is to play about with them. The most
useful options are probably

    l  Context (loop) stack processing
    t  Trace execution
    o  Method and overloading resolution
    c  String/numeric conversions

Some of the functionality of the debugging code can be achieved using XS
modules.


    -Dr => use re 'debug'
    -Dx => use O 'Debug'

=head2 Using a source-level debugger

If the debugging output of C<-D> doesn't help you, it's time to step
through perl's execution with a source-level debugger.

=over 3

=item *

We'll use C<gdb> for our examples here; the principles will apply to any
debugger, but check the manual of the one you're using.

=back

To fire up the debugger, type

    gdb ./perl

You'll want to do that in your Perl source tree so the debugger can read
the source code. You should see the copyright message, followed by the
prompt.

    (gdb)

C<help> will get you into the documentation, but here are the most
useful commands:

=over 3

=item run [args]

Run the program with the given arguments.

=item break function_name

=item break source.c:xxx

Tells the debugger that we'll want to pause execution when we reach
either the named function (but see L<perlguts/Internal Functions>!) or the given
line in the named source file.

=item step

Steps through the program a line at a time.

=item next

Steps through the program a line at a time, without descending into
functions.

=item continue

Run until the next breakpoint.

=item finish

Run until the end of the current function, then stop again.

=item 'enter'

Just pressing Enter will do the most recent operation again - it's a
blessing when stepping through miles of source code.

=item print

Execute the given C code and print its results. B<WARNING>: Perl makes
heavy use of macros, and F<gdb> is not aware of macros. You'll have to
substitute them yourself. So, for instance, you can't say

    print SvPV_nolen(sv)

but you have to say

    print Perl_sv_2pv_nolen(sv)

You may find it helpful to have a "macro dictionary", which you can
produce by saying C<cpp -dM perl.c | sort>. Even then, F<cpp> won't
recursively apply the macros for you.

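
The macro problem can be reproduced in miniature. The following is a toy
sketch, not Perl source: C<toy_sv_len>, C<TOY_LEN> and C<toy_check> are
hypothetical names invented for illustration, standing in for a real
function and the macro that wraps it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for illustration only; these are not Perl's
 * real definitions. */

/* The real function: it is compiled into the binary, so a debugger can
 * call it by name. */
size_t toy_sv_len(const char *pv)
{
    return strlen(pv);
}

/* The spelling most source code uses. The preprocessor replaces it
 * before compilation, so the compiler proper only ever sees a call to
 * toy_sv_len. */
#define TOY_LEN(pv) toy_sv_len(pv)

/* Both spellings must give the same answer. */
void toy_check(void)
{
    assert(TOY_LEN("6XXXX") == toy_sv_len("6XXXX"));
    assert(TOY_LEN("6XXXX") == 5);
}
```

At a debugger prompt a call to C<toy_sv_len> can be evaluated because the
function exists in the binary, whereas C<TOY_LEN> exists only in the
preprocessor's view of the source. That is the same situation as
C<SvPV_nolen> and C<Perl_sv_2pv_nolen> above.
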

=back

=head2 Dumping Perl Data Structures

One way to get around this macro hell is to use the dumping functions in
F<dump.c>; these work a little like an internal
L<Devel::Peek|Devel::Peek>, but they also cover OPs and other structures
that you can't get at from Perl. Let's take an example. We'll use the
C<$a = $b + $c> we used before, but give it a bit of context:
C<$b = "6XXXX"; $c = 2.3;>. Where's a good place to stop and poke around?

What about C<pp_add>, the function we examined earlier to implement the
C<+> operator:

    (gdb) break Perl_pp_add
    Breakpoint 1 at 0x46249f: file pp_hot.c, line 309.

Notice we use C<Perl_pp_add> and not C<pp_add> - see L<perlguts/Internal Functions>.
With the breakpoint in place, we can run our program:

    (gdb) run -e '$b = "6XXXX"; $c = 2.3; $a = $b + $c'

Lots of junk will go past as gdb reads in the relevant source files and
libraries, and then:

    Breakpoint 1, Perl_pp_add () at pp_hot.c:309
    309         dSP; dATARGET; tryAMAGICbin(add,opASSIGN);
    (gdb) step
    311           dPOPTOPnnrl_ul;
    (gdb)

We looked at this bit of code before, and we said that C<dPOPTOPnnrl_ul>
arranges for two C<NV>s to be placed into C<left> and C<right> - let's
slightly expand it:

    #define dPOPTOPnnrl_ul  NV right = POPn; \
                            SV *leftsv = TOPs; \
                            NV left = USE_LEFT(leftsv) ? SvNV(leftsv) : 0.0

C<POPn> takes the SV from the top of the stack and obtains its NV either
directly (if C<SvNOK> is set) or by calling the C<sv_2nv> function.
C<TOPs> takes the next SV from the top of the stack - yes, C<POPn> uses
C<TOPs> - but doesn't remove it. We then use C<SvNV> to get the NV from
C<leftsv> in the same way as before - yes, C<POPn> uses C<SvNV>.

Since we don't have an NV for C<$b>, we'll have to use C<sv_2nv> to
convert it.
If we step again, we'll find ourselves there:

    Perl_sv_2nv (sv=0xa0675d0) at sv.c:1669
    1669        if (!sv)
    (gdb)

We can now use C<Perl_sv_dump> to investigate the SV:

    SV = PV(0xa057cc0) at 0xa0675d0
    REFCNT = 1
    FLAGS = (POK,pPOK)
    PV = 0xa06a510 "6XXXX"\0
    CUR = 5
    LEN = 6
    $1 = void

We know we're going to get C<6> from this, so let's finish the
subroutine:

    (gdb) finish
    Run till exit from #0  Perl_sv_2nv (sv=0xa0675d0) at sv.c:1671
    0x462669 in Perl_pp_add () at pp_hot.c:311
    311           dPOPTOPnnrl_ul;

We can also dump out this op: the current op is always stored in
C<PL_op>, and we can dump it with C<Perl_op_dump>. This'll give us
similar output to L<B::Debug|B::Debug>.

    {
    13  TYPE = add  ===> 14
        TARG = 1
        FLAGS = (SCALAR,KIDS)
        {
            TYPE = null  ===> (12)
              (was rv2sv)
            FLAGS = (SCALAR,KIDS)
            {
    11          TYPE = gvsv  ===> 12
                FLAGS = (SCALAR)
                GV = main::b
            }
        }

< finish this later >

=head2 Patching

All right, we've now had a look at how to navigate the Perl sources and
some things you'll need to know when fiddling with them. Let's now get
on and create a simple patch. Here's something Larry suggested: if a
C<U> is the first active format during a C<pack>, (for example,
C<pack "U3C8", @stuff>) then the resulting string should be treated as
UTF8 encoded.

How do we prepare to fix this up? First we locate the code in question -
the C<pack> happens at runtime, so it's going to be in one of the F<pp>
files. Sure enough, C<pp_pack> is in F<pp.c>. Since we're going to be
altering this file, let's copy it to F<pp.c~>.

Now let's look over C<pp_pack>: we take a pattern into C<pat>, and then
loop over the pattern, taking each format character in turn into