=head1 NAME

perlcompile - Introduction to the Perl Compiler-Translator

=head1 DESCRIPTION

Perl has always had a compiler: your source is compiled into an
internal form (a parse tree) which is then optimized before being
run.  Since version 5.005, Perl has shipped with a module
capable of inspecting the optimized parse tree (C<B>), and this has
been used to write many useful utilities, including a module that lets
you turn your Perl into C source code that can be compiled into a
native executable.

The C<B> module provides access to the parse tree, and other modules
("back ends") do things with the tree.  Some write it out as
bytecode, C source code, or semi-human-readable text.  Another
traverses the parse tree to build a cross-reference of which
subroutines, formats, and variables are used where.  Another checks
your code for dubious constructs.  Yet another back end dumps the
parse tree back out as Perl source, acting as a source code beautifier
or deobfuscator.

Because its original purpose was to be a way to produce C code
corresponding to a Perl program, and in turn a native executable, the
C<B> module and its associated back ends are known as "the
compiler", even though they don't really compile anything.
Different parts of the compiler are more accurately a "translator",
or an "inspector", but people want Perl to have a "compiler
option" not an "inspector gadget".  What can you do?

This document covers the use of the Perl compiler: which modules
it comprises, how to use the most important of the back end modules,
what problems there are, and how to work around them.

=head2 Layout

The compiler back ends are in the C<B::> hierarchy, and the front-end
(the module that you, the user of the compiler, will sometimes
interact with) is the O module.
Some back ends (e.g., C<B::C>) have
programs (e.g., I<perlcc>) to hide the modules' complexity.

Here are the important back ends to know about, with their status
expressed as a number from 0 (outline for later implementation) to
10 (if there's a bug in it, we're very surprised):

=over 4

=item B::Bytecode

Stores the parse tree in a machine-independent format, suitable
for later reloading through the ByteLoader module.  Status: 5 (some
things work, some things don't, some things are untested).

=item B::C

Creates a C source file containing code to rebuild the parse tree
and resume the interpreter.  Status: 6 (many things work adequately,
including programs using Tk).

=item B::CC

Creates a C source file corresponding to the run time code path in
the parse tree.  This is the closest to a Perl-to-C translator there
is, but the code it generates is almost incomprehensible because it
translates the parse tree into a giant switch structure that
manipulates Perl structures.  The eventual goal is to reduce (given
sufficient type information in the Perl program) some of the
Perl data structure manipulations into manipulations of C-level
ints, floats, etc.  Status: 5 (some things work, including
uncomplicated Tk examples).

=item B::Lint

Complains if it finds dubious constructs in your source code.  Status:
6 (it works adequately, but only has a very limited number of areas
that it checks).

=item B::Deparse

Recreates the Perl source, making an attempt to format it coherently.
Status: 8 (it works nicely, but a few obscure things are missing).

=item B::Xref

Reports on the declaration and use of subroutines and variables.
Status: 8 (it works nicely, but still has a few lingering bugs).

=back

=head1 Using The Back Ends

The following sections describe how to use the various compiler back
ends.
They're presented roughly in order of maturity, so that the
most stable and proven back ends are described first, and the most
experimental and incomplete back ends are described last.

The O module automatically enables the B<-c> flag to Perl, which
prevents Perl from executing your code once it has been compiled.
This is why all the back ends print:

  myperlprogram syntax OK

before producing any other output.

=head2 The Cross Referencing Back End

The cross referencing back end (B::Xref) produces a report on your program,
breaking down declarations and uses of subroutines and variables (and
formats) by file and subroutine.  For instance, here's part of the
report from the I<pod2man> program that comes with Perl:

  Subroutine clear_noremap
    Package (lexical)
      $ready_to_print   i1069, 1079
    Package main
      $&                1086
      $.                1086
      $0                1086
      $1                1087
      $2                1085, 1085
      $3                1085, 1085
      $ARGV             1086
      %HTML_Escapes     1085, 1085

This shows the variables used in the subroutine C<clear_noremap>.  The
variable C<$ready_to_print> is a my() (lexical) variable,
B<i>ntroduced (first declared with my()) on line 1069, and used on
line 1079.  The variable C<$&> from the main package is used on 1086,
and so on.

A line number may be prefixed by a single letter:

=over 4

=item i

Lexical variable introduced (declared with my()) for the first time.

=item &

Subroutine or method call.

=item s

Subroutine defined.

=item r

Format defined.

=back

The most useful option the cross referencer has is to save the report
to a separate file.  For instance, to save the report on
I<myperlprogram> to the file I<report>:

  $ perl -MO=Xref,-oreport myperlprogram

=head2 The Decompiling Back End

The Deparse back end turns your Perl source back into Perl source.  It
can reformat along the way, making it useful as a de-obfuscator.  The
most basic way to use it is:

  $ perl -MO=Deparse myperlprogram

You'll notice immediately that Perl has no idea of how to paragraph
your code.  You'll have to separate chunks of code from each other
with newlines by hand.
However, watch what it will do with
one-liners:

    $ perl -MO=Deparse -e '$op=shift||die "usage: $0 code [...]";chomp(@ARGV=<>)unless@ARGV;
    for(@ARGV){$was=$_;eval$op; die$@ if$@;
    rename$was,$_ unless$was eq $_}'
    -e syntax OK
    $op = shift @ARGV || die("usage: $0 code [...]");
    chomp(@ARGV = <ARGV>) unless @ARGV;
    foreach $_ (@ARGV) {
        $was = $_;
        eval $op;
        die $@ if $@;
        rename $was, $_ unless $was eq $_;
    }

The decompiler has several options for the code it generates.  For
instance, you can set the size of each indent from 4 (as above) to
2 with:

    $ perl -MO=Deparse,-si2 myperlprogram

The B<-p> option adds parentheses where normally they are omitted:

    $ perl -MO=Deparse -e 'print "Hello, world\n"'
    -e syntax OK
    print "Hello, world\n";
    $ perl -MO=Deparse,-p -e 'print "Hello, world\n"'
    -e syntax OK
    print("Hello, world\n");

See L<B::Deparse> for more information on the formatting options.

=head2 The Lint Back End

The lint back end (B::Lint) inspects programs for poor style.  One
programmer's bad style is another programmer's useful tool, so options
let you select what is complained about.

To run the style checker across your source code:

  $ perl -MO=Lint myperlprogram

To disable context checks and undefined subroutines:

  $ perl -MO=Lint,-context,-undefined-subs myperlprogram

See L<B::Lint> for information on the options.

=head2 The Simple C Back End

This module saves the internal compiled state of your Perl program
to a C source file, which can be turned into a native executable
for that particular platform using a C compiler.  The resulting
program links against the Perl interpreter library, so it
will not save you disk space (unless you build Perl with a shared
library) or program size.  It may, however, save you startup time.

The C<perlcc> tool generates such executables by default.

  perlcc myperlprogram.pl

=head2 The Bytecode Back End

This back end is only useful if you also have a way to load and
execute the bytecode that it produces.
The ByteLoader module provides
this functionality.

To turn a Perl program into executable byte code, you can use C<perlcc>
with the C<-b> switch:

  perlcc -b myperlprogram.pl

The byte code is machine independent, so once you have a compiled
module or program, it is as portable as Perl source (assuming that
the user of the module or program has a modern-enough Perl interpreter
to decode the byte code).

See L<B::Bytecode> for information on options to control the
optimization and nature of the code generated by the Bytecode module.

=head2 The Optimized C Back End

The optimized C back end will turn your Perl program's run time
code-path into an equivalent (but optimized) C program that manipulates
the Perl data structures directly.  The program will still link against
the Perl interpreter library, to allow for eval(), C<s///e>,
C<require>, etc.

The C<perlcc> tool generates such executables when using the C<-opt>
switch.  To compile a Perl program (ending in C<.pl>
or C<.p>):

  perlcc -opt myperlprogram.pl

To produce a shared library from a Perl module (ending in C<.pm>):

  perlcc -opt Myperlmodule.pm

For more information, see L<perlcc> and L<B::CC>.

=head1 Module List for the Compiler Suite

=over 4

=item B

This module is the introspective ("reflective" in Java terms)
module, which allows a Perl program to inspect its innards.  The
back end modules all use this module to gain access to the compiled
parse tree.  You, the user of a back end module, will not need to
interact with B.

=item O

This module is the front-end to the compiler's back ends.  Normally
called something like this:

  $ perl -MO=Deparse myperlprogram

This is like saying C<use O 'Deparse'> in your Perl program.

=item B::Asmdata

This module is used by the B::Assembler module, which is in turn used
by the B::Bytecode module, which stores a parse-tree as
bytecode for later loading.  It's not a back end itself, but rather a
component of a back end.

=item B::Assembler

This module turns a parse-tree into data suitable for storing
and later decoding back into a parse-tree.
It's not a back end
itself, but rather a component of a back end.  It's used by the
I<assemble> program that produces bytecode.

=item B::Bblock

This module is used by the B::CC back end.  It walks "basic blocks".
A basic block is a series of operations which is known to execute from
start to finish, with no possibility of branching or halting.

=item B::Bytecode

This module is a back end that generates bytecode from a
program's parse tree.  This bytecode is written to a file, from where
it can later be reconstructed back into a parse tree.  The goal is to
do the expensive program compilation once, save the interpreter's
state into a file, and then restore the state from the file when the
program is to be executed.  See L</"The Bytecode Back End">
for details about usage.

=item B::C

This module writes out C code corresponding to the parse tree and
other interpreter internal structures.  You compile the corresponding
C file, and get an executable file that will restore the internal
structures and the Perl interpreter will begin running the
program.  See L</"The Simple C Back End"> for details about usage.

=item B::CC

This module writes out C code corresponding to your program's
operations.  Unlike the B::C module, which merely stores the
interpreter and its state in a C program, the B::CC module makes a
C program that does not involve the interpreter.  As a consequence,
programs translated into C by B::CC can execute faster than normal
interpreted programs.  See L</"The Optimized C Back End"> for
details about usage.

=item B::Debug

This module dumps the Perl parse tree in verbose detail to STDOUT.
It's useful for people who are writing their own back end, or who
are learning about the Perl internals.  It's not useful to the
average programmer.

=item B::Deparse

This module produces Perl source code from the compiled parse tree.
It is useful in debugging and deconstructing other people's code,
and also as a pretty-printer for your own source.
See
L</"The Decompiling Back End"> for details about usage.

=item B::Disassembler

This module turns bytecode back into a parse tree.  It's not a back
end itself, but rather a component of a back end.  It's used by the
I<disassemble> program that comes with the bytecode.

=item B::Lint

This module inspects the compiled form of your source code for things
which, while some people frown on them, aren't necessarily bad enough
to justify a warning.  For instance, use of an array in scalar context
without explicitly saying C<scalar(@array)> is something that Lint
can identify.  See L</"The Lint Back End"> for details about usage.

=item B::Showlex

This module prints out the my() variables used in a function or a
file.  To get a list of the my() variables used in the subroutine
mysub() defined in the file myperlprogram:

  $ perl -MO=Showlex,mysub myperlprogram

To get a list of the my() variables used in the file myperlprogram:

  $ perl -MO=Showlex myperlprogram

[BROKEN]

=item B::Stackobj

This module is used by the B::CC module.  It's not a back end itself,
but rather a component of a back end.

=item B::Stash

This module is used by the L<perlcc> program, which compiles a module
into an executable.  B::Stash prints the symbol tables in use by a
program, and is used to prevent B::CC from producing C code for the
B::* and O modules.  It's not a back end itself, but rather a
component of a back end.

=item B::Terse

This module prints the contents of the parse tree, but without as much
information as B::Debug.  For comparison, C<print "Hello, world.">
produced 96 lines of output from B::Debug, but only 6 from B::Terse.

This module is useful for people who are writing their own back end,
or who are learning about the Perl internals.  It's not useful to the
average programmer.

=item B::Xref

This module prints a report on where the variables, subroutines, and
formats are defined and used within a program and the modules it
loads.
See L</"The Cross Referencing Back End"> for details about
usage.

=back

=head1 KNOWN PROBLEMS

The simple C back end currently only saves typeglobs with alphanumeric
names.

The optimized C back end outputs code for more modules than it should
(e.g., DirHandle).  It also has little hope of properly handling
C<goto LABEL> outside the running subroutine (C<goto &sub> is okay).
C<goto LABEL> currently does not work at all in this back end.
It also creates a huge initialization function that gives
C compilers headaches.  Splitting the initialization function gives
better results.  Other problems include: unsigned math does not
work correctly; some opcodes are handled incorrectly by the default
opcode handling mechanism.

BEGIN{} blocks are executed while compiling your code.  Any external
state that is initialized in BEGIN{}, such as opening files, initiating
database connections etc., does not behave properly.  To work around
this, Perl has an INIT{} block that corresponds to code being executed
before your program begins running but after your program has finished
being compiled.  Execution order: BEGIN{}, (possible save of state
through compiler back end), INIT{}, program runs, END{}.

=head1 AUTHOR

This document was originally written by Nathan Torkington, and is now
maintained by the perl5-porters mailing list
I<perl5-porters@perl.org>.

=cut