=head1 NAME

perlport - Writing portable Perl

=head1 DESCRIPTION

Perl runs on numerous operating systems.  While most of them share
much in common, they also have their own unique features.

This document is meant to help you to find out what constitutes portable
Perl code.  That way once you make a decision to write portably,
you know where the lines are drawn, and you can stay within them.

There is a tradeoff between taking full advantage of one particular
type of computer and taking advantage of a full range of them.
Naturally, as you broaden your range and become more diverse, the
common factors drop, and you are left with an increasingly smaller
area of common ground in which you can operate to accomplish a
particular task.  Thus, when you begin attacking a problem, it is
important to consider under which part of the tradeoff curve you
want to operate.  Specifically, you must decide whether it is
important that the task that you are coding have the full generality
of being portable, or whether to just get the job done right now.
This is the hardest choice to be made.  The rest is easy, because
Perl provides many choices, whichever way you want to approach your
problem.

Looking at it another way, writing portable code is usually about
willfully limiting your available choices.  Naturally, it takes
discipline and sacrifice to do that.  The product of portability
and convenience may be a constant.  You have been warned.

Be aware of two important points:

=over 4

=item Not all Perl programs have to be portable

There is no reason you should not use Perl as a language to glue Unix
tools together, or to prototype a Macintosh application, or to manage the
Windows registry.  If it makes no sense to aim for portability for one
reason or another in a given program, then don't bother.

=item Nearly all of Perl already I<is> portable

Don't be fooled into thinking that it is hard to create portable Perl
code.  It isn't.
Perl tries its level-best to bridge the gaps between
what's available on different platforms, and all the means available to
use those features.  Thus almost all Perl code runs on any machine
without modification.  But there are some significant issues in
writing portable code, and this document is entirely about those issues.

=back

Here's the general rule: When you approach a task commonly done
using a whole range of platforms, think about writing portable
code.  That way, you don't sacrifice much by way of the implementation
choices you can avail yourself of, and at the same time you can give
your users lots of platform choices.  On the other hand, when you have to
take advantage of some unique feature of a particular platform, as is
often the case with systems programming (whether for Unix, Windows,
S<Mac OS>, VMS, etc.), consider writing platform-specific code.

When the code will run on only two or three operating systems, you
may need to consider only the differences of those particular systems.
The important thing is to decide where the code will run and to be
deliberate in your decision.

The material below is separated into three main sections: main issues of
portability (L<"ISSUES">), platform-specific issues (L<"PLATFORMS">), and
built-in perl functions that behave differently on various ports
(L<"FUNCTION IMPLEMENTATIONS">).

This information should not be considered complete; it includes possibly
transient information about idiosyncrasies of some of the ports, almost
all of which are in a state of constant evolution.  Thus, this material
should be considered a perpetual work in progress
(<IMG SRC="yellow_sign.gif" ALT="Under Construction">).

=head1 ISSUES

=head2 Newlines

In most operating systems, lines in files are terminated by newlines.
Just what is used as a newline may vary from OS to OS.
Unix traditionally uses C<\012>, one type of DOSish I/O uses C<\015\012>,
and S<Mac OS> uses C<\015>.

Perl uses C<\n> to represent the "logical" newline, where what is
logical may depend on the platform in use.  In MacPerl, C<\n> always
means C<\015>.  In DOSish perls, C<\n> usually means C<\012>, but
when accessing a file in "text" mode, STDIO translates it to (or
from) C<\015\012>, depending on whether you're reading or writing.
Unix does the same thing on ttys in canonical mode.  C<\015\012>
is commonly referred to as CRLF.

A common cause of unportable programs is the misuse of chop() to trim
newlines:

    # XXX UNPORTABLE!
    while(<FILE>) {
        chop;
        @array = split(/:/);
        #...
    }

You can get away with this on Unix and S<Mac OS> (they have a single
character end-of-line), but the same program will break under DOSish
perls because you're only chop()ing half the end-of-line.  Instead,
chomp() should be used to trim newlines.  The Dunce::Files module can
help audit your code for misuses of chop().

When dealing with binary files (or text files in binary mode) be sure
to explicitly set $/ to the appropriate value for your file format
before using chomp().

Because of the "text" mode translation, DOSish perls have limitations
in using C<seek> and C<tell> on a file accessed in "text" mode.
Stick to C<seek>-ing to locations you got from C<tell> (and no
others), and you are usually free to use C<seek> and C<tell> even
in "text" mode.  Using C<seek> or C<tell> or other file operations
may be non-portable.  If you use C<binmode> on a file, however, you
can usually C<seek> and C<tell> with arbitrary values in safety.

A common misconception in socket programming is that C<\n> eq C<\012>
everywhere.  When using protocols such as common Internet protocols,
C<\012> and C<\015> are called for specifically, and the values of
the logical C<\n> and C<\r> (carriage return) are not reliable.

    print SOCKET "Hi there, client!\r\n";      # WRONG
    print SOCKET "Hi there, client!\015\012";  # RIGHT

However, using C<\015\012> (or C<\cM\cJ>, or C<\x0D\x0A>) can be tedious
and unsightly, as well as confusing to those maintaining the code.  As
such, the Socket module supplies the Right Thing for those who want it.

    use Socket qw(:DEFAULT :crlf);
    print SOCKET "Hi there, client!$CRLF";     # RIGHT

When reading from a socket, remember that the default input record
separator C<$/> is C<\n>, but robust socket code will recognize
either C<\012> or C<\015\012> as end of line:

    while (<SOCKET>) {
        # ...
    }

Because both CRLF and LF end in LF, the input record separator can
be set to LF and any CR stripped later.  Better to write:

    use Socket qw(:DEFAULT :crlf);
    local($/) = LF;      # not needed if $/ is already \012

    while (<SOCKET>) {
        s/$CR?$LF/\n/;   # not sure if socket uses LF or CRLF, OK
    #   s/\015?\012/\n/; # same thing
    }

This example is preferred over the previous one--even for Unix
platforms--because now any C<\015>'s (C<\cM>'s) are stripped out
(and there was much rejoicing).

Similarly, functions that return text data--such as a function that
fetches a web page--should sometimes translate newlines before
returning the data, if they've not yet been translated to the local
newline representation.  A single line of code will often suffice:

    $data =~ s/\015?\012/\n/g;
    return $data;

Some of this may be confusing.  Here's a handy reference to the ASCII CR
and LF characters.  You can print it out and stick it in your wallet.

    LF  ==  \012  ==  \x0A  ==  \cJ  ==  ASCII 10
    CR  ==  \015  ==  \x0D  ==  \cM  ==  ASCII 13

             | Unix | DOS  | Mac  |
        ---------------------------
        \n   |  LF  |  LF  |  CR  |
        \r   |  CR  |  CR  |  LF  |
        \n * |  LF  | CRLF |  CR  |
        \r * |  CR  |  CR  |  LF  |
        ---------------------------
        * text-mode STDIO

The Unix column assumes that you are not accessing a serial line
(like a tty) in canonical mode.  If you are, then CR on input becomes
"\n", and "\n" on output becomes CRLF.

These are just the most common definitions of C<\n> and C<\r> in Perl.
There may well be others.

=head2 Numbers endianness and Width

Different CPUs store integers and floating point numbers in different
orders (called I<endianness>) and widths (32-bit and 64-bit being the
most common today).  This affects your programs when they attempt to transfer
numbers in binary format from one CPU architecture to another,
usually either "live" via network connection, or by storing the
numbers to secondary storage such as a disk file or tape.

Conflicting storage orders make an utter mess out of the numbers.  If a
little-endian host (Intel, VAX) stores 0x12345678 (305419896 in
decimal), a big-endian host (Motorola, Sparc, PA) reads it as
0x78563412 (2018915346 in decimal).  Alpha and MIPS can be either:
Digital/Compaq used/uses them in little-endian mode; SGI/Cray uses
them in big-endian mode.  To avoid this problem in network (socket)
connections use the C<pack> and C<unpack> formats C<n> and C<N>, the
"network" orders.  These are guaranteed to be portable.

You can explore the endianness of your platform by unpacking a
data structure packed in native format such as:

    print unpack("h*", pack("s2", 1, 2)), "\n";
    # '10002000' on e.g. Intel x86 or Alpha 21064 in little-endian mode
    # '00100020' on e.g. Motorola 68040

If you need to distinguish between endian architectures you could use
either of the variables set like so:

    $is_big_endian    = unpack("h*", pack("s", 1)) =~ /01/;
    $is_little_endian = unpack("h*", pack("s", 1)) =~ /^1/;

Differing widths can cause truncation even between platforms of equal
endianness.  The platform of shorter width loses the upper parts of the
number.  There is no good solution for this problem except to avoid
transferring or storing raw binary numbers.

One can circumvent both these problems in two ways.  Either
transfer and store numbers always in text format, instead of raw
binary, or else consider using modules like Data::Dumper (included in
the standard distribution as of Perl 5.005) and Storable (included as
of perl 5.8).  Keeping all data as text significantly simplifies matters.

=head2 Files and Filesystems

Most platforms these days structure files in a hierarchical fashion.
So, it is reasonably safe to assume that all platforms support the
notion of a "path" to uniquely identify a file on the system.  How
that path is really written, though, differs considerably.

Although similar, file path specifications differ between Unix,
Windows, S<Mac OS>, OS/2, VMS, VOS, S<RISC OS>, and probably others.
Unix, for example, is one of the few OSes that has the elegant idea
of a single root directory.

DOS, OS/2, VMS, VOS, and Windows can work similarly to Unix with C</>
as path separator, or in their own idiosyncratic ways (such as having
several root directories and various "unrooted" device files such as NIL:
and LPT:).

S<Mac OS> uses C<:> as a path separator instead of C</>.

The filesystem may support neither hard links (C<link>) nor
symbolic links (C<symlink>, C<readlink>, C<lstat>).

The filesystem may support neither access timestamp nor change
timestamp (meaning that about the only portable timestamp is the
modification timestamp), or one second granularity of any timestamps
(e.g.
the FAT filesystem limits the time granularity to two seconds).

VOS perl can emulate Unix filenames with C</> as path separator.  The
native pathname characters greater-than, less-than, number-sign, and
percent-sign are always accepted.

S<RISC OS> perl can emulate Unix filenames with C</> as path
separator, or go native and use C<.> for path separator and C<:> to
signal filesystems and disk names.

If all this is intimidating, have no (well, maybe only a little)
fear.  There are modules that can help.  The File::Spec modules
provide methods to do the Right Thing on whatever platform happens
to be running the program.

    use File::Spec::Functions;
    chdir(updir());        # go up one directory
    $file = catfile(curdir(), 'temp', 'file.txt');
    # on Unix and Win32, './temp/file.txt'
    # on Mac OS, ':temp:file.txt'
    # on VMS, '[.temp]file.txt'

File::Spec is available in the standard distribution as of version
5.004_05.  File::Spec::Functions is only in File::Spec 0.7 and later,
and some versions of perl come with version 0.6.  If File::Spec
is not updated to 0.7 or later, you must use the object-oriented
interface from File::Spec (or upgrade File::Spec).

In general, production code should not have file paths hardcoded.
Making them user-supplied or read from a configuration file is
better, keeping in mind that file path syntax varies on different
machines.

This is especially noticeable in scripts like Makefiles and test suites,
which often assume C</> as a path separator for subdirectories.

Also of use is File::Basename from the standard distribution, which
splits a pathname into pieces (base filename, full path to directory,
and file suffix).

Even when on a single platform (if you can call Unix a single platform),
remember not to count on the existence or the contents of particular
system-specific files or directories, like F</etc/passwd>,
F</etc/sendmail.conf>, F</etc/resolv.conf>, or even F</tmp/>.
For example, F</etc/passwd> may exist but not contain the encrypted
passwords, because the system is using some form of enhanced security.
Or it may not contain all the accounts, because the system is using NIS.
If code does need to rely on such a file, include a description of the
file and its format in the code's documentation, then make it easy for
the user to override the default location of the file.

Don't assume a text file will end with a newline.  They should,
but people forget.

Do not have two files of the same name with different case, like
F<test.pl> and F<Test.pl>, as many platforms have case-insensitive
filenames.  Also, try not to have non-word characters (except for C<.>)
in the names, and keep them to the 8.3 convention, for maximum
portability, onerous a burden though this may appear.

Likewise, when using the AutoSplit module, try to keep your functions to
8.3 naming and case-insensitive conventions; or, at the least,
make it so the resulting files have a unique (case-insensitively)
first 8 characters.

Whitespace in filenames is tolerated on most systems, but not all.

Many systems (DOS, VMS) cannot have more than one C<.> in their filenames.

Don't assume C<< > >> won't be the first character of a filename.
Always use C<< < >> explicitly to open a file for reading,
unless you want the user to be able to specify a pipe open.

    open(FILE, "< $existing_file") or die $!;

If filenames might use strange characters, it is safest to open them
with C<sysopen> instead of C<open>.  C<open> is magic and can
translate characters like C<< > >>, C<< < >>, and C<|>, which may
be the wrong thing to do.  (Sometimes, though, it's the right thing.)

=head2 System Interaction

Not all platforms provide a command line.  These are usually platforms
that rely primarily on a Graphical User Interface (GUI) for user
interaction.  A program requiring a command line interface might
not work everywhere.
This is probably for the user of the program
to deal with, so don't stay up late worrying about it.

Some platforms can't delete or rename files held open by the system.
Remember to C<close> files when you are done with them.  Don't
C<unlink> or C<rename> an open file.  Don't C<tie> or C<open> a
file already tied or opened; C<untie> or C<close> it first.

Don't open the same file more than once at a time for writing, as some
operating systems put mandatory locks on such files.

Don't count on a specific environment variable existing in C<%ENV>.
Don't count on C<%ENV> entries being case-sensitive, or even
case-preserving.  Don't try to clear C<%ENV> by saying C<%ENV = ();>, or,
if you really have to, make it conditional on C<$^O ne 'VMS'> since in
VMS the C<%ENV> table is much more than a per-process key-value string
table.

Don't count on signals or C<%SIG> for anything.

Don't count on filename globbing.  Use C<opendir>, C<readdir>, and
C<closedir> instead.

Don't count on per-program environment variables, or per-program current
directories.

Don't count on specific values of C<$!>.

=head2 Interprocess Communication (IPC)

In general, don't directly access the system in code meant to be
portable.  That means, no C<system>, C<exec>, C<fork>, C<pipe>,
C<``>, C<qx//>, C<open> with a C<|>, nor any of the other things
that make being a perl hacker worth being.

Commands that launch external processes are generally supported on
most platforms (though many of them do not support any type of
forking).  The problem with using them arises from what you invoke
them on.  External tools are often named differently on different
platforms, may not be available in the same location, might accept
