.\" perlpodspec.1
pointing guillemet\*(R". (These look like little \*(L"<<\*(R" and \*(L">>\*(R", and they
are now preferably expressed with the \s-1HTML/XHTML\s0 codes \*(L"E<laquo>\*(R"
and \*(L"E<raquo>\*(R".)
.IP "\(bu" 4
Pod parsers should understand all \*(L"E<html>\*(R" codes as defined
in the entity declarations in the most recent \s-1XHTML\s0 specification at
\&\f(CW\*(C`www.W3.org\*(C'\fR. Pod parsers must understand at least the entities
that define characters in the range 160\-255 (Latin\-1). Pod parsers,
when faced with some unknown \*(L"E<\fIidentifier\fR>\*(R" code,
shouldn't simply replace it with the null string (by default, at least),
but may pass it through as a string consisting of the literal characters
E, less-than, \fIidentifier\fR, greater-than. Or Pod parsers may offer the
alternative option of processing such unknown
\&\*(L"E<\fIidentifier\fR>\*(R" codes by firing an event especially
for such codes, or by adding a special node-type to the in-memory
document tree. Such \*(L"E<\fIidentifier\fR>\*(R" may have special meaning
to some processors, or some processors may choose to add them to
a special error report.
.IP "\(bu" 4
Pod parsers must also support the \s-1XHTML\s0 codes \*(L"E<quot>\*(R" for
character 34 (doublequote, "), \*(L"E<amp>\*(R" for character 38
(ampersand, &), and \*(L"E<apos>\*(R" for character 39 (apostrophe, ').
.IP "\(bu" 4
Note that in all cases of \*(L"E<whatever>\*(R", \fIwhatever\fR (whether
an htmlname, or a number in any base) must consist only of
alphanumeric characters \*(-- that is, \fIwhatever\fR must match
\&\f(CW\*(C`m/\eA\ew+\ez/\*(C'\fR. So \*(L"E< 0 1 2 3 >\*(R" is invalid, because
it contains spaces, which aren't alphanumeric characters. This
presumably does not \fIneed\fR special treatment by a Pod processor;
\&\*(L" 0 1 2 3 \*(R" doesn't look like a number in any base, so it would
presumably be looked up in the table of HTML-like names. Since
there isn't (and cannot be) an HTML-like entity called \*(L" 0 1 2 3 \*(R",
this will be treated as an error.
However, Pod processors may
treat \*(L"E< 0 1 2 3 >\*(R" or \*(L"E<e\-acute>\*(R" as \fIsyntactically\fR
invalid, potentially earning a different error message than the
error message (or warning, or event) generated by a merely unknown
(but theoretically valid) htmlname, as in \*(L"E<qacute>\*(R"
[sic]. However, Pod parsers are not required to make this
distinction.
.IP "\(bu" 4
Note that E<number> \fImust not\fR be interpreted as simply
\&\*(L"codepoint \fInumber\fR in the current/native character set\*(R". It always
means only \*(L"the character represented by codepoint \fInumber\fR in
Unicode.\*(R" (This is identical to the semantics of &#\fInumber\fR; in \s-1XML\s0.)
.Sp
This will likely require many formatters to have tables mapping from
treatable Unicode codepoints (such as the \*(L"\exE9\*(R" for the e\-acute
character) to the escape sequences or codes necessary for conveying
such sequences in the target output format. A converter to *roff
would, for example, know that \*(L"\exE9\*(R" (whether conveyed literally, or via
an E<...> sequence) is to be conveyed as \*(L"e\e\e*'\*(R".
Similarly, a program rendering Pod in a Mac \s-1OS\s0 application window would
presumably need to know that \*(L"\exE9\*(R" maps to codepoint 142 in the MacRoman
encoding that (at time of writing) is native for Mac \s-1OS\s0. Such
Unicode2whatever mappings are presumably already widely available for
common output formats. (Such mappings may be incomplete! Implementers
are not expected to bend over backwards in an attempt to render
Cherokee syllabics, Etruscan runes, Byzantine musical symbols, or any
of the other weird things that Unicode can encode.)
And
if a Pod document uses a character not found in such a mapping, the
formatter should consider it an unrenderable character.
.IP "\(bu" 4
If, surprisingly, the implementor of a Pod formatter can't find a
satisfactory pre-existing table mapping from Unicode characters to
escapes in the target format (e.g., a decent table of Unicode
characters to *roff escapes), it will be necessary to build such a
table. If you are in this circumstance, you should begin with the
characters in the range 0x00A0 \- 0x00FF, which are mostly the heavily
used accented characters. Then proceed (as patience permits and
fastidiousness compels) through the characters that the (X)HTML
standards groups judged important enough to merit mnemonics
for. These are declared in the (X)HTML specifications at the
www.W3.org site. At time of writing (September 2001), the most recent
entity declaration files are:
.Sp
.Vb 3
\&  http://www.w3.org/TR/xhtml1/DTD/xhtml\-lat1.ent
\&  http://www.w3.org/TR/xhtml1/DTD/xhtml\-special.ent
\&  http://www.w3.org/TR/xhtml1/DTD/xhtml\-symbol.ent
.Ve
.Sp
Then you can progress through any remaining notable Unicode characters
in the range 0x2000\-0x204D (consult the character tables at
www.unicode.org), and whatever else strikes your fancy. For example,
in \fIxhtml\-symbol.ent\fR, there is the entry:
.Sp
.Vb 1
\&  <!ENTITY infin "&#8734;"> <!\-\- infinity, U+221E ISOtech \-\->
.Ve
.Sp
While the mapping from \*(L"infin\*(R" to the character \*(L"\ex{221E}\*(R" will (hopefully)
have been already handled by the Pod parser, the presence of the
character in this file means that it's important enough to
include in a formatter's table that maps from notable Unicode characters
to the codes necessary for rendering them.
So for a Unicode\-to\-*roff
mapping, for example, this would merit the entry:
.Sp
.Vb 1
\&  "\ex{221E}" => \*(Aq\e(in\*(Aq,
.Ve
.Sp
It is eagerly hoped that in the future, increasing numbers of formats
(and formatters) will support Unicode characters directly (as (X)HTML
does with \f(CW\*(C`&infin;\*(C'\fR, \f(CW\*(C`&#8734;\*(C'\fR, or \f(CW\*(C`&#x221E;\*(C'\fR), reducing the need
for idiosyncratic mappings of Unicode\-to\-\fImy_escapes\fR.
.IP "\(bu" 4
It is up to individual Pod formatters to display good judgement when
confronted with an unrenderable character (which is distinct from an
unknown E<thing> sequence that the parser couldn't resolve to
anything, renderable or not). It is good practice to map Latin letters
with diacritics (like \*(L"E<eacute>\*(R"/\*(L"E<233>\*(R") to the corresponding
unaccented US-ASCII letters (like a simple character 101, \*(L"e\*(R"), but
clearly this is often not feasible, and an unrenderable character may
be represented as \*(L"?\*(R", or the like. In attempting a sane fallback
(as from E<233> to \*(L"e\*(R"), Pod formatters may use the
\&\f(CW%Latin1Code_to_fallback\fR table in Pod::Escapes, or
Text::Unidecode, if available.
.Sp
For example, this Pod text:
.Sp
.Vb 1
\&  magic is enabled if you set C<$Currency> to \*(AqE<euro>\*(Aq.
.Ve
.Sp
may be rendered as:
\&\*(L"magic is enabled if you set \f(CW$Currency\fR to '\fI?\fR'\*(R" or as
\&\*(L"magic is enabled if you set \f(CW$Currency\fR to '\fB[euro]\fR'\*(R", or as
\&\*(L"magic is enabled if you set \f(CW$Currency\fR to '[x20AC]'\*(R", etc.
.Sp
A Pod formatter may also note, in a comment or warning, a list of the
unrenderable characters that were encountered.
.IP "\(bu" 4
E<...> may freely appear in any formatting code (other than
in another E<...> or in a Z<>).
That is, \*(L"X<The E<euro>1,000,000 Solution>\*(R" is valid, as is
\&\*(L"L<The E<euro>1,000,000 Solution|Million::Euros>\*(R".
.IP "\(bu" 4
Some Pod formatters output to formats that implement non-breaking
spaces as an individual character (which I'll call \*(L"\s-1NBSP\s0\*(R"), and
others output to formats that implement non-breaking spaces just as
spaces wrapped in a \*(L"don't break this across lines\*(R" code. Note that
at the level of Pod, both sorts of codes can occur: Pod can contain a
\&\s-1NBSP\s0 character (whether as a literal, or as a \*(L"E<160>\*(R" or
\&\*(L"E<nbsp>\*(R" code); and Pod can contain \*(L"S<foo I<bar> baz>\*(R"
codes, where \*(L"mere spaces\*(R" (character 32) in
such codes are taken to represent non-breaking spaces. Pod
parsers should consider supporting the optional parsing of
\&\*(L"S<foo I<bar> baz>\*(R" as if it were
\&\*(L"foo\fI\s-1NBSP\s0\fRI<bar>\fI\s-1NBSP\s0\fRbaz\*(R", and, going the other way, the
optional parsing of groups of words joined by \s-1NBSP\s0's as if each group
were in an S<...> code, so that formatters may use the
representation that maps best to what the output format demands.
.IP "\(bu" 4
Some processors may find that the \f(CW\*(C`S<...>\*(C'\fR code is easiest to
implement by replacing each space in the parse tree under the content
of the S with an \s-1NBSP\s0. But note: the replacement should apply \fInot\fR to
spaces in \fIall\fR text, but \fIonly\fR to spaces in \fIprintable\fR text. (This
distinction may or may not be evident in the particular tree/event
model implemented by the Pod parser.) For example, consider this
unusual case:
.Sp
.Vb 1
\&  S<L</Autoloaded Functions>>
.Ve
.Sp
This means that the space in the middle of the visible link text must
not be broken across lines.
In other words, it's the same as this:
.Sp
.Vb 1
\&  L<"AutoloadedE<160>Functions"/Autoloaded Functions>
.Ve
.Sp
However, a misapplied space-to-NBSP replacement could (wrongly)
produce something equivalent to this:
.Sp
.Vb 1
\&  L<"AutoloadedE<160>Functions"/AutoloadedE<160>Functions>
.Ve
.Sp
\&...which is almost certainly not going to work as a hyperlink (assuming
this formatter outputs a format supporting hypertext).
.Sp
Formatters may choose to just not support the S format code,
especially in cases where the output format simply has no \s-1NBSP\s0
character/code and no code for \*(L"don't break this stuff across lines\*(R".
.IP "\(bu" 4
Besides the \s-1NBSP\s0 character discussed above, implementors are reminded
of the existence of the other \*(L"special\*(R" character in Latin\-1, the
\&\*(L"soft hyphen\*(R" character, also known as \*(L"discretionary hyphen\*(R"
(i.e., \f(CW\*(C`E<173>\*(C'\fR = \f(CW\*(C`E<0xAD>\*(C'\fR =
\&\f(CW\*(C`E<shy>\*(C'\fR). This character expresses an optional hyphenation
point. That is, it normally renders as nothing, but may render as a
\&\*(L"\-\*(R" if a formatter breaks the word at that point. Pod formatters
should, as appropriate, do one of the following: 1) render this with
a code with the same meaning (e.g., \*(L"\e\-\*(R" in \s-1RTF\s0), 2) pass it through
in the expectation that the formatter understands this character as
such, or 3) delete it.
.Sp
For example:
.Sp
.Vb 3
\&  sigE<shy>action
\&  manuE<shy>script
\&  JarkE<shy>ko HieE<shy>taE<shy>nieE<shy>mi
.Ve
.Sp
These signal to a formatter that if it is to hyphenate \*(L"sigaction\*(R"
or \*(L"manuscript\*(R", then it should be done as
\&\*(L"sig\-\fI[linebreak]\fRaction\*(R" or \*(L"manu\-\fI[linebreak]\fRscript\*(R"
(and if it doesn't hyphenate it, then the \f(CW\*(C`E<shy>\*(C'\fR doesn't
show up at all).
And if it is
to hyphenate \*(L"Jarkko\*(R" and/or \*(L"Hietaniemi\*(R", it can do
so only at the points where there is a \f(CW\*(C`E<shy>\*(C'\fR code.
.Sp
In practice, it is anticipated that this character will not be used
often, but formatters should either support it or delete it.
.IP "\(bu" 4
If you think that you want to add a new command to Pod (like, say, a
\&\*(L"=biblio\*(R" command), consider whether you could get the same
effect with a \*(L"=for\*(R" or \*(L"=begin\*(R"/\*(L"=end\*(R" sequence:
\&\*(L"=for biblio ...\*(R" or \*(L"=begin biblio\*(R" ... \*(L"=end biblio\*(R".
Pod processors that don't understand
\&\*(L"=for biblio\*(R", etc., will simply ignore it, whereas they may complain
loudly if they see \*(L"=biblio\*(R".
.IP "\(bu" 4
Throughout this document, \*(L"Pod\*(R" has been the preferred spelling for
the name of the documentation format. One may also use \*(L"\s-1POD\s0\*(R" or
\&\*(L"pod\*(R". For the documentation that is (typically) in the Pod
format, you may use \*(L"pod\*(R", or \*(L"Pod\*(R", or \*(L"\s-1POD\s0\*(R". Understanding these
distinctions is useful, but obsessing over how to spell them usually
is not.
.SH "About L<...> Codes"
.IX Header "About L<...> Codes"
As you can tell from a glance at perlpod, the L<...>
code is the most complex of the Pod formatting codes. The points below
will hopefully clarify what it means and how processors should deal
with it.
.IP "\(bu" 4
In parsing an L<...> code, Pod parsers must distinguish at least
four attributes:
.RS 4
.IP "First:" 4
.IX Item "First:"
The link-text. If there is none, this must be undef. (E.g., in
\&\*(L"L<Perl Functions|perlfunc>\*(R", the link-text is \*(L"Perl Functions\*(R".
In \*(L"L<Time::HiRes>\*(R" and even \*(L"L<|Time::HiRes>\*(R", there is no
link text. Note that link text may contain formatting.)
.IP "Second:" 4
.IX Item "Second:"
The possibly inferred link-text \*(-- i.e., if there was no real link
text, then this is the text that we'll infer in its place.
(E.g., for
\&\*(L"L<Getopt::Std>\*(R", the inferred link text is \*(L"Getopt::Std\*(R".)
.IP "Third:" 4
.IX Item "Third:"
The name or \s-1URL\s0, or undef if none. (E.g., in
\&\*(L"L<Perl Functions|perlfunc>\*(R", the name \*(-- also sometimes called the page \*(--
is \*(L"perlfunc\*(R". In \*(L"L</CAVEATS>\*(R", the name is undef.)
.IP "Fourth:" 4
.IX Item "Fourth:"
The section (\s-1AKA\s0 \*(L"item\*(R" in older perlpods), or undef if none. E.g.,
in \*(L"L<Getopt::Std/DESCRIPTION>\*(R", \*(L"\s-1DESCRIPTION\s0\*(R" is the section. (Note
that this is not the same as a manpage section like the \*(L"5\*(R" in
\&\*(L"man 5 crontab\*(R". \*(L"Section Foo\*(R" in the Pod sense means the part of the text
that's introduced by the heading or item whose text is \*(L"Foo\*(R".)
.RE
.RS 4
.Sp
Pod parsers may also note additional attributes including:
.IP "Fifth:" 4
.IX Item "Fifth:"
A flag for whether item 3 (if present) is a \s-1URL\s0 (like
\&\*(L"http://lists.perl.org\*(R" is), in which case there should be no section
attribute; a Pod name (like \*(L"perldoc\*(R" and \*(L"Getopt::Std\*(R" are); or
possibly a man page name (like \*(L"\fIcrontab\fR\|(5)\*(R" is).
.IP "Sixth:" 4
.IX Item "Sixth:"
The raw original L<...> content, before text is split on
\&\*(L"|\*(R", \*(L"/\*(R", etc., and before E<...> codes are expanded.
.RE
.RS 4
.Sp
(The above were numbered only for concise reference below. It is not
a requirement that these be passed as an actual list or array.)
.Sp
For example:
.Sp
.Vb 7
\&  L<Foo::Bar>
\&    =>  undef,                          # link text
\&        "Foo::Bar",                     # possibly inferred link text
\&        "Foo::Bar",                     # name
\&        undef,                          # section
\&        \*(Aqpod\*(Aq,                          # what sort of link
\&        "Foo::Bar"                      # original content
\&
\&  L<Perlport\*(Aqs section on NL\*(Aqs|perlport/Newlines>
\&    =>  "Perlport\*(Aqs section on NL\*(Aqs",    # link text
\&        "Perlport\*(Aqs section on NL\*(Aqs",    # possibly inferred link text
\&        "perlport",                     # name