📄 perlpodspec.pod
in CE<lt>...> formatting codes, and never I<ever> to text in verbatim
paragraphs.

=item *

When rendering Pod to a format that has two kinds of hyphens (-), one
that's a non-breaking hyphen, and another that's a breakable hyphen
(as in "object-oriented", which can be split across lines as
"object-", newline, "oriented"), formatters are encouraged to
generally translate "-" to non-breaking hyphen, but may apply
heuristics to convert some of these to breaking hyphens.

=item *

Pod formatters should make reasonable efforts to keep words of Perl
code from being broken across lines. For example, "Foo::Bar" in some
formatting systems is seen as eligible for being broken across lines
as "Foo::" newline "Bar" or even "Foo::-" newline "Bar". This should
be avoided where possible, either by disabling all line-breaking in
mid-word, or by wrapping particular words with internal punctuation
in "don't break this across lines" codes (which in some formats may
not be a single code, but might be a matter of inserting non-breaking
zero-width spaces between every pair of characters in a word.)

=item *

Pod parsers should, by default, expand tabs in verbatim paragraphs as
they are processed, before passing them to the formatter or other
processor. Parsers may also allow an option for overriding this.

=item *

Pod parsers should, by default, remove newlines from the end of
ordinary and verbatim paragraphs before passing them to the
formatter. For example, while the paragraph you're reading now
could be considered, in Pod source, to end with (and contain)
the newline(s) that end it, it should be processed as ending with
(and containing) the period character that ends this sentence.

=item *

Pod parsers, when reporting errors, should make some effort to report
an approximate line number ("Nested EE<lt>>'s in Paragraph #52, near
line 633 of Thing/Foo.pm!"), instead of merely noting the paragraph
number ("Nested EE<lt>>'s in Paragraph #52 of Thing/Foo.pm!").
Where this is problematic, the paragraph number should at least be
accompanied by an excerpt from the paragraph ("Nested EE<lt>>'s in
Paragraph #52 of Thing/Foo.pm, which begins 'Read/write accessor for
the CE<lt>interest rate> attribute...'").

=item *

Pod parsers, when processing a series of verbatim paragraphs one
after another, should consider them to be one large verbatim
paragraph that happens to contain blank lines. I.e., these two
lines, which have a blank line between them:

	use Foo;

	print Foo->VERSION

should be unified into one paragraph ("\tuse Foo;\n\n\tprint
Foo->VERSION") before being passed to the formatter or other
processor. Parsers may also allow an option for overriding this.

While this might be too cumbersome to implement in event-based Pod
parsers, it is straightforward for parsers that return parse trees.

=item *

Pod formatters, where feasible, are advised to avoid splitting short
verbatim paragraphs (under twelve lines, say) across pages.

=item *

Pod parsers must treat a line with only spaces and/or tabs on it as a
"blank line" such as separates paragraphs. (Some older parsers
recognized only two adjacent newlines as a "blank line" but would not
recognize a newline, a space, and a newline, as a blank line. This
is noncompliant behavior.)

=item *

Authors of Pod formatters/processors should make every effort to
avoid writing their own Pod parser. There are already several in
CPAN, with a wide range of interface styles -- and one of them,
Pod::Parser, comes with modern versions of Perl.

=item *

Characters in Pod documents may be conveyed either as literals, or by
number in EE<lt>n> codes, or by an equivalent mnemonic, as in
EE<lt>eacute> which is exactly equivalent to EE<lt>233>.

Characters in the range 32-126 refer to those well known US-ASCII
characters (also defined there by Unicode, with the same meaning),
which all Pod formatters must render faithfully.
Characters in the ranges 0-31 and 127-159 should not be used (neither as
literals, nor as EE<lt>number> codes), except for the
literal byte-sequences for newline (13, 13 10, or 10), and tab (9).

Characters in the range 160-255 refer to Latin-1 characters (also
defined there by Unicode, with the same meaning). Characters above
255 should be understood to refer to Unicode characters.

=item *

Be warned
that some formatters cannot reliably render characters outside 32-126;
and many are able to handle 32-126 and 160-255, but nothing above
255.

=item *

Besides the well-known "EE<lt>lt>" and "EE<lt>gt>" codes for
less-than and greater-than, Pod parsers must understand "EE<lt>sol>"
for "/" (solidus, slash), and "EE<lt>verbar>" for "|" (vertical bar,
pipe). Pod parsers should also understand "EE<lt>lchevron>" and
"EE<lt>rchevron>" as legacy codes for characters 171 and 187, i.e.,
"left-pointing double angle quotation mark" = "left pointing
guillemet" and "right-pointing double angle quotation mark" = "right
pointing guillemet". (These look like little "<<" and ">>", and they
are now preferably expressed with the HTML/XHTML codes "EE<lt>laquo>"
and "EE<lt>raquo>".)

=item *

Pod parsers should understand all "EE<lt>html>" codes as defined
in the entity declarations in the most recent XHTML specification at
C<www.W3.org>. Pod parsers must understand at least the entities
that define characters in the range 160-255 (Latin-1). Pod parsers,
when faced with some unknown "EE<lt>I<identifier>>" code,
shouldn't simply replace it with nullstring (by default, at least),
but may pass it through as a string consisting of the literal characters
E, less-than, I<identifier>, greater-than. Or Pod parsers may offer the
alternative option of processing such unknown
"EE<lt>I<identifier>>" codes by firing an event especially
for such codes, or by adding a special node-type to the in-memory
document tree.
Such "EE<lt>I<identifier>>" may have special meaning
to some processors, or some processors may choose to add them to
a special error report.

=item *

Pod parsers must also support the XHTML codes "EE<lt>quot>" for
character 34 (doublequote, "), "EE<lt>amp>" for character 38
(ampersand, &), and "EE<lt>apos>" for character 39 (apostrophe, ').

=item *

Note that in all cases of "EE<lt>whatever>", I<whatever> (whether
an htmlname, or a number in any base) must consist only of
alphanumeric characters -- that is, I<whatever> must match
C<m/\A\w+\z/>. So "EE<lt> 0 1 2 3 >" is invalid, because
it contains spaces, which aren't alphanumeric characters. This
presumably does not I<need> special treatment by a Pod processor;
" 0 1 2 3 " doesn't look like a number in any base, so it would
presumably be looked up in the table of HTML-like names. Since
there isn't (and cannot be) an HTML-like entity called " 0 1 2 3 ",
this will be treated as an error. However, Pod processors may
treat "EE<lt> 0 1 2 3 >" or "EE<lt>e-acute>" as I<syntactically>
invalid, potentially earning a different error message than the
error message (or warning, or event) generated by a merely unknown
(but theoretically valid) htmlname, as in "EE<lt>qacute>"
[sic]. However, Pod parsers are not required to make this
distinction.

=item *

Note that EE<lt>number> I<must not> be interpreted as simply
"codepoint I<number> in the current/native character set". It always
means only "the character represented by codepoint I<number> in
Unicode." (This is identical to the semantics of &#I<number>; in XML.)

This will likely require many formatters to have tables mapping from
treatable Unicode codepoints (such as the "\xE9" for the e-acute
character) to the escape sequences or codes necessary for conveying
such characters in the target output format.
A converter to *roff
would, for example, know that "\xE9" (whether conveyed literally, or via
an EE<lt>...> sequence) is to be conveyed as "e\\*'".
Similarly, a program rendering Pod in a Mac OS application window would
presumably need to know that "\xE9" maps to codepoint 142 in MacRoman
encoding that (at time of writing) is native for Mac OS. Such
Unicode2whatever mappings are presumably already widely available for
common output formats. (Such mappings may be incomplete! Implementers
are not expected to bend over backwards in an attempt to render
Cherokee syllabics, Etruscan runes, Byzantine musical symbols, or any
of the other weird things that Unicode can encode.) And
if a Pod document uses a character not found in such a mapping, the
formatter should consider it an unrenderable character.

=item *

If, surprisingly, the implementor of a Pod formatter can't find a
satisfactory pre-existing table mapping from Unicode characters to
escapes in the target format (e.g., a decent table of Unicode
characters to *roff escapes), it will be necessary to build such a
table. If you are in this circumstance, you should begin with the
characters in the range 0x00A0 - 0x00FF, which is mostly the heavily
used accented characters. Then proceed (as patience permits and
fastidiousness compels) through the characters that the (X)HTML
standards groups judged important enough to merit mnemonics
for. These are declared in the (X)HTML specifications at the
www.W3.org site. At time of writing (September 2001), the most recent
entity declaration files are:

  http://www.w3.org/TR/xhtml1/DTD/xhtml-lat1.ent
  http://www.w3.org/TR/xhtml1/DTD/xhtml-special.ent
  http://www.w3.org/TR/xhtml1/DTD/xhtml-symbol.ent

Then you can progress through any remaining notable Unicode characters
in the range 0x2000-0x204D (consult the character tables at
www.unicode.org), and whatever else strikes your fancy.
For example,
in F<xhtml-symbol.ent>, there is the entry:

  <!ENTITY infin    "&#8734;"> <!-- infinity, U+221E ISOtech -->

While the mapping "infin" to the character "\x{221E}" will (hopefully)
have been already handled by the Pod parser, the presence of the
character in this file means that it's reasonably important enough to
include in a formatter's table that maps from notable Unicode characters
to the codes necessary for rendering them. So for a Unicode-to-*roff
mapping, for example, this would merit the entry:

  "\x{221E}" => '\(in',

It is eagerly hoped that in the future, increasing numbers of formats
(and formatters) will support Unicode characters directly (as (X)HTML
does with C<&#8734;>, C<&#x221E;>, or C<&infin;>), reducing the need
for idiosyncratic mappings of Unicode-to-I<my_escapes>.

=item *

It is up to individual Pod formatters to display good judgement when
confronted with an unrenderable character (which is distinct from an
unknown EE<lt>thing> sequence that the parser couldn't resolve to
anything, renderable or not). It is good practice to map Latin letters
with diacritics (like "EE<lt>eacute>"/"EE<lt>233>") to the corresponding
unaccented US-ASCII letters (like a simple character 101, "e"), but
clearly this is often not feasible, and an unrenderable character may
be represented as "?", or the like. In attempting a sane fallback
(as from EE<lt>233> to "e"), Pod formatters may use the
%Latin1Code_to_fallback table in L<Pod::Escapes|Pod::Escapes>, or
L<Text::Unidecode|Text::Unidecode>, if available.

For example, this Pod text:

  magic is enabled if you set C<$Currency> to 'E<euro>'.

may be rendered as:
"magic is enabled if you set C<$Currency> to 'I<?>'" or as
"magic is enabled if you set C<$Currency> to 'B<[euro]>'", or as
"magic is enabled if you set C<$Currency> to '[x20AC]'", etc.

A Pod formatter may also note, in a comment or warning, a list of what
unrenderable characters were encountered.

=item *

EE<lt>...> may freely appear in any formatting code (other than
in another EE<lt>...> or in a ZE<lt>>).
That is, "XE<lt>The
EE<lt>euro>1,000,000 Solution>" is valid, as is "LE<lt>The
EE<lt>euro>1,000,000 Solution|Million::Euros>".

=item *

Some Pod formatters output to formats that implement non-breaking
spaces as an individual character (which I'll call "NBSP"), and
others output to formats that implement non-breaking spaces just as
spaces wrapped in a "don't break this across lines" code. Note that
at the level of Pod, both sorts of codes can occur: Pod can contain an
NBSP character (whether as a literal, or as a "EE<lt>160>" or
"EE<lt>nbsp>" code); and Pod can contain "SE<lt>foo
IE<lt>barE<gt> baz>" codes, where "mere spaces" (character 32) in
such codes are taken to represent non-breaking spaces. Pod
parsers should consider supporting the optional parsing of "SE<lt>foo
IE<lt>barE<gt> baz>" as if it were
"fooI<NBSP>IE<lt>barE<gt>I<NBSP>baz", and, going the other way, the
optional parsing of groups of words joined by NBSP's as if each group
were in a SE<lt>...> code, so that formatters may use the
representation that maps best to what the output format demands.

=item *

Some processors may find that the C<SE<lt>...E<gt>> code is easiest to
implement by replacing each space in the parse tree under the content
of the S, with an NBSP. But note: the replacement should apply I<not> to
spaces in I<all> text, but I<only> to spaces in I<printable> text. (This
distinction may or may not be evident in the particular tree/event
model implemented by the Pod parser.) For example, consider this
unusual case:

    S<L</Autoloaded Functions>>

This means that the space in the middle of the visible link text must
not be broken across lines.
In other words, it's the same as this:

    L<"AutoloadedE<160>Functions"/Autoloaded Functions>

However, a misapplied space-to-NBSP replacement could (wrongly)
produce something equivalent to this:

    L<"AutoloadedE<160>Functions"/AutoloadedE<160>Functions>

...which is almost definitely not going to work as a hyperlink (assuming
this formatter outputs a format supporting hypertext).

Formatters may choose to just not support the S format code,
especially in cases where the output format simply has no NBSP
character/code and no code for "don't break this stuff across lines".

=item *

Besides the NBSP character discussed above, implementors are reminded
of the existence of the other "special" character in Latin-1, the
"soft hyphen" character, also known as "discretionary hyphen"
(i.e., C<EE<lt>173E<gt>> = C<EE<lt>0xADE<gt>> =
C<EE<lt>shyE<gt>>). This character expresses an optional hyphenation
point. That is, it normally renders as nothing, but may render as a
"-" if a formatter breaks the word at that point. Pod formatters
should, as appropriate, do one of the following: 1) render this with
a code with the same meaning (e.g., "\-" in RTF), 2) pass it through
in the expectation that the formatter understands this character as
such, or 3) delete it.

For example:

  sigE<shy>action
  manuE<shy>script
  JarkE<shy>ko HieE<shy>taE<shy>nieE<shy>mi

These signal to a formatter that if it is to hyphenate "sigaction"
or "manuscript", then it should be done as
"sig-I<[linebreak]>action" or "manu-I<[linebreak]>script"
(and if it doesn't hyphenate it, then the C<EE<lt>shyE<gt>> doesn't
show up at all).
And if it is
to hyphenate "Jarkko" and/or "Hietaniemi", it can do
so only at the points where there is a C<EE<lt>shyE<gt>> code.

In practice, it is anticipated that this character will not be used
often, but formatters should either support it, or delete it.

=item *

If you think that you want to add a new command to Pod (like, say, a
"=biblio" command), consider whether you could get the same
effect with a for or begin/end sequence: "=for biblio ..." or "=begin
biblio" ... "=end biblio". Pod processors that don't understand
"=for biblio", etc, will simply ignore it, whereas they may complain
loudly if they see "=biblio".

=item *

Throughout this document, "Pod" has been the preferred spelling for
the name of the documentation format. One may also use "POD" or
"pod". For the documentation that is (typically) in the Pod
format, you may use "pod", or "Pod", or "POD". Understanding these
distinctions is useful; but obsessing over how to spell them, usually
is not.

=back
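
The rules given above for EE<lt>...> codes (content must match
C<m/\A\w+\z/>, numbers always mean Unicode codepoints, and unknown
names should not silently become the null string) can be sketched in
Perl roughly as follows. This is only an illustration, not a reference
implementation: the C<resolve_escape> helper, its three-way return
convention, and the tiny C<%NAME_TO_CODEPOINT> table are all
hypothetical names invented for this sketch. It also assumes the usual
convention that a leading "0x" marks hex and a leading "0" marks octal.
A real parser would more likely take its name table from a module such
as Pod::Escapes.

```perl
use strict;
use warnings;

# Hypothetical, deliberately tiny name table for illustration only.
# A real parser would carry the full XHTML entity set.
my %NAME_TO_CODEPOINT = (
    lt       => 60,     gt       => 62,
    sol      => 47,     verbar   => 124,
    quot     => 34,     amp      => 38,     apos => 39,
    nbsp     => 160,    shy      => 173,    eacute => 233,
    lchevron => 171,    rchevron => 187,
    laquo    => 171,    raquo    => 187,
    euro     => 0x20AC, infin    => 0x221E,
);

# resolve_escape($content) is a hypothetical helper taking the text
# between "E<" and ">". It returns a two-element list:
#   (char    => $character)  for a resolvable code
#   (unknown => $literal)    for a well-formed but unrecognized name
#   (invalid => $literal)    for content that fails m/\A\w+\z/
sub resolve_escape {
    my ($content) = @_;
    my $literal = "E<$content>";    # pass-through form, never a null string

    # Syntactic check: alphanumerics (and underscore) only.
    return (invalid => $literal) unless $content =~ m/\A\w+\z/;

    my $codepoint;
    if ($content =~ m/\A0[xX]([0-9A-Fa-f]+)\z/) {
        $codepoint = hex $1;                      # assumed: "0x" means hex
    }
    elsif ($content =~ m/\A0([0-7]+)\z/) {
        $codepoint = oct $1;                      # assumed: leading "0" means octal
    }
    elsif ($content =~ m/\A[0-9]+\z/) {
        $codepoint = $content + 0;                # plain decimal
    }
    elsif (exists $NAME_TO_CODEPOINT{$content}) {
        $codepoint = $NAME_TO_CODEPOINT{$content};
    }
    else {
        return (unknown => $literal);             # e.g. "qacute"
    }

    # The number is always a Unicode codepoint, never a native-charset one.
    return (char => chr $codepoint);
}
```

Under these assumptions, C<resolve_escape('eacute')> and
C<resolve_escape('233')> yield the same character, while
C<resolve_escape('e-acute')> is reported as invalid and
C<resolve_escape('qacute')> as merely unknown, so a caller can issue
different diagnostics for the two cases if it chooses to make that
(optional) distinction.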