.\" Automatically generated by Pod::Man 2.16 (Pod::Simple 3.05)
.\"
.\" Standard preamble:
.\" ========================================================================
.de Sh \" Subsection heading
.br
.if t .Sp
.ne 5
.PP
\fB\\$1\fR
.PP
..
.de Sp \" Vertical space (when we can't use .PP)
.if t .sp .5v
.if n .sp
..
.de Vb \" Begin verbatim text
.ft CW
.nf
.ne \\$1
..
.de Ve \" End verbatim text
.ft R
.fi
..
.\" Set up some character translations and predefined strings.  \*(-- will
.\" give an unbreakable dash, \*(PI will give pi, \*(L" will give a left
.\" double quote, and \*(R" will give a right double quote.  \*(C+ will
.\" give a nicer C++.  Capital omega is used to do unbreakable dashes and
.\" therefore won't be available.  \*(C` and \*(C' expand to `' in nroff,
.\" nothing in troff, for use with C<>.
.tr \(*W-
.ds C+ C\v'-.1v'\h'-1p'\s-2+\h'-1p'+\s0\v'.1v'\h'-1p'
.ie n \{\
.    ds -- \(*W-
.    ds PI pi
.    if (\n(.H=4u)&(1m=24u) .ds -- \(*W\h'-12u'\(*W\h'-12u'-\" diablo 10 pitch
.    if (\n(.H=4u)&(1m=20u) .ds -- \(*W\h'-12u'\(*W\h'-8u'-\" diablo 12 pitch
.    ds L" ""
.    ds R" ""
.    ds C` ""
.    ds C' ""
'br\}
.el\{\
.    ds -- \|\(em\|
.    ds PI \(*p
.    ds L" ``
.    ds R" ''
'br\}
.\"
.\" Escape single quotes in literal strings from groff's Unicode transform.
.ie \n(.g .ds Aq \(aq
.el .ds Aq '
.\"
.\" If the F register is turned on, we'll generate index entries on stderr for
.\" titles (.TH), headers (.SH), subsections (.Sh), items (.Ip), and index
.\" entries marked with X<> in POD.  Of course, you'll have to process the
.\" output yourself in some meaningful fashion.
.ie \nF \{\
.    de IX
.    tm Index:\\$1\t\\n%\t"\\$2"
..
.    nr % 0
.    rr F
.\}
.el \{\
.    de IX
..
.\}
.\"
.\" Accent mark definitions (@(#)ms.acc 1.5 88/02/08 SMI; from UCB 4.2).
.\" Fear.  Run.  Save yourself.  No user-serviceable parts.
.    \" fudge factors for nroff and troff
.if n \{\
.    ds #H 0
.    ds #V .8m
.    ds #F .3m
.    ds #[ \f1
.    ds #] \fP
.\}
.if t \{\
.    ds #H ((1u-(\\\\n(.fu%2u))*.13m)
.    ds #V .6m
.    ds #F 0
.    ds #[ \&
.    ds #] \&
.\}
.    \" simple accents for nroff and troff
.if n \{\
.    ds ' \&
.    ds ` \&
.    ds ^ \&
.    ds , \&
.    ds ~ ~
.    ds /
.\}
.if t \{\
.    ds ' \\k:\h'-(\\n(.wu*8/10-\*(#H)'\'\h"|\\n:u"
.    ds ` \\k:\h'-(\\n(.wu*8/10-\*(#H)'\`\h'|\\n:u'
.    ds ^ \\k:\h'-(\\n(.wu*10/11-\*(#H)'^\h'|\\n:u'
.    ds , \\k:\h'-(\\n(.wu*8/10)',\h'|\\n:u'
.    ds ~ \\k:\h'-(\\n(.wu-\*(#H-.1m)'~\h'|\\n:u'
.    ds / \\k:\h'-(\\n(.wu*8/10-\*(#H)'\z\(sl\h'|\\n:u'
.\}
.    \" troff and (daisy-wheel) nroff accents
.ds : \\k:\h'-(\\n(.wu*8/10-\*(#H+.1m+\*(#F)'\v'-\*(#V'\z.\h'.2m+\*(#F'.\h'|\\n:u'\v'\*(#V'
.ds 8 \h'\*(#H'\(*b\h'-\*(#H'
.ds o \\k:\h'-(\\n(.wu+\w'\(de'u-\*(#H)/2u'\v'-.3n'\*(#[\z\(de\v'.3n'\h'|\\n:u'\*(#]
.ds d- \h'\*(#H'\(pd\h'-\w'~'u'\v'-.25m'\f2\(hy\fP\v'.25m'\h'-\*(#H'
.ds D- D\\k:\h'-\w'D'u'\v'-.11m'\z\(hy\v'.11m'\h'|\\n:u'
.ds th \*(#[\v'.3m'\s+1I\s-1\v'-.3m'\h'-(\w'I'u*2/3)'\s-1o\s+1\*(#]
.ds Th \*(#[\s+2I\s-2\h'-\w'I'u*3/5'\v'-.3m'o\v'.3m'\*(#]
.ds ae a\h'-(\w'a'u*4/10)'e
.ds Ae A\h'-(\w'A'u*4/10)'E
.    \" corrections for vroff
.if v .ds ~ \\k:\h'-(\\n(.wu*9/10-\*(#H)'\s-2\u~\d\s+2\h'|\\n:u'
.if v .ds ^ \\k:\h'-(\\n(.wu*10/11-\*(#H)'\v'-.4m'^\v'.4m'\h'|\\n:u'
.    \" for low resolution devices (crt and lpr)
.if \n(.H>23 .if \n(.V>19 \
\{\
.    ds : e
.    ds 8 ss
.    ds o a
.    ds d- d\h'-1'\(ga
.    ds D- D\h'-1'\(hy
.    ds th \o'bp'
.    ds Th \o'LP'
.    ds ae ae
.    ds Ae AE
.\}
.rm #[ #] #H #V #F C
.\" ========================================================================
.\"
.IX Title "Encode::Supported 3"
.TH Encode::Supported 3 "2007-12-18" "perl v5.10.0" "Perl Programmers Reference Guide"
.\" For nroff, turn off justification.  Always turn off hyphenation; it makes
.\" way too many mistakes in technical documents.
.if n .ad l
.nh
.SH "NAME"
Encode::Supported \-\- Encodings supported by Encode
.SH "DESCRIPTION"
.IX Header "DESCRIPTION"
.Sh "Encoding Names"
.IX Subsection "Encoding Names"
Encoding names are case insensitive.  White space in names
is ignored.  In addition, an encoding may have aliases.
Each encoding has one \*(L"canonical\*(R" name.
The \*(L"canonical\*(R"
name is chosen from the names of the encoding by picking
the first in the following sequence (with a few exceptions).
.IP "\(bu" 2
The name used by the Perl community.  That includes 'utf8' and 'ascii'.
Unlike aliases, canonical names directly reach the method, so
frequently used names like 'utf8' don't need an alias lookup.
.IP "\(bu" 2
The \s-1MIME\s0 name as defined in \s-1IETF\s0 RFCs.  This includes all \*(L"iso\-\*(R"s.
.IP "\(bu" 2
The name in the \s-1IANA\s0 registry.
.IP "\(bu" 2
The name used by the organization that defined it.
.PP
Where a \fIde jure\fR canonical name differs from the one used by the Encode
module, it is always defined as an alias whenever the encoding is
implemented.  So you can safely tell whether a given encoding is
implemented just by passing its canonical name.
.PP
Because of all the alias issues, and because in the general case
encodings have state, \*(L"Encode\*(R" uses an encoding object internally
once an operation is in progress.
.SH "Supported Encodings"
.IX Header "Supported Encodings"
As of Perl 5.8.0, at least the following encodings are recognized.
Note that unless otherwise specified, they are all case insensitive
(via alias) and all occurrences of spaces are replaced with '\-'.
In other words, \*(L"\s-1ISO\s0 8859 1\*(R" and \*(L"iso\-8859\-1\*(R" are identical.
.PP
Encodings are categorized and implemented in several different modules,
but in most cases you don't have to \f(CW\*(C`use Encode::XX\*(C'\fR to make them
available.
Encode.pm will automatically load those modules on demand.
.Sh "Built-in Encodings"
.IX Subsection "Built-in Encodings"
The following encodings are always available.
.PP
.Vb 8
\&  Canonical     Aliases                      Comments & References
\&  \-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-
\&  ascii         US\-ascii ISO\-646\-US                         [ECMA]
\&  ascii\-ctrl                                      Special Encoding
\&  iso\-8859\-1    latin1                                       [ISO]
\&  null                                            Special Encoding
\&  utf8          UTF\-8                                    [RFC2279]
\&  \-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-
.Ve
.PP
\&\fInull\fR and \fIascii-ctrl\fR are special.  \*(L"null\*(R" fails for all characters,
so when you set the fallback mode to \s-1PERLQQ\s0, \s-1HTMLCREF\s0 or \s-1XMLCREF\s0, \s-1ALL\s0
\&\s-1CHARACTERS\s0 fall back to character references.
\&\*(L"ascii-ctrl\*(R" behaves the same way except for control characters.
For fallback modes, see Encode.
.Sh "Encode::Unicode \*(-- other Unicode encodings"
.IX Subsection "Encode::Unicode other Unicode encodings"
Unicode coding schemes other than native utf8 are supported by
Encode::Unicode, which will be autoloaded on demand.
.PP
.Vb 11
\&  \-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-
\&  UCS\-2BE       UCS\-2, iso\-10646\-1                      [IANA, UC]
\&  UCS\-2LE                                                     [UC]
\&  UTF\-16                                                      [UC]
\&  UTF\-16BE                                                    [UC]
\&  UTF\-16LE                                                    [UC]
\&  UTF\-32                                                      [UC]
\&  UTF\-32BE      UCS\-4                                         [UC]
\&  UTF\-32LE                                                    [UC]
\&  UTF\-7                                                  [RFC2152]
\&  \-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-
.Ve
.PP
To find out how (UCS\-2|UTF\-(16|32))(LE|BE)? differ from one another,
see Encode::Unicode.
.PP
\&\s-1UTF\-7\s0 is a special encoding which \*(L"re-encodes\*(R" \s-1UTF\-16BE\s0 into a 7\-bit
encoding.
It is implemented separately by Encode::Unicode::UTF7.
.Sh "Encode::Byte \*(-- Extended \s-1ASCII\s0"
.IX Subsection "Encode::Byte Extended ASCII"
Encode::Byte implements most single-byte encodings except for
Symbols and \s-1EBCDIC\s0.  The following encodings are based on single-byte
encodings implemented as extended \s-1ASCII\s0.  Most of them map
\&\ex80\-\exff (upper half) to non-ASCII characters.
.IP "\s-1ISO\-8859\s0 and corresponding vendor mappings" 2
.IX Item "ISO-8859 and corresponding vendor mappings"
Since there are so many, they are presented in table format with
languages and the corresponding encoding names by vendor.  Note that
the table is sorted in \s-1ISO\-8859\s0 order and that the corresponding vendor
mappings differ slightly from the \s-1ISO\s0 ones.  See
<http://czyborra.com/charsets/iso8859.html> for details.
.Sp
.Vb 10
\&  Lang/Regions  ISO/Other Std.  DOS     Windows Macintosh  Others
\&  \-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-
\&  N. America    (ASCII)         cp437        AdobeStandardEncoding
\&                                cp863 (DOSCanadaF)
\&  W. Europe     iso\-8859\-1      cp850   cp1252  MacRoman  nextstep
\&                                                         hp\-roman8
\&                                cp860 (DOSPortuguese)
\&  Cntrl. Europe iso\-8859\-2      cp852   cp1250  MacCentralEurRoman
\&                                                MacCroatian
\&                                                MacRomanian
\&                                                MacRumanian
\&  Latin3[1]     iso\-8859\-3
\&  Latin4[2]     iso\-8859\-4
\&  Cyrillics     iso\-8859\-5      cp855   cp1251  MacCyrillic
\&    (See also next section)     cp866           MacUkrainian
\&  Arabic        iso\-8859\-6      cp864   cp1256  MacArabic
\&                                cp1006          MacFarsi
\&  Greek         iso\-8859\-7      cp737   cp1253  MacGreek
\&                                cp869 (DOSGreek2)
\&  Hebrew        iso\-8859\-8      cp862   cp1255  MacHebrew
\&  Turkish       iso\-8859\-9      cp857   cp1254  MacTurkish
\&  Nordics       iso\-8859\-10     cp865
\&                                cp861           MacIcelandic
\&                                                MacSami
\&  Thai          iso\-8859\-11[3]  cp874           MacThai
\&  (iso\-8859\-12 is nonexistent. Reserved for Indics?)
\&  Baltics       iso\-8859\-13     cp775   cp1257
\&  Celtics       iso\-8859\-14
\&  Latin9 [4]    iso\-8859\-15
\&  Latin10       iso\-8859\-16
\&  Vietnamese    viscii                  cp1258  MacVietnamese
\&  \-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-
\&
\&  [1] Esperanto, Maltese, and Turkish. Turkish is now on 8859\-9.
\&  [2] Baltics.  Now on 8859\-10, except for Latvian.
\&  [3] TIS 620 + Non\-Breaking Space (0xA0 / U+00A0)
\&  [4] Nicknamed Latin0; the Euro sign as well as French and Finnish
\&      letters that are missing from 8859\-1 were added.
.Ve
.Sp
All cp* are also available as ibm\-*, ms\-*, and windows\-*.  See also
<http://czyborra.com/charsets/codepages.html>.
.Sp
Macintosh encodings don't seem to be registered with bodies such as
\&\s-1IANA\s0.  \*(L"Canonical\*(R" names in Encode are based upon Apple's Tech Note
1150.  See <http://developer.apple.com/technotes/tn/tn1150.html> for details.
.IP "\s-1KOI8\s0 \- De Facto Standard for the Cyrillic world" 2
.IX Item "KOI8 - De Facto Standard for the Cyrillic world"
Though \s-1ISO\-8859\s0 does have \s-1ISO\-8859\-5\s0, the \s-1KOI8\s0 series is far more
popular on the Net.  Encode comes with the following \s-1KOI\s0 charsets.
For gory details, see <http://czyborra.com/charsets/cyrillic.html>.
.Sp
.Vb 5
\&  \-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-
\&  koi8\-f
\&  koi8\-r cp878                                           [RFC1489]
\&  koi8\-u                                                 [RFC2319]
\&  \-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-
.Ve
.Sh "gsm0338 \- Hentai Latin 1"
.IX Subsection "gsm0338 - Hentai Latin 1"
\&\s-1GSM0338\s0 is for \s-1GSM\s0 handsets.  Though it shares alphanumerics with
\&\s-1ASCII\s0, control character ranges and other parts are mapped very
differently, mainly to store Greek characters.  There are also escape
sequences (starting with 0x1B) to cover characters such as
the Euro sign.
.PP
This was once handled by Encode::Byte, but because of all those
unusual specifications, Encode 2.20 has relocated the support to
Encode::GSM0338.  See Encode::GSM0338 for details.
.IP "gsm0338 support before 2.19" 2
.IX Item "gsm0338 support before 2.19"
Some special cases like a trailing 0x00 byte or a lone 0x1B byte are not
well-defined and \fIdecode()\fR will return an empty string for them.
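.PP
The rule from \*(L"Encoding Names\*(R" that an implemented encoding always
answers to its canonical name can be checked with a short script.
The following is only a sketch: \f(CWcanonical_name\fR is a hypothetical
helper written for this example and is not part of Encode, and the
names in the list are taken from the tables above purely for illustration.
It assumes that \f(CWEncode::find_encoding\fR returns an encoding object for
a known name or alias and a false value otherwise.
.PP
.Vb 10
```perl
use strict;
use warnings;
use feature 'say';
use Encode ();

# canonical_name() is a hypothetical helper for this sketch.
# It returns the canonical name for a name or alias, or undef
# when the installed Encode does not implement the encoding.
sub canonical_name {
    my ($name) = @_;
    my $enc = Encode::find_encoding($name);
    return $enc ? $enc->name : undef;
}

# Look up a few names and aliases; results depend on the Encode version.
for my $name (qw(latin1 ISO-8859-1 koi8-r gsm0338 iso-8859-12)) {
    my $canon = canonical_name($name);
    say $name, ' => ', defined $canon ? $canon : 'not available';
}
```
.Ve
.PP
Which names resolve depends on the installed Encode version, so the
output is best treated as a report on the local installation rather
than as a fixed list.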