📄 encode::supported.3

One possible workaround is:
.Sp
.Vb 3
\&   $gsm =~ s/\ex00\ez/\ex00\ex00/;
\&   $uni = decode("gsm0338", $gsm);
\&   $uni .= "\exA0" if $gsm =~ /\ex1B\ez/;
.Ve
.Sp
Note that the Encode implementation of \s-1GSM0338\s0 does not implement the
reuse of Latin capital letters as Greek capital letters (for example,
0x5A is U+005A (\s-1LATIN\s0 \s-1CAPITAL\s0 \s-1LETTER\s0 Z), not U+0396 (\s-1GREEK\s0 \s-1CAPITAL\s0
\&\s-1LETTER\s0 \s-1ZETA\s0)).
.Sp
\&\s-1GSM0338\s0 is also covered in Encode::Byte even though it is not
an \*(L"extended \s-1ASCII\s0\*(R" encoding.
.Sh "\s-1CJK:\s0 Chinese, Japanese, Korean (Multibyte)"
.IX Subsection "CJK: Chinese, Japanese, Korean (Multibyte)"
Note that Vietnamese is listed above.  Also read \*(L"Encoding vs Charset\*(R"
below.  Also note that these are implemented in distinct modules by
country, due to size concerns (simplified Chinese is mapped
to '\s-1CN\s0', continental China, while traditional Chinese is mapped to
\&'\s-1TW\s0', Taiwan).  Please refer to their respective documentation pages.
.IP "Encode::CN \*(-- Continental China" 2
.IX Item "Encode::CN  Continental China"
.Vb 9
\&  Standard      DOS/Win Macintosh                Comment/Reference
\&  \-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-
\&  euc\-cn [1]            MacChineseSimp
\&  (gbk)         cp936 [2]
\&  gb12345\-raw                      { GB12345 without CES }
\&  gb2312\-raw                       { GB2312  without CES }
\&  hz
\&  iso\-ir\-165
\&  \-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-
\&
\&  [1] GB2312 is aliased to this.  See L<Microsoft\-related naming mess>
\&  [2] gbk is aliased to this.  See L<Microsoft\-related naming mess>
.Ve
.IP "Encode::JP \*(-- Japan" 2
.IX Item "Encode::JP  Japan"
.Vb 11
\&  Standard      DOS/Win Macintosh                Comment/Reference
\&  \-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-
\&  euc\-jp
\&  shiftjis      cp932   macJapanese
\&  7bit\-jis
\&  iso\-2022\-jp                                            [RFC1468]
\&  iso\-2022\-jp\-1                                          [RFC2237]
\&  jis0201\-raw  { JIS X 0201 (roman + halfwidth kana) without CES }
\&  jis0208\-raw  { JIS X 0208 (Kanji + fullwidth kana) without CES }
\&  jis0212\-raw  { JIS X 0212 (Extended Kanji)         without CES }
\&  \-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-
.Ve
.IP "Encode::KR \*(-- Korea" 2
.IX Item "Encode::KR  Korea"
.Vb 8
\&  Standard      DOS/Win Macintosh                Comment/Reference
\&  \-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-
\&  euc\-kr                MacKorean                        [RFC1557]
\&                cp949 [1]
\&  iso\-2022\-kr                                            [RFC1557]
\&  johab                                  [KS X 1001:1998, Annex 3]
\&  ksc5601\-raw                              { KSC5601 without CES }
\&  \-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-
\&
\&  [1] ks_c_5601\-1987, (x\-)?windows\-949, and uhc are aliased to this.
\&  See below.
.Ve
.IP "Encode::TW \*(-- Taiwan" 2
.IX Item "Encode::TW  Taiwan"
.Vb 5
\&  Standard      DOS/Win Macintosh                Comment/Reference
\&  \-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-
\&  big5\-eten     cp950   MacChineseTrad {big5 aliased to big5\-eten}
\&  big5\-hkscs
\&  \-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-
.Ve
.IP "Encode::HanExtra \*(-- More Chinese via \s-1CPAN\s0" 2
.IX Item "Encode::HanExtra  More Chinese via CPAN"
Due to size concerns, the additional Chinese encodings below are
distributed separately on \s-1CPAN\s0, under the name Encode::HanExtra.
.Sp
.Vb 8
\&  Standard      DOS/Win Macintosh                Comment/Reference
\&  \-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-
\&  big5ext                                   CMEX\*(Aqs Big5e Extension
\&  big5plus                                  CMEX\*(Aqs Big5+ Extension
\&  cccii         Chinese Character Code for Information Interchange
\&  euc\-tw                             EUC (Extended Unix Character)
\&  gb18030                          GBK with Traditional Characters
\&  \-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-
.Ve
.IP "Encode::JIS2K \*(-- \s-1JIS\s0 X 0213 encodings via \s-1CPAN\s0" 2
.IX Item "Encode::JIS2K  JIS X 0213 encodings via CPAN"
Due to size concerns, the additional Japanese encodings below are
distributed separately on \s-1CPAN\s0, under the name Encode::JIS2K.
.Sp
.Vb 8
\&  Standard      DOS/Win Macintosh                Comment/Reference
\&  \-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-
\&  euc\-jisx0213
\&  shiftjisx0213
\&  iso\-2022\-jp\-3
\&  jis0213\-1\-raw
\&  jis0213\-2\-raw
\&  \-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-
.Ve
.Sh "Miscellaneous encodings"
.IX Subsection "Miscellaneous encodings"
.IP "Encode::EBCDIC" 2
.IX Item "Encode::EBCDIC"
See perlebcdic for details.
.Sp
.Vb 8
\&  \-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-
\&  cp37
\&  cp500
\&  cp875
\&  cp1026
\&  cp1047
\&  posix\-bc
\&  \-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-
.Ve
.IP "Encode::Symbols" 2
.IX Item "Encode::Symbols"
For symbols and dingbats.
.Sp
.Vb 7
\&  \-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-
\&  symbol
\&  dingbats
\&  MacDingbats
\&  AdobeZdingbat
\&  AdobeSymbol
\&  \-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-
.Ve
.IP "Encode::MIME::Header" 2
.IX Item "Encode::MIME::Header"
Strictly speaking, the \s-1MIME\s0 header encoding documented in \s-1RFC\s0 2047 is more
an encapsulation than an encoding.  However, support for it is essential
in the modern world, so it is supported.
.Sp
.Vb 5
\&  \-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-
\&  MIME\-Header                                            [RFC2047]
\&  MIME\-B                                                 [RFC2047]
\&  MIME\-Q                                                 [RFC2047]
\&  \-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-
.Ve
.IP "Encode::Guess" 2
.IX Item "Encode::Guess"
This is not the name of an encoding but a utility that lets you pick
the most appropriate encoding for your data from a given list of
\&\fIsuspects\fR.  See Encode::Guess for details.
.SH "Unsupported encodings"
.IX Header "Unsupported encodings"
The following encodings are not supported as yet; some because they
are rarely used, some because of technical difficulties.
They may be supported by external modules via \s-1CPAN\s0 in the future, however.
.IP "\s-1ISO\-2022\-JP\-2\s0 [\s-1RFC1554\s0]" 2
.IX Item "ISO-2022-JP-2 [RFC1554]"
Not very popular yet.  Needs the Unicode Database or an equivalent to
implement \fIencode()\fR (because it includes \s-1JIS\s0 X 0208/0212, \s-1KSC5601\s0, and
\&\s-1GB2312\s0 simultaneously, whose code points in Unicode overlap.  So you
need to look up the database to determine to what character set a given
Unicode character should belong).
.IP "\s-1ISO\-2022\-CN\s0 [\s-1RFC1922\s0]" 2
.IX Item "ISO-2022-CN [RFC1922]"
Not very popular.  Needs \s-1CNS\s0 11643\-1 and \-2, which are not available in
this module.  \s-1CNS\s0 11643 is supported (via euc-tw) in Encode::HanExtra.
Autrijus Tang may add support for this encoding in his module in the future.
.IP "Various HP-UX encodings" 2
.IX Item "Various HP-UX encodings"
The following are unsupported due to the lack of mapping data.
.Sp
.Vb 2
\&  \*(Aq8\*(Aq  \- arabic8, greek8, hebrew8, kana8, thai8, and turkish8
\&  \*(Aq15\*(Aq \- japanese15, korean15, and roi15
.Ve
.IP "Cyrillic encoding \s-1ISO\-IR\-111\s0" 2
.IX Item "Cyrillic encoding ISO-IR-111"
Anton Tagunov doubts its usefulness.
.IP "\s-1ISO\-8859\-8\-1\s0 [Hebrew]" 2
.IX Item "ISO-8859-8-1 [Hebrew]"
None of the Encode team knows Hebrew well enough (\s-1ISO\-8859\-8\s0, cp1255 and
MacHebrew are supported only because there were mappings
available at <http://www.unicode.org/>).  Contributions welcome.
.IP "\s-1ISIRI\s0 3342, Iran System, \s-1ISIRI\s0 2900 [Farsi]" 2
.IX Item "ISIRI 3342, Iran System, ISIRI 2900 [Farsi]"
Ditto.
.IP "Thai encoding \s-1TCVN\s0" 2
.IX Item "Thai encoding TCVN"
Ditto.
.IP "Vietnamese encodings \s-1VPS\s0" 2
.IX Item "Vietnamese encodings VPS"
Though Jungshik Shin has reported that Mozilla supports this encoding,
it was too late before 5.8.0 for us to add it.  In the future, it
may be available via a separate module.
See
<http://lxr.mozilla.org/seamonkey/source/intl/uconv/ucvlatin/vps.uf>
and
<http://lxr.mozilla.org/seamonkey/source/intl/uconv/ucvlatin/vps.ut>
if you are interested in helping us.
.IP "Various Mac encodings" 2
.IX Item "Various Mac encodings"
The following are unsupported due to the lack of mapping data.
.Sp
.Vb 5
\&  MacArmenian,  MacBengali,   MacBurmese,   MacEthiopic
\&  MacExtArabic, MacGeorgian,  MacKannada,   MacKhmer
\&  MacLaotian,   MacMalayalam, MacMongolian, MacOriya
\&  MacSinhalese, MacTamil,     MacTelugu,    MacTibetan
\&  MacVietnamese
.Ve
.Sp
The rest, which are already available, are based upon the vendor mappings
at <http://www.unicode.org/Public/MAPPINGS/VENDORS/APPLE/> .
.IP "(Mac) Indic encodings" 2
.IX Item "(Mac) Indic encodings"
The maps for the following are available at <http://www.unicode.org/>
but remain unsupported because those encodings need an algorithmic
approach, currently unsupported by \fIenc2xs\fR:
.Sp
.Vb 3
\&  MacDevanagari
\&  MacGurmukhi
\&  MacGujarati
.Ve
.Sp
For details, please see \f(CW\*(C`Unicode mapping issues and notes:\*(C'\fR at
<http://www.unicode.org/Public/MAPPINGS/VENDORS/APPLE/DEVANAGA.TXT> .
.Sp
I believe this issue is prevalent not only for Mac Indics but also in
other Indic encodings, but the above were the only Indic encoding
maps that I could find at <http://www.unicode.org/> .
.SH "Encoding vs. Charset \*(-- terminology"
.IX Header "Encoding vs. Charset  terminology"
We are used to using the terms (character) \fIencoding\fR and \fIcharacter
set\fR interchangeably.  But just as confusing the terms byte and
character is dangerous and the terms should be differentiated when
needed, we need to differentiate \fIencoding\fR and \fIcharacter set\fR.
.PP
To understand that, here is a description of how we make computers
grok our characters.
.IP "\(bu" 2
First we start with which characters to include.
We call this
collection of characters a \fIcharacter repertoire\fR.
.IP "\(bu" 2
Then we have to give each character a unique \s-1ID\s0 so your computer can
tell the difference between 'a' and 'A'.  This itemized character
repertoire is now a \fIcharacter set\fR.
.IP "\(bu" 2
If your computer can grok the character set without further
processing, you can go ahead and use it.  This is called a \fIcoded
character set\fR (\s-1CCS\s0) or \fIraw character encoding\fR.  \s-1ASCII\s0 is used this
way in most cases.
.IP "\(bu" 2
But in many cases, especially multi-byte \s-1CJK\s0 encodings, you have to
tweak a little more.  Your network connection may not accept any data
with the Most Significant Bit set, and your computer may not be able to
tell if a given byte is a whole character or just half of it.  So you
have to \fIencode\fR the character set to use it.
.Sp
A \fIcharacter encoding scheme\fR (\s-1CES\s0) determines how to encode a given
character set, or a set of multiple character sets.  7bit \s-1ISO\-2022\s0 is
an example of a \s-1CES\s0.  You switch between character sets via \fIescape
sequences\fR.
.PP
Technically, or mathematically, speaking, a character set encoded in
such a \s-1CES\s0 that maps character by character may form a \s-1CCS\s0.  \s-1EUC\s0 is such
an example.  The \s-1CES\s0 of \s-1EUC\s0 is as follows:
.IP "\(bu" 2
Map \s-1ASCII\s0 unchanged.
.IP "\(bu" 2
Map a character set that consists of 94 or 96 to the power of N
members by adding 0x80 to each byte.
.IP "\(bu" 2
You can also use 0x8e and 0x8f to indicate that the following sequence of
characters belongs to yet another character set.  To each following byte
is added the value 0x80.
.PP
By carefully looking at the encoded byte sequence, you can find that the
byte sequence corresponds to a unique number.  In that sense, \s-1EUC\s0 is a \s-1CCS\s0
generated by the \s-1CES\s0 above from up to four \s-1CCS\s0 (complicated?).  \s-1UTF\-8\s0
falls into this category.
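.PP
The second EUC rule above can be checked by hand with Encode.  This is a
minimal sketch, assuming the jis0208-raw and euc-jp encodings from
Encode::JP are installed; the variable names are illustrative only.  It
encodes one JIS X 0208 character both as a bare coded character set and
through the EUC scheme, then adds 0x80 to each raw byte for comparison.
.PP
```perl
use strict;
use warnings;
use Encode qw(encode);

# U+3000 IDEOGRAPHIC SPACE, a character from JIS X 0208.
my $char = "\x{3000}";

# Bytes in the bare coded character set (no CES applied).
my $raw = encode("jis0208-raw", $char);

# Bytes after the EUC character encoding scheme is applied.
my $euc = encode("euc-jp", $char);

# Apply the EUC rule by hand: add 0x80 to each byte of the raw form.
my $by_hand = join "", map { chr(ord($_) + 0x80) } split //, $raw;

printf "raw: %s\n", unpack("H*", $raw);
printf "euc: %s\n", unpack("H*", $euc);
print "hand-built bytes match euc-jp\n" if $by_hand eq $euc;
```
.PP
The same comparison should hold for any other JIS X 0208 character, since
the rule is applied byte by byte.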
See \*(L"\s-1UTF\-8\s0\*(R" in perlunicode to find out how
\&\s-1UTF\-8\s0 maps Unicode to a byte sequence.
.PP
You may also have found out by now why 7bit \s-1ISO\-2022\s0 cannot comprise
a \s-1CCS\s0.  If you look at a byte sequence \ex21\ex21, you can't tell if
it is two !'s or \s-1IDEOGRAPHIC\s0 \s-1SPACE\s0.  \s-1EUC\s0 maps the latter to \exA1\exA1
so you have no trouble differentiating between \*(L"!!\*(R" and \*(L"\ \ \*(R".
.SH "Encoding Classification (by Anton Tagunov and Dan Kogai)"
.IX Header "Encoding Classification (by Anton Tagunov and Dan Kogai)"
This section tries to classify the supported encodings by their
applicability for information exchange over the Internet and to
choose the most suitable aliases to name them in the context of
such communication.
.IP "\(bu" 2
To (en|de)code encodings marked by \f(CW\*(C`(**)\*(C'\fR, you need
\&\f(CW\*(C`Encode::HanExtra\*(C'\fR, available from \s-1CPAN\s0.
.PP
Encoding names
.PP
.Vb 3
\&  US\-ASCII    UTF\-8    ISO\-8859\-*  KOI8\-R
\&  Shift_JIS   EUC\-JP   ISO\-2022\-JP ISO\-2022\-JP\-1
\&  EUC\-KR      Big5     GB2312
.Ve
.PP
are registered with \s-1IANA\s0 as preferred \s-1MIME\s0 names and may
be used over the Internet.
.PP
\&\f(CW\*(C`Shift_JIS\*(C'\fR has been made official by \s-1JIS\s0 X 0208:1997.
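.PP
One way to see how the \s-1MIME\s0 names listed above relate to the canonical
names used in the tables of this document is to resolve them with
find_encoding from Encode.  This is a minimal sketch; the loop and the
%canonical hash are illustrative only, and it assumes the \s-1CJK\s0 modules
bundled with Encode are installed.
.PP
```perl
use strict;
use warnings;
use Encode qw(find_encoding);

# Map each MIME name to the canonical name Encode uses internally.
my %canonical;
for my $mime_name (qw(US-ASCII UTF-8 Shift_JIS EUC-JP ISO-2022-JP
                      EUC-KR Big5 GB2312)) {
    my $enc = find_encoding($mime_name);    # undef if the name is unknown
    $canonical{$mime_name} = defined $enc ? $enc->name : "(not found)";
    printf "%-12s => %s\n", $mime_name, $canonical{$mime_name};
}
```
.PP
According to the footnotes in the tables of this document, GB2312 should
resolve to euc-cn and Big5 to big5-eten.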
