Source: ISO See (http://www.iana.org/assignments/charset-reg/iso-8859-14) [Simonsen]
Alias: iso-ir-199
Alias: ISO_8859-14:1998
Alias: ISO_8859-14
Alias: latin8
Alias: iso-celtic
Alias: l8
Name: ISO-8859-15
MIBenum: 111
Source: ISO
Please see: <http://www.iana.org/assignments/charset-reg/ISO-8859-15>
Alias: ISO_8859-15
Alias: Latin-9
Name: ISO-8859-16
MIBenum: 112
Source: ISO
Alias: iso-ir-226
Alias: ISO_8859-16:2001
Alias: ISO_8859-16
Alias: latin10
Alias: l10
Name: GBK
MIBenum: 113
Source: Chinese IT Standardization Technical Committee
Please see: <http://www.iana.org/assignments/charset-reg/GBK>
Alias: CP936
Alias: MS936
Alias: windows-936
Name: GB18030
MIBenum: 114
Source: Chinese IT Standardization Technical Committee
Please see: <http://www.iana.org/assignments/charset-reg/GB18030>
Alias: None
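
The records in this listing follow a simple field layout (Name, MIBenum, Source, Alias), so they lend themselves to a small lookup table. The following is a minimal sketch, not part of the registry: the helper names parse_records and build_lookup are hypothetical, the sample text is copied from the two entries above, and case-insensitive matching of names is an assumption made here for convenience.

```python
import re

def parse_records(text):
    """Parse Name:/MIBenum:/Alias: lines into a list of dicts (hypothetical helper)."""
    records = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Name:"):
            # Drop trailing annotations such as "(preferred MIME name)" or "[HP-PCL5]".
            name = re.sub(r"\s*[\(\[].*$", "", line[len("Name:"):].strip())
            current = {"name": name, "mibenum": None, "aliases": []}
            records.append(current)
        elif current is None:
            continue  # ignore anything before the first Name: field
        elif line.startswith("MIBenum:"):
            current["mibenum"] = int(line[len("MIBenum:"):].strip())
        elif line.startswith("Alias:"):
            alias = re.sub(r"\s*\(.*$", "", line[len("Alias:"):].strip())
            if alias and alias != "None":  # "Alias: None" means no aliases
                current["aliases"].append(alias)
    return records

def build_lookup(records):
    """Map every lower-cased name and alias to its record (assumed matching rule)."""
    lookup = {}
    for rec in records:
        for label in [rec["name"], *rec["aliases"]]:
            lookup[label.lower()] = rec
    return lookup

SAMPLE = """\
Name: GBK
MIBenum: 113
Alias: CP936
Alias: MS936
Alias: windows-936
Name: GB18030
MIBenum: 114
Alias: None
"""

lookup = build_lookup(parse_records(SAMPLE))
print(lookup["cp936"]["name"], lookup["cp936"]["mibenum"])
```

Source and "Please see" lines are skipped on purpose; a fuller parser would also need to handle their continuation lines.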
Name: JIS_Encoding
MIBenum: 16
Source: JIS X 0202-1991. Uses ISO 2022 escape sequences to
shift code sets as documented in JIS X 0202-1991.
Alias: csJISEncoding
Name: Shift_JIS (preferred MIME name)
MIBenum: 17
Source: This charset is an extension of csHalfWidthKatakana by
adding graphic characters in JIS X 0208. The CCS's are
JIS X0201:1997 and JIS X0208:1997. The
complete definition is shown in Appendix 1 of JIS
X0208:1997.
This charset can be used for the top-level media type "text".
Alias: MS_Kanji
Alias: csShiftJIS
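
As an illustration of the description above, the sketch below uses Python's built-in shift_jis codec (assumed here to correspond to this entry) to show that a half-width katakana character occupies one byte while a JIS X 0208 character occupies two. The variable names are illustrative only.

```python
# Half-width katakana come from the single-byte set; JIS X 0208 adds two-byte characters.
half_width = "\uff71".encode("shift_jis")  # HALFWIDTH KATAKANA LETTER A
full_width = "\u3042".encode("shift_jis")  # HIRAGANA LETTER A

print(len(half_width), len(full_width))
```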
Name: Extended_UNIX_Code_Packed_Format_for_Japanese
MIBenum: 18
Source: Standardized by OSF, UNIX International, and UNIX Systems
Laboratories Pacific. Uses ISO 2022 rules to select
code set 0: US-ASCII (a single 7-bit byte set)
code set 1: JIS X0208-1990 (a double 8-bit byte set)
restricted to A0-FF in both bytes
code set 2: Half Width Katakana (a single 7-bit byte set)
requiring SS2 as the character prefix
code set 3: JIS X0212-1990 (a double 7-bit byte set)
restricted to A0-FF in both bytes
requiring SS3 as the character prefix
Alias: csEUCPkdFmtJapanese
Alias: EUC-JP (preferred MIME name)
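
The code set rules above can be sketched as a small classifier that walks a byte string and reports which code set each character belongs to. This is a rough illustration under stated assumptions, not a validating decoder: the function names are hypothetical, SS2 and SS3 are taken to be the single-shift bytes 8E and 8F, the A0-FF range is used exactly as written in the entry, and the byte following SS2 is assumed to have its high bit set (the entry does not say so explicitly).

```python
SS2 = 0x8E  # single shift 2: prefix for code set 2 (assumed value)
SS3 = 0x8F  # single shift 3: prefix for code set 3 (assumed value)

def in_high_range(byte):
    # Range as written in the registry entry: A0-FF.
    return 0xA0 <= byte <= 0xFF

def next_char(data, i):
    """Return (code_set, length) for the character starting at data[i]."""
    lead = data[i]
    if lead < 0x80:
        return 0, 1  # code set 0: US-ASCII, one byte
    if lead == SS2:
        if i + 1 < len(data) and in_high_range(data[i + 1]):
            return 2, 2  # SS2 followed by one katakana byte
    elif lead == SS3:
        if (i + 2 < len(data) and in_high_range(data[i + 1])
                and in_high_range(data[i + 2])):
            return 3, 3  # SS3 followed by two JIS X0212 bytes
    elif in_high_range(lead):
        if i + 1 < len(data) and in_high_range(data[i + 1]):
            return 1, 2  # two JIS X0208 bytes
    raise ValueError(f"invalid sequence at offset {i}")

def code_sets(data):
    """List the code set of every character in an EUC-JP byte string."""
    result = []
    i = 0
    while i < len(data):
        code_set, length = next_char(data, i)
        result.append(code_set)
        i += length
    return result

print(code_sets(b"A\xa4\xa2\x8e\xb1\x8f\xb0\xa1"))
```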
Name: Extended_UNIX_Code_Fixed_Width_for_Japanese
MIBenum: 19
Source: Used in Japan. Each character is 2 octets.
code set 0: US-ASCII (a single 7-bit byte set)
1st byte = 00
2nd byte = 20-7E
code set 1: JIS X0208-1990 (a double 7-bit byte set)
restricted to A0-FF in both bytes
code set 2: Half Width Katakana (a single 7-bit byte set)
1st byte = 00
2nd byte = A0-FF
code set 3: JIS X0212-1990 (a double 7-bit byte set)
restricted to A0-FF in the first byte
and 21-7E in the second byte
Alias: csEUCFixWidJapanese
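
Because every character in this fixed-width form is two octets, the code set can be read directly from the byte pair. The sketch below applies the ranges exactly as listed in the entry above; the function name fixed_width_code_set is hypothetical.

```python
def fixed_width_code_set(b1, b2):
    """Return the code set (0-3) of a two-octet character, per the ranges listed above."""
    if b1 == 0x00 and 0x20 <= b2 <= 0x7E:
        return 0  # US-ASCII
    if b1 == 0x00 and 0xA0 <= b2 <= 0xFF:
        return 2  # Half Width Katakana
    if 0xA0 <= b1 <= 0xFF and 0xA0 <= b2 <= 0xFF:
        return 1  # JIS X0208-1990
    if 0xA0 <= b1 <= 0xFF and 0x21 <= b2 <= 0x7E:
        return 3  # JIS X0212-1990
    raise ValueError(f"byte pair {b1:02X} {b2:02X} is outside the listed ranges")

print(fixed_width_code_set(0x00, 0x41))
```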
Name: ISO-10646-UCS-Basic
MIBenum: 1002
Source: ASCII subset of Unicode. Basic Latin = collection 1
See ISO 10646, Appendix A
Alias: csUnicodeASCII
Name: ISO-10646-Unicode-Latin1
MIBenum: 1003
Source: ISO Latin-1 subset of Unicode. Basic Latin and Latin-1
Supplement = collections 1 and 2. See ISO 10646,
Appendix A. See RFC 1815.
Alias: csUnicodeLatin1
Alias: ISO-10646
Name: ISO-10646-J-1
MIBenum: 1004
Source: ISO 10646 Japanese, see RFC 1815.
Name: ISO-Unicode-IBM-1261
MIBenum: 1005
Source: IBM Latin-2, -3, -5, Extended Presentation Set, GCSGID: 1261
Alias: csUnicodeIBM1261
Name: ISO-Unicode-IBM-1268
MIBenum: 1006
Source: IBM Latin-4 Extended Presentation Set, GCSGID: 1268
Alias: csUnicodeIBM1268
Name: ISO-Unicode-IBM-1276
MIBenum: 1007
Source: IBM Cyrillic Greek Extended Presentation Set, GCSGID: 1276
Alias: csUnicodeIBM1276
Name: ISO-Unicode-IBM-1264
MIBenum: 1008
Source: IBM Arabic Presentation Set, GCSGID: 1264
Alias: csUnicodeIBM1264
Name: ISO-Unicode-IBM-1265
MIBenum: 1009
Source: IBM Hebrew Presentation Set, GCSGID: 1265
Alias: csUnicodeIBM1265
Name: ISO-8859-1-Windows-3.0-Latin-1 [HP-PCL5]
MIBenum: 2000
Source: Extended ISO 8859-1 Latin-1 for Windows 3.0.
PCL Symbol Set id: 9U
Alias: csWindows30Latin1
Name: ISO-8859-1-Windows-3.1-Latin-1 [HP-PCL5]
MIBenum: 2001
Source: Extended ISO 8859-1 Latin-1 for Windows 3.1.
PCL Symbol Set id: 19U
Alias: csWindows31Latin1
Name: ISO-8859-2-Windows-Latin-2 [HP-PCL5]
MIBenum: 2002
Source: Extended ISO 8859-2. Latin-2 for Windows 3.1.
PCL Symbol Set id: 9E
Alias: csWindows31Latin2
Name: ISO-8859-9-Windows-Latin-5 [HP-PCL5]
MIBenum: 2003
Source: Extended ISO 8859-9. Latin-5 for Windows 3.1
PCL Symbol Set id: 5T
Alias: csWindows31Latin5
Name: Adobe-Standard-Encoding [Adobe]
MIBenum: 2005
Source: PostScript Language Reference Manual
PCL Symbol Set id: 10J
Alias: csAdobeStandardEncoding
Name: Ventura-US [HP-PCL5]
MIBenum: 2006
Source: Ventura US. ASCII plus characters typically used in
publishing, like pilcrow, copyright, registered, trade mark,
section, dagger, and double dagger in the range A0 (hex)
to FF (hex).
PCL Symbol Set id: 14J
Alias: csVenturaUS
Name: Ventura-International [HP-PCL5]
MIBenum: 2007
Source: Ventura International. ASCII plus coded characters similar
to Roman8.
PCL Symbol Set id: 13J
Alias: csVenturaInternational
Name: PC8-Danish-Norwegian [HP-PCL5]
MIBenum: 2012
Source: PC Danish Norwegian
8-bit PC set for Danish Norwegian
PCL Symbol Set id: 11U
Alias: csPC8DanishNorwegian
Name: PC8-Turkish [HP-PCL5]
MIBenum: 2014
Source: PC Latin Turkish. PCL Symbol Set id: 9T
Alias: csPC8Turkish
Name: IBM-Symbols [IBM-CIDT]
MIBenum: 2015
Source: Presentation Set, CPGID: 259
Alias: csIBMSymbols
Name: IBM-Thai [IBM-CIDT]
MIBenum: 2016
Source: Presentation Set, CPGID: 838
Alias: csIBMThai
Name: HP-Legal [HP-PCL5]
MIBenum: 2017
Source: PCL 5 Comparison Guide, Hewlett-Packard,
HP part number 5961-0510, October 1992
PCL Symbol Set id: 1U
Alias: csHPLegal
Name: HP-Pi-font [HP-PCL5]
MIBenum: 2018
Source: PCL 5 Comparison Guide, Hewlett-Packard,
HP part number 5961-0510, October 1992
PCL Symbol Set id: 15U
Alias: csHPPiFont
Name: HP-Math8 [HP-PCL5]
MIBenum: 2019
Source: PCL 5 Comparison Guide, Hewlett-Packard,
HP part number 5961-0510, October 1992
PCL Symbol Set id: 8M
Alias: csHPMath8
Name: Adobe-Symbol-Encoding [Adobe]
MIBenum: 2020
Source: PostScript Language Reference Manual
PCL Symbol Set id: 5M
Alias: csHPPSMath
Name: HP-DeskTop [HP-PCL5]
MIBenum: 2021
Source: PCL 5 Comparison Guide, Hewlett-Packard,
HP part number 5961-0510, October 1992
PCL Symbol Set id: 7J
Alias: csHPDesktop
Name: Ventura-Math [HP-PCL5]
MIBenum: 2022
Source: PCL 5 Comparison Guide, Hewlett-Packard,
HP part number 5961-0510, October 1992
PCL Symbol Set id: 6M
Alias: csVenturaMath
Name: Microsoft-Publishing [HP-PCL5]
MIBenum: 2023
Source: PCL 5 Comparison Guide, Hewlett-Packard,
HP part number 5961-0510, October 1992
PCL Symbol Set id: 6J
Alias: csMicrosoftPublishing
Name: Windows-31J
MIBenum: 2024
Source: Windows Japanese. A further extension of Shift_JIS
to include NEC special characters (Row 13), NEC
selection of IBM extensions (Rows 89 to 92), and IBM
extensions (Rows 115 to 119). The CCS's are
JIS X0201:1997, JIS X0208:1997, and these extensions.
This charset can be used for the top-level media type "text",
but it is of limited or specialized use (see RFC2278).
PCL Symbol Set id: 19K
Alias: csWindows31J
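
The practical effect of these extensions can be seen with Python's built-in codecs, assuming that its cp932 codec corresponds to Windows-31J and its shift_jis codec to plain Shift_JIS. The sketch encodes CIRCLED DIGIT ONE, taken here as an example of an NEC special character from Row 13; the variable names are illustrative.

```python
text = "\u2460"  # CIRCLED DIGIT ONE, assumed to be one of the NEC special characters

encoded = text.encode("cp932")  # Windows-31J accepts the character

try:
    text.encode("shift_jis")    # plain Shift_JIS is expected to reject it
    plain_ok = True
except UnicodeEncodeError:
    plain_ok = False

print(encoded.hex(), plain_ok)
```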
Name: GB2312 (preferred MIME name)
MIBenum: 2025
Source: Chinese for People's Republic of China (PRC) mixed one byte,
two byte set:
20-7E = one byte ASCII
A1-FE = two byte PRC Kanji
See GB 2312-80
PCL Symbol Set Id: 18C
Alias: csGB2312
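
The mixed one-byte, two-byte layout above can be sketched as a splitter that cuts a byte string into character units. This is an illustration only: the function name split_gb2312 is hypothetical, and it accepts just the two ranges listed in the entry (20-7E and A1-FE), so control characters such as newlines are rejected.

```python
def split_gb2312(data):
    """Split bytes into one-byte (20-7E) and two-byte (A1-FE, A1-FE) units."""
    units = []
    i = 0
    while i < len(data):
        byte = data[i]
        if 0x20 <= byte <= 0x7E:
            units.append(data[i:i + 1])  # one-byte ASCII
            i += 1
        elif (0xA1 <= byte <= 0xFE and i + 1 < len(data)
                and 0xA1 <= data[i + 1] <= 0xFE):
            units.append(data[i:i + 2])  # two-byte character
            i += 2
        else:
            raise ValueError(f"byte {byte:02X} at offset {i} is outside the listed ranges")
    return units

print(split_gb2312(b"A\xd6\xd0B"))
```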
Name: Big5 (preferred MIME name)
MIBenum: 2026
Source: Chinese for Taiwan Multi-byte set.
PCL Symbol Set Id: 18T
Alias: csBig5
Name: windows-1250
MIBenum: 2250
Source: Microsoft (http://www.iana.org/assignments/charset-reg/windows-1250) [Lazhintseva]
Alias: None
Name: windows-1251
MIBenum: 2251
Source: Microsoft (http://www.iana.org/assignments/charset-reg/windows-1251) [Lazhintseva]
Alias: None
Name: windows-1252
MIBenum: 2252
Source: Microsoft (http://www.iana.org/assignments/charset-reg/windows-1252) [Wendt]
Alias: None
Name: windows-1253
MIBenum: 2253
Source: Microsoft (http://www.iana.org/assignments/charset-reg/windows-1253) [Lazhintseva]
Alias: None
Name: windows-1254
MIBenum: 2254
Source: Microsoft (http://www.iana.org/assignments/charset-reg/windows-1254) [Lazhintseva]
Alias: None
Name: windows-1255
MIBenum: 2255
Source: Microsoft (http://www.iana.org/assignments/charset-reg/windows-1255) [Lazhintseva]
Alias: None
Name: windows-1256
MIBenum: 2256
Source: Microsoft (http://www.iana.org/assignments/charset-reg/windows-1256) [Lazhintseva]
Alias: None
Name: windows-1257
MIBenum: 2257
Source: Microsoft (http://www.iana.org/assignments/charset-reg/windows-1257) [Lazhintseva]
Alias: None
Name: windows-1258
MIBenum: 2258
Source: Microsoft (http://www.iana.org/assignments/charset-reg/windows-1258) [Lazhintseva]
Alias: None
Name: TIS-620
MIBenum: 2259
Source: Thai Industrial Standards Institute (TISI) [Tantsetthi]
Name: HZ-GB-2312
MIBenum: 2085
Source: RFC 1842, RFC 1843 [RFC1842, RFC1843]
REFERENCES
----------
[RFC1345] Simonsen, K., "Character Mnemonics & Character Sets",
RFC 1345, Rationel Almen Planlaegning, June 1992.
[RFC1428] Vaudreuil, G., "Transition of Internet Mail from
Just-Send-8 to 8bit-SMTP/MIME", RFC 1428, CNRI, February
1993.
[RFC1456] Vietnamese Standardization Working Group, "Conventions for
Encoding the Vietnamese Language VISCII: VIetnamese
Standard Code for Information Interchange VIQR: VIetnamese
Quoted-Readable Specification Revision 1.1", RFC 1456, May
1993.
[RFC1468] Murai, J., Crispin, M., and E. van der Poel, "Japanese
Character Encoding for Internet Messages", RFC 1468,
Keio University, Panda Programming, June 1993.
[RFC1489] Chernov, A., "Registration of a Cyrillic Character Set",
RFC 1489, RELCOM Development Team, July 1993.
[RFC1554] Ohta, M., and K. Handa, "ISO-2022-JP-2: Multilingual
Extension of ISO-2022-JP", RFC 1554, Tokyo Institute of
Technology, ETL, December 1993.
[RFC1556] Nussbacher, H., "Handling of Bi-directional Texts in MIME",
RFC 1556, Israeli Inter-University, December 1993.
[RFC1557] Choi, U., Chon, K., and H. Park, "Korean Character Encoding
for Internet Messages", RFC 1557, KAIST, Solvit Chosun Media,
December 1993.
[RFC1641] Goldsmith, D., and M. Davis, "Using Unicode with MIME",
RFC 1641, Taligent, Inc., July 1994.
[RFC1642] Goldsmith, D., and M. Davis, "UTF-7", RFC 1642, Taligent,
Inc., July 1994.
[RFC1815] Ohta, M., "Character Sets ISO-10646 and ISO-10646-J-1",
RFC 1815, Tokyo Institute of Technology, July 1995.
[Adobe] Adobe Systems Incorporated, PostScript Language Reference
Manual, second edition, Addison-Wesley Publishing Company,
Inc., 1990.
[ECMA Registry] ISO-IR: International Register of Escape Sequences
http://www.itscj.ipsj.or.jp/ISO-IR/ Note: The current
registration authority is IPSJ/ITSCJ, Japan.
[HP-PCL5] Hewlett-Packard Company, "HP PCL 5 Comparison Guide",
(P/N 5021-0329) pp B-13, 1996.
[IBM-CIDT] IBM Corporation, "ABOUT TYPE: IBM's Technical Reference
for Core Interchange Digitized Type", Publication number
S544-3708-01
[RFC1842] Wei, Y., J. Li, and Y. Jiang, "ASCII Printable
Characters-Based Chinese Character Encoding for Internet
Messages", RFC 1842, Harvard University, Rice University,
University of Maryland, August 1995.
[RFC1843] Lee, F., "HZ - A Data Format for Exchanging Files of
Arbitrarily Mixed Chinese and ASCII Characters", RFC 1843,
Stanford University, August 1995.
[RFC2152] Goldsmith, D., and M. Davis, "UTF-7: A Mail-Safe Transformation
Format of Unicode", RFC 2152, Apple Computer, Inc.,
Taligent Inc., May 1997.
[RFC2279] Yergeau, F., "UTF-8, A Transformation Format of ISO 10646",
RFC 2279, Alis Technologies, January 1998.
[RFC2781] Hoffman, P., and F. Yergeau, "UTF-16, an encoding of ISO 10646",
RFC 2781, February 2000.
PEOPLE
------
[KXS2] Keld Simonsen <Keld.Simonsen@dkuug.dk>
[Choi] Woohyong Choi <whchoi@cosmos.kaist.ac.kr>
[Davis] Mark Davis, <mark@unicode.org>, April 2002.
[Lazhintseva] Katya Lazhintseva, <katyal@MICROSOFT.com>, May 1996.
[Mahdi] Tamer Mahdi, <tamer@ca.ibm.com>, August 2000.
[Murai] Jun Murai <jun@wide.ad.jp>
[Nussbacher] Hank Nussbacher, <hank@vm.tau.ac.il>
[Ohta] Masataka Ohta, <mohta@cc.titech.ac.jp>, July 1995.
[Phipps] Toby Phipps, <tphipps@peoplesoft.com>, March 2002.
[Pond] Rick Pond, <rickpond@vnet.ibm.com>, March 1997.
[Robrigado] Reuel Robrigado, <reuelr@ca.ibm.com>, September 2002.
[Scherer] Markus Scherer, <markus.scherer@jtcsv.com>, August 2000,
September 2002.
[Simonsen] Keld Simonsen, <Keld.Simonsen@rap.dk>, August 2000.
[Tantsetthi] Trin Tantsetthi, <trin@mozart.inet.co.th>, September 1998.
[Tumasonis] Vladas Tumasonis, <vladas.tumasonis@maf.vu.lt>, August 2000.
[Uskov] Alexander Uskov, <auskov@idc.kz>, September 2002.
[Wendt] Chris Wendt, <christw@microsoft.com>, December 1999.
[Yick] Nicky Yick, <cliac@itsd.gcn.gov.hk>, October 2000.