<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"><html> <head> <title>Summaries of supported encodings</title> <meta http-equiv="content-type" content="text/html; charset=UTF-8"> </head> <body><div style="text-align: center;"> <div class="prev" style="text-align: left; float: left;"><a href="mbstring.constants.html">Predefined Constants</a></div> <div class="next" style="text-align: right; float: right;"><a href="mbstring.ja-basic.html">Basics of Japanese multi-byte encodings</a></div> <div class="up"><a href="book.mbstring.html">Multibyte String</a></div> <div class="home"><a href="index.html">PHP Manual</a></div></div><hr /><div> <h1>Summaries of supported encodings</h1> <div class="segmentedlist"> <strong class="title">Summaries of supported encodings</strong> <div class="seglistitem"> <div class="seg"><strong><span class="segtitle">Name in the IANA character set registry:</span></strong>ISO-10646-UCS-4</div> <div class="seg"><strong><span class="segtitle">Underlying character set:</span></strong>ISO 10646</div> <div class="seg"><strong><span class="segtitle">Description:</span></strong> The Universal Character Set with a 31-bit code space, standardized as UCS-4 by ISO/IEC 10646. It is kept synchronized with the latest version of the Unicode code map. </div> <div class="seg"><strong><span class="segtitle">Additional note:</span></strong> If this name is used in the encoding conversion facility, the converter attempts to identify the byte order of the subsequent bytes from the preceding BOM (byte order mark). </div> </div> <div class="seglistitem"> <div class="seg"><strong><span class="segtitle">Name in the IANA character set registry:</span></strong>ISO-10646-UCS-4</div> <div class="seg"><strong><span class="segtitle">Underlying character set:</span></strong>UCS-4</div> <div class="seg"><strong><span class="segtitle">Description:</span></strong> See above. 
</div> <div class="seg"><strong><span class="segtitle">Additional note:</span></strong> In contrast to <i>UCS-4</i>, strings are always assumed to be in big endian form. </div> </div> <div class="seglistitem"> <div class="seg"><strong><span class="segtitle">Name in the IANA character set registry:</span></strong>ISO-10646-UCS-4</div> <div class="seg"><strong><span class="segtitle">Underlying character set:</span></strong>UCS-4</div> <div class="seg"><strong><span class="segtitle">Description:</span></strong> See above. </div> <div class="seg"><strong><span class="segtitle">Additional note:</span></strong> In contrast to <i>UCS-4</i>, strings are always assumed to be in little endian form. </div> </div> <div class="seglistitem"> <div class="seg"><strong><span class="segtitle">Name in the IANA character set registry:</span></strong>ISO-10646-UCS-2</div> <div class="seg"><strong><span class="segtitle">Underlying character set:</span></strong>UCS-2</div> <div class="seg"><strong><span class="segtitle">Description:</span></strong> The Universal Character Set with a 16-bit code space, standardized as UCS-2 by ISO/IEC 10646. It is kept synchronized with the latest version of the Unicode code map. </div> <div class="seg"><strong><span class="segtitle">Additional note:</span></strong> If this name is used in the encoding conversion facility, the converter attempts to identify the byte order of the subsequent bytes from the preceding BOM (byte order mark). </div> </div> <div class="seglistitem"> <div class="seg"><strong><span class="segtitle">Name in the IANA character set registry:</span></strong>ISO-10646-UCS-2</div> <div class="seg"><strong><span class="segtitle">Underlying character set:</span></strong>UCS-2</div> <div class="seg"><strong><span class="segtitle">Description:</span></strong> See above. 
</div> <div class="seg"><strong><span class="segtitle">Additional note:</span></strong> In contrast to <i>UCS-2</i>, strings are always assumed to be in big endian form. </div> </div> <div class="seglistitem"> <div class="seg"><strong><span class="segtitle">Name in the IANA character set registry:</span></strong>ISO-10646-UCS-2</div> <div class="seg"><strong><span class="segtitle">Underlying character set:</span></strong>UCS-2</div> <div class="seg"><strong><span class="segtitle">Description:</span></strong> See above. </div> <div class="seg"><strong><span class="segtitle">Additional note:</span></strong> In contrast to <i>UCS-2</i>, strings are always assumed to be in little endian form. </div> </div> <div class="seglistitem"> <div class="seg"><strong><span class="segtitle">Name in the IANA character set registry:</span></strong>UTF-32</div> <div class="seg"><strong><span class="segtitle">Underlying character set:</span></strong>Unicode</div> <div class="seg"><strong><span class="segtitle">Description:</span></strong> Unicode Transformation Format with a 32-bit unit width, whose encoding space is the Unicode code space. This encoding scheme is not identical to UCS-4 because the Unicode code space is limited to 21 bits. </div> <div class="seg"><strong><span class="segtitle">Additional note:</span></strong> If this name is used in the encoding conversion facility, the converter attempts to identify the byte order of the subsequent bytes from the preceding BOM (byte order mark). 
</div> </div> <div class="seglistitem"> <div class="seg"><strong><span class="segtitle">Name in the IANA character set registry:</span></strong>UTF-32BE</div> <div class="seg"><strong><span class="segtitle">Underlying character set:</span></strong>Unicode</div> <div class="seg"><strong><span class="segtitle">Description:</span></strong>See above</div> <div class="seg"><strong><span class="segtitle">Additional note:</span></strong> In contrast to <i>UTF-32</i>, strings are always assumed to be in big endian form. </div> </div> <div class="seglistitem"> <div class="seg"><strong><span class="segtitle">Name in the IANA character set registry:</span></strong>UTF-32LE</div> <div class="seg"><strong><span class="segtitle">Underlying character set:</span></strong>Unicode</div> <div class="seg"><strong><span class="segtitle">Description:</span></strong>See above</div> <div class="seg"><strong><span class="segtitle">Additional note:</span></strong> In contrast to <i>UTF-32</i>, strings are always assumed to be in little endian form. </div> </div> <div class="seglistitem"> <div class="seg"><strong><span class="segtitle">Name in the IANA character set registry:</span></strong>UTF-16</div> <div class="seg"><strong><span class="segtitle">Underlying character set:</span></strong>Unicode</div> <div class="seg"><strong><span class="segtitle">Description:</span></strong> Unicode Transformation Format with a 16-bit unit width. Note that UTF-16 is no longer the same specification as UCS-2: the surrogate mechanism was introduced in Unicode 2.0, and UTF-16 now refers to a 21-bit code space. </div> <div class="seg"><strong><span class="segtitle">Additional note:</span></strong> If this name is used in the encoding conversion facility, the converter attempts to identify the byte order of the subsequent bytes from the preceding BOM (byte order mark). 
</div> </div> <div class="seglistitem"> <div class="seg"><strong><span class="segtitle">Name in the IANA character set registry:</span></strong>UTF-16BE</div> <div class="seg"><strong><span class="segtitle">Underlying character set:</span></strong>Unicode</div> <div class="seg"><strong><span class="segtitle">Description:</span></strong> See above. </div> <div class="seg"><strong><span class="segtitle">Additional note:</span></strong> In contrast to <i>UTF-16</i>, strings are always assumed to be in big endian form. </div> </div> <div class="seglistitem"> <div class="seg"><strong><span class="segtitle">Name in the IANA character set registry:</span></strong>UTF-16LE</div> <div class="seg"><strong><span class="segtitle">Underlying character set:</span></strong>Unicode</div> <div class="seg"><strong><span class="segtitle">Description:</span></strong> See above. </div> <div class="seg"><strong><span class="segtitle">Additional note:</span></strong> In contrast to <i>UTF-16</i>, strings are always assumed