
📄 charset.html

📁 A C development manual for UNIX, with detailed example routines.
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<html><head>
<!-- Copyright 1997 The Open Group, All Rights Reserved -->
<title>Character Set</title>
</head><body bgcolor=white>
<center><font size=2>The Single UNIX &reg; Specification, Version 2<br>Copyright &copy; 1997 The Open Group</font></center>
<hr size=2 noshade>
<center><h2><a name = "tag_001">&nbsp;</a>Character Set</h2></center>
<xref type="1" name="chars"></xref>
<p>
<h3><a name = "tag_001_001">&nbsp;</a>Portable Character Set</h3>
<xref type="2" name="charset"></xref>
Conforming implementations support one or more coded character sets. Each supported locale includes the <i>portable character set</i> specified in the following table.
<pre><table bordercolor=#000000 border=1><tr valign=top><th align=center><b>Symbolic Name</b><th align=center><b>Glyph</b><th align=center><b>Symbolic Name</b><th align=center><b>Glyph</b><th align=center><b>Symbolic Name</b><th align=center><b>Glyph</b><tr valign=top><td align=left>&nbsp;<td align=center>&nbsp;<td align=left>&nbsp;<td align=center>&nbsp;<td align=left>&lt;circumflex&gt;<td align=center>^<tr valign=top><td align=left>&lt;NUL&gt;<td align=center> &nbsp;<td align=left>&lt;colon&gt;<td align=center>:<td align=left>&lt;circumflex-accent&gt;<td align=center>^<tr valign=top><td align=left>&lt;alert&gt;<td align=center>&nbsp;<td align=left>&lt;semicolon&gt;<td align=center>;<td align=left>&lt;underscore&gt;<td align=center>_<tr valign=top><td align=left>&lt;backspace&gt;<td align=center>&nbsp;<td align=left>&lt;less-than-sign&gt;<td align=center>&lt;<td align=left>&lt;underline&gt;<td align=center>_<tr valign=top><td align=left>&lt;tab&gt;<td align=center>&nbsp; <td align=left>&lt;equals-sign&gt;<td align=center>=<td align=left>&lt;low-line&gt;<td align=center>_<tr valign=top><td align=left>&lt;newline&gt;<td align=center>&nbsp; <td align=left>&lt;greater-than-sign&gt;<td align=center>&gt;<td align=left>&lt;grave-accent&gt;<td align=center>`<tr valign=top><td 
align=left>&lt;vertical-tab&gt;<td align=center>&nbsp; <td align=left>&lt;question-mark&gt;<td align=center>?<td align=left>&lt;a&gt;<td align=center>a<tr valign=top><td align=left>&lt;form-feed&gt;<td align=center>&nbsp; <td align=left>&lt;commercial-at&gt;<td align=center>@<td align=left>&lt;b&gt;<td align=center>b<tr valign=top><td align=left>&lt;carriage-return&gt;<td align=center>&nbsp; <td align=left>&lt;A&gt;<td align=center>A<td align=left>&lt;c&gt;<td align=center>c<tr valign=top><td align=left>&lt;space&gt;<td align=center>&nbsp; <td align=left>&lt;B&gt;<td align=center>B<td align=left>&lt;d&gt;<td align=center>d<tr valign=top><td align=left>&lt;exclamation-mark&gt;<td align=center>!<td align=left>&lt;C&gt;<td align=center>C<td align=left>&lt;e&gt;<td align=center>e<tr valign=top><td align=left>&lt;quotation-mark&gt;<td align=center>"<td align=left>&lt;D&gt;<td align=center>D<td align=left>&lt;f&gt;<td align=center>f<tr valign=top><td align=left>&lt;number-sign&gt;<td align=center>#<td align=left>&lt;E&gt;<td align=center>E<td align=left>&lt;g&gt;<td align=center>g<tr valign=top><td align=left>&lt;dollar-sign&gt;<td align=center>$<td align=left>&lt;F&gt;<td align=center>F<td align=left>&lt;h&gt;<td align=center>h<tr valign=top><td align=left>&lt;percent-sign&gt;<td align=center>%<td align=left>&lt;G&gt;<td align=center>G<td align=left>&lt;i&gt;<td align=center>i<tr valign=top><td align=left>&lt;ampersand&gt;<td align=center>&amp;<td align=left>&lt;H&gt;<td align=center>H<td align=left>&lt;j&gt;<td align=center>j<tr valign=top><td align=left>&lt;apostrophe&gt;<td align=center>'<td align=left>&lt;I&gt;<td align=center>I<td align=left>&lt;k&gt;<td align=center>k<tr valign=top><td align=left>&lt;left-parenthesis&gt;<td align=center>(<td align=left>&lt;J&gt;<td align=center>J<td align=left>&lt;l&gt;<td align=center>l<tr valign=top><td align=left>&lt;right-parenthesis&gt;<td align=center>)<td align=left>&lt;K&gt;<td align=center>K<td align=left>&lt;m&gt;<td 
align=center>m<tr valign=top><td align=left>&lt;asterisk&gt;<td align=center>*<td align=left>&lt;L&gt;<td align=center>L<td align=left>&lt;n&gt;<td align=center>n<tr valign=top><td align=left>&lt;plus-sign&gt;<td align=center>+<td align=left>&lt;M&gt;<td align=center>M<td align=left>&lt;o&gt;<td align=center>o<tr valign=top><td align=left>&lt;comma&gt;<td align=center>,<td align=left>&lt;N&gt;<td align=center>N<td align=left>&lt;p&gt;<td align=center>p<tr valign=top><td align=left>&lt;hyphen&gt;<td align=center>-<td align=left>&lt;O&gt;<td align=center>O<td align=left>&lt;q&gt;<td align=center>q<tr valign=top><td align=left>&lt;hyphen-minus&gt;<td align=center>-<td align=left>&lt;P&gt;<td align=center>P<td align=left>&lt;r&gt;<td align=center>r<tr valign=top><td align=left>&lt;period&gt;<td align=center>.<td align=left>&lt;Q&gt;<td align=center>Q<td align=left>&lt;s&gt;<td align=center>s<tr valign=top><td align=left>&lt;full-stop&gt;<td align=center>.<td align=left>&lt;R&gt;<td align=center>R<td align=left>&lt;t&gt;<td align=center>t<tr valign=top><td align=left>&lt;slash&gt;<td align=center>/<td align=left>&lt;S&gt;<td align=center>S<td align=left>&lt;u&gt;<td align=center>u<tr valign=top><td align=left>&lt;solidus&gt;<td align=center>/<td align=left>&lt;T&gt;<td align=center>T<td align=left>&lt;v&gt;<td align=center>v<tr valign=top><td align=left>&lt;zero&gt;<td align=center>0<td align=left>&lt;U&gt;<td align=center>U<td align=left>&lt;w&gt;<td align=center>w<tr valign=top><td align=left>&lt;one&gt;<td align=center>1<td align=left>&lt;V&gt;<td align=center>V<td align=left>&lt;x&gt;<td align=center>x<tr valign=top><td align=left>&lt;two&gt;<td align=center>2<td align=left>&lt;W&gt;<td align=center>W<td align=left>&lt;y&gt;<td align=center>y<tr valign=top><td align=left>&lt;three&gt;<td align=center>3<td align=left>&lt;X&gt;<td align=center>X<td align=left>&lt;z&gt;<td align=center>z<tr valign=top><td align=left>&lt;four&gt;<td align=center>4<td 
align=left>&lt;Y&gt;<td align=center>Y<td align=left>&lt;left-brace&gt;<td align=center>{<tr valign=top><td align=left>&lt;five&gt;<td align=center>5<td align=left>&lt;Z&gt;<td align=center>Z<td align=left>&lt;left-curly-bracket&gt;<td align=center>{<tr valign=top><td align=left>&lt;six&gt;<td align=center>6<td align=left>&lt;left-square-bracket&gt;<td align=center>[<td align=left>&lt;vertical-line&gt;<td align=center>|<tr valign=top><td align=left>&lt;seven&gt;<td align=center>7<td align=left>&lt;backslash&gt;<td align=center>\<td align=left>&lt;right-brace&gt;<td align=center>}<tr valign=top><td align=left>&lt;eight&gt;<td align=center>8<td align=left>&lt;reverse-solidus&gt;<td align=center>\<td align=left>&lt;right-curly-bracket&gt;<td align=center>}<tr valign=top><td align=left>&lt;nine&gt;<td align=center>9<td align=left>&lt;right-square-bracket&gt;<td align=center>]<td align=left>&lt;tilde&gt;<td align=center>~</table><h6 align=center><xref table="Portable Character Set"><a name="tagt_1">&nbsp;</a></xref>Table: Portable Character Set</h6><xref type="7" name="portchar"></xref></pre>
<p>
<xref href=portchar><a href="#tagt_1">Portable Character Set</a></xref> defines the characters in the portable character set and the corresponding symbolic character names used to identify each character in a character set description file. The table contains more than one symbolic character name for characters whose traditional name differs from the chosen name.
<p>
This specification set places only the following requirements on the encoded values of the characters in the portable character set:
<ul>
<p>
<li>If the encoded values associated with each member of the portable character set are not invariant across all locales supported by the implementation, the results achieved by an application accessing those locales are unspecified.
<p>
<li>The encoded values associated with the digits 0 to 9 will be such that the value of each character after 0 will be one greater than the value of the previous 
character.
<p>
<li>A null character, NUL, which has all bits set to zero, will be in the set of characters.
<p>
<li>The encoded values associated with the members of the portable character set are each represented in a single byte. Moreover, if the value is stored in an object of C-language type <b>char</b>, it is guaranteed to be positive (except the NUL, which is always zero).
<p>
</ul>
<h3><a name = "tag_001_002">&nbsp;</a>Character Encoding</h3>
<xref type="2" name="char_enc"></xref>
The POSIX locale contains the characters in <xref href=portchar><a href="#tagt_1">Portable Character Set</a></xref>, which have the properties listed in <xref href=lc_ctype><a href="locale.html#tag_005_003_001">LC_CTYPE</a></xref>. Implementations may also add other characters. In other locales, the presence, meaning and representation of any additional characters is locale-specific.
<p>
In locales other than the POSIX locale, a character may have a state-dependent encoding. There are two types of these encodings:
<ul>
<p>
<li>A single-shift encoding (where each character not in the initial shift state is preceded by a shift code) can be defined if each shift-code and character sequence is considered a multi-byte character. This is done using the concatenated-constant format in a character set description file, as described in <xref href=charmap><a href="#tag_001_004">Character Set Description File</a></xref>. If the implementation supports a character encoding of this type, all of the standard utilities in the <b>XCU</b> specification will support it. Use of a single-shift encoding with any of the functions in the <b>XSH</b> specification that do not specifically mention the effects of state-dependent encoding is implementation-dependent.
<p>
<li>A locking-shift encoding (where the state of the character is determined by a shift code that may affect more than the single character following it) cannot be defined with the current character set description file format. Use of a locking-shift encoding with any of the standard 
utilities in the <b>XCU</b> specification or with any of the functions in the <b>XSH</b> specification that do not specifically mention the effects of state-dependent encoding is implementation-dependent.
<p>
</ul>
<p>
While in the initial shift state, all characters in the portable character set retain their usual interpretation and do not alter the shift state. The interpretation for subsequent bytes in the sequence is a function of the current shift state. A byte with all bits zero is interpreted as the null character independent of shift state. Thus a byte with all bits zero must never occur in the second or subsequent bytes of a character.
<p>
The maximum allowable number of bytes in a character in the current locale is indicated by MB_CUR_MAX, defined in the <b>XSH</b> specification <i><a href="../xsh/stdlib.h.html">&lt;stdlib.h&gt;</a></i>, and by the <b>&lt;mb_cur_max&gt;</b> value in a character set description file; see <xref href=charmap><a href="#tag_001_004">Character Set Description File</a></xref>. The implementation's maximum number of bytes in a character is defined by the C-language macro {MB_LEN_MAX}.
<h3><a name = "tag_001_003">&nbsp;</a>C Language Wide-character Codes</h3>
<xref type="2" name="widechar"></xref>
In the shell, the standard utilities are written so that the encodings of characters are described by the locale's LC_CTYPE definition (see <xref href=lc_ctype><a href="locale.html#tag_005_003_001">LC_CTYPE</a></xref>) and there is no differentiation between characters consisting of single octets (8-bit bytes), larger bytes, or multiple bytes. However, in the C language, a differentiation is made. To ease the handling of variable length characters, the C language has introduced the concept of wide character codes.
<p>
All wide-character codes in a given process consist of an equal number of bits. This is in contrast to characters, which can consist of a variable number of bytes. The byte or byte sequence that represents a character can also be represented as a wide-character 
code. Wide-character codes thus provide a uniform size for manipulating text data. A wide-character code having all bits zero is the null wide-character code (see <xref href=nullwidechar><a href="glossary.html#tag_004_000_179">null wide-character code</a></xref>), and terminates wide-character strings (see <xref href=widechar><a href="#tag_001_003">
