charset.sgml

From "PostgreSQL7.4.6 for Linux" · SGML code · 895 lines in total · page 1 of 2
<screen>
initdb -E EUC_JP
</screen>
     sets the default character set (encoding) to
     <literal>EUC_JP</literal> (Extended Unix Code for Japanese).  You
     can use <option>--encoding</option> instead of
     <option>-E</option> if you prefer to type longer option strings.
     If no <option>-E</> or <option>--encoding</option> option is
     given, <literal>SQL_ASCII</> is used.
    </para>

    <para>
     You can create a database with a different character set:
<screen>
createdb -E EUC_KR korean
</screen>
     This will create a database named <literal>korean</literal> that
     uses the character set <literal>EUC_KR</literal>.  Another way to
     accomplish this is to use this SQL command:
<programlisting>
CREATE DATABASE korean WITH ENCODING 'EUC_KR';
</programlisting>
     The encoding for a database is stored in the system catalog
     <literal>pg_database</literal>.  You can see that by using the
     <option>-l</option> option or the <command>\l</command> command
     of <command>psql</command>.
<screen>
$ <userinput>psql -l</userinput>
            List of databases
   Database    |  Owner  |   Encoding
---------------+---------+---------------
 euc_cn        | t-ishii | EUC_CN
 euc_jp        | t-ishii | EUC_JP
 euc_kr        | t-ishii | EUC_KR
 euc_tw        | t-ishii | EUC_TW
 mule_internal | t-ishii | MULE_INTERNAL
 regression    | t-ishii | SQL_ASCII
 template1     | t-ishii | EUC_JP
 test          | t-ishii | EUC_JP
 unicode       | t-ishii | UNICODE
(9 rows)
</screen>
    </para>
   </sect2>

   <sect2>
    <title>Automatic Character Set Conversion Between Server and Client</title>

    <para>
     <productname>PostgreSQL</productname> supports automatic
     character set conversion between server and client for certain
     character sets. The conversion information is stored in the
     <literal>pg_conversion</> system catalog. You can create a new
     conversion by using the SQL command <command>CREATE
     CONVERSION</command>.
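     As a minimal sketch of the command's general form (the
     conversion name <literal>myconv</literal> and the conversion
     function <literal>myfunc</literal> are hypothetical placeholders,
     and the function is assumed to exist already):
<programlisting>
-- myconv and myfunc are hypothetical names; myfunc must already exist
CREATE CONVERSION myconv FOR 'UNICODE' TO 'LATIN1' FROM myfunc;
</programlisting>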
     <productname>PostgreSQL</> comes with some
     predefined conversions. They are listed in <xref
     linkend="multibyte-translation-table">.
    </para>

     <table id="multibyte-translation-table">
      <title>Client/Server Character Set Conversions</title>
      <tgroup cols="2">
       <thead>
        <row>
         <entry>Server Character Set</entry>
         <entry>Available Client Character Sets</entry>
        </row>
       </thead>
       <tbody>
        <row>
         <entry><literal>SQL_ASCII</literal></entry>
         <entry><literal>SQL_ASCII</literal>, <literal>UNICODE</literal>, <literal>MULE_INTERNAL</literal>
         </entry>
        </row>
        <row>
         <entry><literal>EUC_JP</literal></entry>
         <entry><literal>EUC_JP</literal>, <literal>SJIS</literal>,
         <literal>UNICODE</literal>, <literal>MULE_INTERNAL</literal>
         </entry>
        </row>
        <row>
         <entry><literal>EUC_CN</literal></entry>
         <entry><literal>EUC_CN</literal>, <literal>UNICODE</literal>, <literal>MULE_INTERNAL</literal>
         </entry>
        </row>
        <row>
         <entry><literal>EUC_KR</literal></entry>
         <entry><literal>EUC_KR</literal>, <literal>UNICODE</literal>, <literal>MULE_INTERNAL</literal>
         </entry>
        </row>
        <row>
         <entry><literal>JOHAB</literal></entry>
         <entry><literal>JOHAB</literal>, <literal>UNICODE</literal>
         </entry>
        </row>
        <row>
         <entry><literal>EUC_TW</literal></entry>
         <entry><literal>EUC_TW</literal>, <literal>BIG5</literal>,
         <literal>UNICODE</literal>, <literal>MULE_INTERNAL</literal>
         </entry>
        </row>
        <row>
         <entry><literal>LATIN1</literal></entry>
         <entry><literal>LATIN1</literal>, <literal>UNICODE</literal>,
         <literal>MULE_INTERNAL</literal>
         </entry>
        </row>
        <row>
         <entry><literal>LATIN2</literal></entry>
         <entry><literal>LATIN2</literal>, <literal>WIN1250</literal>,
         <literal>UNICODE</literal>,
         <literal>MULE_INTERNAL</literal>
         </entry>
        </row>
        <row>
         <entry><literal>LATIN3</literal></entry>
         <entry><literal>LATIN3</literal>, <literal>UNICODE</literal>,
         <literal>MULE_INTERNAL</literal>
         </entry>
        </row>
        <row>
         <entry><literal>LATIN4</literal></entry>
         <entry><literal>LATIN4</literal>, <literal>UNICODE</literal>,
         <literal>MULE_INTERNAL</literal>
         </entry>
        </row>
        <row>
         <entry><literal>LATIN5</literal></entry>
         <entry><literal>LATIN5</literal>, <literal>UNICODE</literal>
         </entry>
        </row>
        <row>
         <entry><literal>LATIN6</literal></entry>
         <entry><literal>LATIN6</literal>, <literal>UNICODE</literal>,
         <literal>MULE_INTERNAL</literal>
         </entry>
        </row>
        <row>
         <entry><literal>LATIN7</literal></entry>
         <entry><literal>LATIN7</literal>, <literal>UNICODE</literal>,
         <literal>MULE_INTERNAL</literal>
         </entry>
        </row>
        <row>
         <entry><literal>LATIN8</literal></entry>
         <entry><literal>LATIN8</literal>, <literal>UNICODE</literal>,
         <literal>MULE_INTERNAL</literal>
         </entry>
        </row>
        <row>
         <entry><literal>LATIN9</literal></entry>
         <entry><literal>LATIN9</literal>, <literal>UNICODE</literal>,
         <literal>MULE_INTERNAL</literal>
         </entry>
        </row>
        <row>
         <entry><literal>LATIN10</literal></entry>
         <entry><literal>LATIN10</literal>, <literal>UNICODE</literal>,
         <literal>MULE_INTERNAL</literal>
         </entry>
        </row>
        <row>
         <entry><literal>ISO_8859_5</literal></entry>
         <entry><literal>ISO_8859_5</literal>,
         <literal>UNICODE</literal>,
         <literal>MULE_INTERNAL</literal>,
         <literal>WIN</literal>,
         <literal>ALT</literal>,
         <literal>KOI8</literal>
         </entry>
        </row>
        <row>
         <entry><literal>ISO_8859_6</literal></entry>
         <entry><literal>ISO_8859_6</literal>,
         <literal>UNICODE</literal>
         </entry>
        </row>
        <row>
         <entry><literal>ISO_8859_7</literal></entry>
         <entry><literal>ISO_8859_7</literal>,
         <literal>UNICODE</literal>
         </entry>
        </row>
        <row>
         <entry><literal>ISO_8859_8</literal></entry>
         <entry><literal>ISO_8859_8</literal>,
         <literal>UNICODE</literal>
         </entry>
        </row>
        <row>
         <entry><literal>UNICODE</literal></entry>
         <entry>
         <literal>EUC_JP</literal>, <literal>SJIS</literal>,
         <literal>EUC_KR</literal>, <literal>UHC</literal>, <literal>JOHAB</literal>,
         <literal>EUC_CN</literal>, <literal>GBK</literal>,
         <literal>EUC_TW</literal>,
         <literal>BIG5</literal>,
         <literal>LATIN1</literal> to <literal>LATIN10</literal>,
         <literal>ISO_8859_5</literal>,
         <literal>ISO_8859_6</literal>,
         <literal>ISO_8859_7</literal>,
         <literal>ISO_8859_8</literal>,
         <literal>WIN</literal>, <literal>ALT</literal>,
         <literal>KOI8</literal>,
         <literal>WIN1256</literal>,
         <literal>TCVN</literal>,
         <literal>WIN874</literal>,
         <literal>GB18030</literal>,
         <literal>WIN1250</literal>
         </entry>
        </row>
        <row>
         <entry><literal>MULE_INTERNAL</literal></entry>
         <entry><literal>EUC_JP</literal>, <literal>SJIS</literal>, <literal>EUC_KR</literal>, <literal>EUC_CN</literal>,
          <literal>EUC_TW</literal>, <literal>BIG5</literal>, <literal>LATIN1</literal> to <literal>LATIN5</literal>,
          <literal>WIN</literal>, <literal>ALT</literal>,
          <literal>WIN1250</literal>,
          <literal>ISO_8859_5</literal>, <literal>KOI8</literal></entry>
        </row>
        <row>
         <entry><literal>KOI8</literal></entry>
         <entry><literal>ISO_8859_5</literal>, <literal>WIN</literal>,
         <literal>ALT</literal>, <literal>KOI8</literal>,
         <literal>UNICODE</literal>, <literal>MULE_INTERNAL</literal>
         </entry>
        </row>
        <row>
         <entry><literal>WIN</literal></entry>
         <entry><literal>ISO_8859_5</literal>, <literal>WIN</literal>,
         <literal>ALT</literal>, <literal>KOI8</literal>,
         <literal>UNICODE</literal>, <literal>MULE_INTERNAL</literal>
         </entry>
        </row>
        <row>
         <entry><literal>ALT</literal></entry>
         <entry><literal>ISO_8859_5</literal>, <literal>WIN</literal>,
         <literal>ALT</literal>, <literal>KOI8</literal>,
         <literal>UNICODE</literal>, <literal>MULE_INTERNAL</literal>
         </entry>
        </row>
        <row>
         <entry><literal>WIN1256</literal></entry>
         <entry><literal>WIN1256</literal>,
         <literal>UNICODE</literal>
         </entry>
        </row>
        <row>
         <entry><literal>TCVN</literal></entry>
         <entry><literal>TCVN</literal>,
         <literal>UNICODE</literal>
         </entry>
        </row>
        <row>
         <entry><literal>WIN874</literal></entry>
         <entry><literal>WIN874</literal>,
         <literal>UNICODE</literal>
         </entry>
        </row>
       </tbody>
      </tgroup>
     </table>

    <para>
     To enable automatic character set conversion, you have to
     tell <productname>PostgreSQL</productname> the character set
     (encoding) you would like to use in the client. There are several
     ways to accomplish this:

     <itemizedlist>
      <listitem>
       <para>
        Using the <command>\encoding</command> command in
        <application>psql</application>.
        <command>\encoding</command> allows you to change the client
        encoding on the fly. For
        example, to change the encoding to <literal>SJIS</literal>, type:
<programlisting>
\encoding SJIS
</programlisting>
       </para>
      </listitem>

      <listitem>
       <para>
        Using <application>libpq</> functions.
        <command>\encoding</command> actually calls
        <function>PQsetClientEncoding()</function> for this purpose.
<synopsis>
int PQsetClientEncoding(PGconn *<replaceable>conn</replaceable>, const char *<replaceable>encoding</replaceable>);
</synopsis>
        Here <replaceable>conn</replaceable> is a connection to the server,
        and <replaceable>encoding</replaceable> is the encoding you
        want to use. If the function successfully sets the encoding, it returns 0,
        otherwise -1. The current encoding for this connection can be determined by
        using:
<synopsis>
int PQclientEncoding(const PGconn *<replaceable>conn</replaceable>);
</synopsis>
        Note that it returns the encoding ID, not a symbolic string
        such as <literal>EUC_JP</literal>. To convert an encoding ID to an encoding name, you
        can use:
<synopsis>
char *pg_encoding_to_char(int <replaceable>encoding_id</replaceable>);
</synopsis>
       </para>
      </listitem>

      <listitem>
       <para>
        Using <command>SET client_encoding TO</command>.
        Setting the client encoding can be done with this SQL command:
<programlisting>
SET CLIENT_ENCODING TO '<replaceable>value</>';
</programlisting>
        You can also use the more standard SQL syntax <literal>SET NAMES</literal> for this purpose:
<programlisting>
SET NAMES '<replaceable>value</>';
</programlisting>
        To query the current client encoding:
<programlisting>
SHOW client_encoding;
</programlisting>
        To return to the default encoding:
<programlisting>
RESET client_encoding;
</programlisting>
       </para>
      </listitem>

      <listitem>
       <para>
        Using <envar>PGCLIENTENCODING</envar>.
        If the environment variable <envar>PGCLIENTENCODING</envar> is defined
        in the client's environment, that client encoding is automatically
        selected when a connection to the server is made.  (This can subsequently
        be overridden using any of the other methods mentioned above.)
       </para>
      </listitem>

      <listitem>
       <para>
        Using the configuration variable <varname>client_encoding</varname>.
        If the <varname>client_encoding</> variable in <filename>postgresql.conf</> is set, that
        client encoding is automatically selected when a connection to the
        server is made.  (This can subsequently be overridden using any of the
        other methods mentioned above.)
       </para>
      </listitem>
     </itemizedlist>
    </para>

    <para>
     If a particular character cannot be converted (suppose you chose
     <literal>EUC_JP</literal> for the server and
     <literal>LATIN1</literal> for the client; some Japanese
     characters then have no <literal>LATIN1</literal> equivalent), it
     is transformed into its hexadecimal byte values in parentheses,
     e.g., <literal>(826C)</literal>.
    </para>
   </sect2>

   <sect2>
    <title>Further Reading</title>

    <para>
     These are good starting points for learning about the various kinds
     of encoding systems.

     <variablelist>
      <varlistentry>
       <term><ulink url="ftp://ftp.ora.com/pub/examples/nutshell/ujip/doc/cjk.inf"></ulink></term>

       <listitem>
        <para>
         Detailed explanations of <literal>EUC_JP</literal>,
         <literal>EUC_CN</literal>, <literal>EUC_KR</literal>, and
         <literal>EUC_TW</literal> appear in section 3.2.
        </para>
       </listitem>
      </varlistentry>

      <varlistentry>
       <term><ulink url="http://www.unicode.org/"></ulink></term>

       <listitem>
        <para>
         The web site of the Unicode Consortium.
        </para>
       </listitem>
      </varlistentry>

      <varlistentry>
       <term>RFC 2044</term>

       <listitem>
        <para>
         <acronym>UTF</acronym>-8 is defined here.  (RFC 2044 has since
         been obsoleted by RFC 2279 and then RFC 3629.)
        </para>
       </listitem>
      </varlistentry>
     </variablelist>
    </para>
   </sect2>
  </sect1>

</chapter>

<!-- Keep this comment at the end of the file
Local variables:
mode:sgml
sgml-omittag:nil
sgml-shorttag:t
sgml-minimize-attributes:nil
sgml-always-quote-attributes:t
sgml-indent-step:1
sgml-indent-data:t
sgml-parent-document:nil
sgml-default-dtd-file:"./reference.ced"
sgml-exposed-tags:nil
sgml-local-catalogs:("/usr/lib/sgml/catalog")
sgml-local-ecat-files:nil
End:
-->