<!-- unicode.sgml: GLib is the low-level core library underlying the GTK+ and GNOME projects. -->
                                             <link linkend="gssize">gssize</link> max_len,
                                             const <link linkend="gchar">gchar</link> **end);</programlisting>
<para>
Validates UTF-8 encoded text. <parameter>str</parameter> is the text to validate;
if <parameter>str</parameter> is nul-terminated, then <parameter>max_len</parameter> can be -1, otherwise
<parameter>max_len</parameter> should be the number of bytes to validate.
If <parameter>end</parameter> is non-<literal>NULL</literal>, then the end of the valid range
will be stored there (i.e. the address of the first invalid byte
if some bytes were invalid, or the end of the text being validated
otherwise).
</para>
<para>
Returns <literal>TRUE</literal> if all of <parameter>str</parameter> was valid. Many GLib and GTK+
routines <emphasis>require</emphasis> valid UTF-8 as input;
so data read from a file or the network should be checked
with <link linkend="g-utf8-validate">g_utf8_validate</link>() before doing anything else with it.
</para>
<para></para>
<informaltable pgwide="1" frame="none" role="params"><tgroup cols="2">
<colspec colwidth="2*"><colspec colwidth="8*">
<tbody>
<row><entry align="right"><parameter>str</parameter>&nbsp;:</entry>
<entry> a pointer to character data</entry></row>
<row><entry align="right"><parameter>max_len</parameter>&nbsp;:</entry>
<entry> max bytes to validate, or -1 to go until nul</entry></row>
<row><entry align="right"><parameter>end</parameter>&nbsp;:</entry>
<entry> return location for end of valid data</entry></row>
<row><entry align="right"><emphasis>Returns</emphasis> :</entry>
<entry> <literal>TRUE</literal> if the text was valid UTF-8</entry></row>
</tbody></tgroup></informaltable></refsect2>
<refsect2>
<title><anchor id="g-utf8-strup">g_utf8_strup ()</title>
<programlisting><link linkend="gchar">gchar</link>*      g_utf8_strup                    (const <link linkend="gchar">gchar</link> *str,
                                             <link 
linkend="gssize">gssize</link> len);</programlisting>
<para>
Converts all Unicode characters in the string that have a case
to uppercase. The exact manner in which this is done depends
on the current locale, and may result in the number of
characters in the string increasing. (For instance, the
German ess-zet will be changed to SS.)
</para>
<para></para>
<informaltable pgwide="1" frame="none" role="params"><tgroup cols="2">
<colspec colwidth="2*"><colspec colwidth="8*">
<tbody>
<row><entry align="right"><parameter>str</parameter>&nbsp;:</entry>
<entry> a UTF-8 encoded string</entry></row>
<row><entry align="right"><parameter>len</parameter>&nbsp;:</entry>
<entry> length of <parameter>str</parameter>, in bytes, or -1 if <parameter>str</parameter> is nul-terminated.</entry></row>
<row><entry align="right"><emphasis>Returns</emphasis> :</entry>
<entry> a newly allocated string, with all characters converted to uppercase.</entry></row>
</tbody></tgroup></informaltable></refsect2>
<refsect2>
<title><anchor id="g-utf8-strdown">g_utf8_strdown ()</title>
<programlisting><link linkend="gchar">gchar</link>*      g_utf8_strdown                  (const <link linkend="gchar">gchar</link> *str,
                                             <link linkend="gssize">gssize</link> len);</programlisting>
<para>
Converts all Unicode characters in the string that have a case
to lowercase. 
The exact manner in which this is done depends
on the current locale, and may result in the number of
characters in the string changing.
</para>
<para></para>
<informaltable pgwide="1" frame="none" role="params"><tgroup cols="2">
<colspec colwidth="2*"><colspec colwidth="8*">
<tbody>
<row><entry align="right"><parameter>str</parameter>&nbsp;:</entry>
<entry> a UTF-8 encoded string</entry></row>
<row><entry align="right"><parameter>len</parameter>&nbsp;:</entry>
<entry> length of <parameter>str</parameter>, in bytes, or -1 if <parameter>str</parameter> is nul-terminated.</entry></row>
<row><entry align="right"><emphasis>Returns</emphasis> :</entry>
<entry> a newly allocated string, with all characters converted to lowercase.</entry></row>
</tbody></tgroup></informaltable></refsect2>
<refsect2>
<title><anchor id="g-utf8-casefold">g_utf8_casefold ()</title>
<programlisting><link linkend="gchar">gchar</link>*      g_utf8_casefold                 (const <link linkend="gchar">gchar</link> *str,
                                             <link linkend="gssize">gssize</link> len);</programlisting>
<para>
Converts a string into a form that is independent of case. The
result will not correspond to any particular case, but can be
compared for equality or ordered with the results of calling
<link linkend="g-utf8-casefold">g_utf8_casefold</link>() on other strings.
</para>
<para>
Note that calling <link linkend="g-utf8-casefold">g_utf8_casefold</link>() followed by <link linkend="g-utf8-collate">g_utf8_collate</link>() is
only an approximation to the correct linguistic case-insensitive
ordering, though it is a fairly good one. Getting this exactly
right would require a more sophisticated collation function that
takes case sensitivity into account. 
GLib does not currently
provide such a function.
</para>
<para></para>
<informaltable pgwide="1" frame="none" role="params"><tgroup cols="2">
<colspec colwidth="2*"><colspec colwidth="8*">
<tbody>
<row><entry align="right"><parameter>str</parameter>&nbsp;:</entry>
<entry> a UTF-8 encoded string</entry></row>
<row><entry align="right"><parameter>len</parameter>&nbsp;:</entry>
<entry> length of <parameter>str</parameter>, in bytes, or -1 if <parameter>str</parameter> is nul-terminated.</entry></row>
<row><entry align="right"><emphasis>Returns</emphasis> :</entry>
<entry> a newly allocated string, that is a case independent form of <parameter>str</parameter>.</entry></row>
</tbody></tgroup></informaltable></refsect2>
<refsect2>
<title><anchor id="g-utf8-normalize">g_utf8_normalize ()</title>
<programlisting><link linkend="gchar">gchar</link>*      g_utf8_normalize                (const <link linkend="gchar">gchar</link> *str,
                                             <link linkend="gssize">gssize</link> len,
                                             <link linkend="GNormalizeMode">GNormalizeMode</link> mode);</programlisting>
<para>
Converts a string into canonical form, standardizing
such issues as whether a character with an accent
is represented as a base character and combining
accent or as a single precomposed character. You
should generally call <link linkend="g-utf8-normalize">g_utf8_normalize</link>() before
comparing two Unicode strings.
</para>
<para>
The normalization mode <literal>G_NORMALIZE_DEFAULT</literal> only
standardizes differences that do not affect the
text content, such as the above-mentioned accent
representation. <literal>G_NORMALIZE_ALL</literal> also standardizes
the "compatibility" characters in Unicode, such
as SUPERSCRIPT THREE to the standard forms
(in this case DIGIT THREE). 
Formatting information
may be lost but for most text operations such
characters should be considered the same.
For example, <link linkend="g-utf8-collate">g_utf8_collate</link>() normalizes
with <literal>G_NORMALIZE_ALL</literal> as its first step.
</para>
<para>
<literal>G_NORMALIZE_DEFAULT_COMPOSE</literal> and <literal>G_NORMALIZE_ALL_COMPOSE</literal>
are like <literal>G_NORMALIZE_DEFAULT</literal> and <literal>G_NORMALIZE_ALL</literal>,
but return a result with composed forms rather
than a maximally decomposed form. This is often
useful if you intend to convert the string to
a legacy encoding or pass it to a system with
less capable Unicode handling.
</para>
<para></para>
<informaltable pgwide="1" frame="none" role="params"><tgroup cols="2">
<colspec colwidth="2*"><colspec colwidth="8*">
<tbody>
<row><entry align="right"><parameter>str</parameter>&nbsp;:</entry>
<entry> a UTF-8 encoded string.</entry></row>
<row><entry align="right"><parameter>len</parameter>&nbsp;:</entry>
<entry> length of <parameter>str</parameter>, in bytes, or -1 if <parameter>str</parameter> is nul-terminated.</entry></row>
<row><entry align="right"><parameter>mode</parameter>&nbsp;:</entry>
<entry> the type of normalization to perform.</entry></row>
<row><entry align="right"><emphasis>Returns</emphasis> :</entry>
<entry> a newly allocated string, that is the normalized form of <parameter>str</parameter>.</entry></row>
</tbody></tgroup></informaltable></refsect2>
<refsect2>
<title><anchor id="GNormalizeMode">enum GNormalizeMode</title>
<programlisting>typedef enum {
  G_NORMALIZE_DEFAULT,
  G_NORMALIZE_NFD = G_NORMALIZE_DEFAULT,
  G_NORMALIZE_DEFAULT_COMPOSE,
  G_NORMALIZE_NFC = G_NORMALIZE_DEFAULT_COMPOSE,
  G_NORMALIZE_ALL,
  G_NORMALIZE_NFKD = G_NORMALIZE_ALL,
  G_NORMALIZE_ALL_COMPOSE,
  G_NORMALIZE_NFKC = G_NORMALIZE_ALL_COMPOSE
} GNormalizeMode;</programlisting>
<para>
Defines how a Unicode string is transformed into a canonical form,
standardizing such issues as whether a character with an accent is represented as a 
base character and combining accent or as a single precomposed
character. Unicode strings should generally be normalized before comparing them.
</para>
<informaltable pgwide="1" frame="none" role="enum"><tgroup cols="2">
<colspec colwidth="2*"><colspec colwidth="8*">
<tbody>
<row><entry><literal>G_NORMALIZE_DEFAULT</literal></entry>
<entry>standardize differences that do not affect the text content, such as the above-mentioned accent representation.</entry></row>
<row><entry><literal>G_NORMALIZE_NFD</literal></entry>
<entry>another name for <literal>G_NORMALIZE_DEFAULT</literal>.</entry></row>
<row><entry><literal>G_NORMALIZE_DEFAULT_COMPOSE</literal></entry>
<entry>like <literal>G_NORMALIZE_DEFAULT</literal>, but with composed forms rather than a maximally decomposed form.</entry></row>
<row><entry><literal>G_NORMALIZE_NFC</literal></entry>
<entry>another name for <literal>G_NORMALIZE_DEFAULT_COMPOSE</literal>.</entry></row>
<row><entry><literal>G_NORMALIZE_ALL</literal></entry>
<entry>beyond <literal>G_NORMALIZE_DEFAULT</literal> also standardize the "compatibility" characters in Unicode, such as SUPERSCRIPT THREE to the standard forms (in this case DIGIT THREE). 
Formatting information may be lost but for most text operations such characters should be considered the same.</entry></row>
<row><entry><literal>G_NORMALIZE_NFKD</literal></entry>
<entry>another name for <literal>G_NORMALIZE_ALL</literal>.</entry></row>
<row><entry><literal>G_NORMALIZE_ALL_COMPOSE</literal></entry>
<entry>like <literal>G_NORMALIZE_ALL</literal>, but with composed forms rather than a maximally decomposed form.</entry></row>
<row><entry><literal>G_NORMALIZE_NFKC</literal></entry>
<entry>another name for <literal>G_NORMALIZE_ALL_COMPOSE</literal>.</entry></row>
</tbody></tgroup></informaltable></refsect2>
<refsect2>
<title><anchor id="g-utf8-collate">g_utf8_collate ()</title>
<programlisting><link linkend="gint">gint</link>        g_utf8_collate                  (const <link linkend="gchar">gchar</link> *str1,
                                             const <link linkend="gchar">gchar</link> *str2);</programlisting>
<para>
Compares two strings for ordering using the linguistically
correct rules for the current locale. When sorting a large
number of strings, it will be significantly faster to
obtain collation keys with <link linkend="g-utf8-collate-key">g_utf8_collate_key</link>() and
compare the keys with <function><link linkend="strcmp">strcmp</link>()</function> when
sorting instead of sorting the original strings.
</para>
<para>
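The key-based sorting pattern described above can be sketched as follows. This is only an illustration under stated assumptions, not GLib code: so that it compiles without GLib, it uses the C standard library's strxfrm() as a stand-in for g_utf8_collate_key(), and the names KeyedString, make_key and sort_by_collation_key are hypothetical helpers introduced for this example. Each string's key is computed once, and the sort then compares keys with plain strcmp().

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper type: a string paired with its precomputed sort key. */
typedef struct {
    const char *text; /* original string, not owned */
    char *key;        /* collation key, owned (malloc'd) */
} KeyedString;

/* Build a collation key once per string. strxfrm() is an assumption made
 * for this sketch; it stands in for g_utf8_collate_key().
 * Returns NULL on allocation failure. */
static char *make_key(const char *s)
{
    size_t needed = strxfrm(NULL, s, 0) + 1; /* key length plus terminator */
    char *key = malloc(needed);
    if (key != NULL)
        strxfrm(key, s, needed);
    return key;
}

/* qsort() callback: keys are compared bytewise with strcmp(). */
static int compare_keys(const void *a, const void *b)
{
    const KeyedString *ka = a;
    const KeyedString *kb = b;
    return strcmp(ka->key, kb->key);
}

/* Sort `strings` in place by collation key.
 * Returns 0 on success, -1 if memory could not be allocated. */
int sort_by_collation_key(const char **strings, size_t count)
{
    KeyedString *items;
    size_t built = 0;
    size_t i;
    int status = 0;

    if (count == 0)
        return 0;
    items = malloc(count * sizeof *items);
    if (items == NULL)
        return -1;

    /* Compute every key exactly once, before sorting. */
    for (i = 0; i < count; i++) {
        items[i].text = strings[i];
        items[i].key = make_key(strings[i]);
        if (items[i].key == NULL) {
            status = -1;
            break;
        }
        built++;
    }

    if (status == 0) {
        qsort(items, count, sizeof *items, compare_keys);
        for (i = 0; i < count; i++)
            strings[i] = items[i].text;
    }

    /* Release only the keys that were actually built. */
    for (i = 0; i < built; i++)
        free(items[i].key);
    free(items);
    return status;
}
```

With GLib available, make_key() would presumably become a call to g_utf8_collate_key(str, -1), with the key released by g_free() instead of free(). Note that strxfrm() follows the locale selected with setlocale(); without such a call the "C" locale applies and the order is plain byte order.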
