<HTML><HEAD> <TITLE>Shuimu Tsinghua BBS (水木清华站): Digest Area</TITLE></HEAD><BODY><CENTER><H1>Shuimu Tsinghua BBS (水木清华站): Digest Area</H1></CENTER>Posted by: cybergene (基因~也许以后~~), Board: Linux <BR>Subject: How to Use Tcl 8.1 Internationalization Features <BR>Origin: Shuimu Tsinghua BBS (Thu Dec 14 15:54:36 2000) <BR> <BR> <BR>How to Use Tcl 8.1 Internationalization Features <BR> <BR>Tcl's new internationalization facilities allow you to create Tcl <BR>applications that support any multi-byte language, including Chinese and <BR>Japanese. Tcl also now includes support for message catalogs, which <BR>makes it easier to create localized versions of applications and <BR>packages. Tcl is the first cross-platform scripting language to help <BR>developers deploy both commercial and enterprise network applications <BR>on a global scale. <BR> <BR>This document provides a quick overview of the internationalization <BR>features introduced in Tcl 8.1. Topics include: <BR> <BR>Character Encoding Overview <BR>Character Encodings and the Operating System <BR>General String Manipulation <BR>Channel Input/Output <BR>Sourcing Scripts in Different Encodings <BR>Converting Strings to Different Encodings <BR>Fonts, Encodings, and Tk Widgets <BR>Message Catalogs <BR>Internationalization and the Tcl C APIs <BR>Summary: Tcl Internationalization Support at a Glance <BR> <BR>Character Encoding Overview <BR>A character encoding is a mapping of the characters and symbols <BR>used in written language into a binary format used by computers. For <BR>example, in the standard ASCII encoding, the upper-case "A" character <BR>from the Latin character set is represented by the byte value 0x41 in <BR>hexadecimal. Other widely used character encodings include ISO 8859-1, <BR>used by many European languages, Shift-JIS and EUC-JP for Japanese <BR>characters, and Big5 for Chinese characters. 
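<BR> <BR>As a rough illustration of the mapping described above, the following sketch looks up the numeric value behind a character and dumps the bytes that an encoding produces for it. It is not part of the original article: the variable names are illustrative, and it assumes the ascii and iso8859-1 encodings are available in your Tcl installation (encoding names lists what is installed).

```tcl
# Illustrative sketch; variable names are assumptions, not from the article.

# Look up the numeric character code of "A" (65 decimal, 0x41 hex).
scan "A" %c code
set hexCode [format "0x%02X" $code]

# Convert "A" to the ASCII encoding and dump the resulting byte in hex.
set asciiBytes [encoding convertto ascii "A"]
binary scan $asciiBytes H* asciiHex

# Convert e-acute (U+00E9) to ISO 8859-1, where it is the single byte 0xE9.
set latinBytes [encoding convertto iso8859-1 "\u00e9"]
binary scan $latinBytes H* latinHex

puts "code=$code hex=$hexCode ascii=$asciiHex iso8859-1=$latinHex"
```

Dumping the converted bytes with binary scan is a convenient way to check what an encoding actually writes, since printing the converted string directly would pass it through the output channel's own encoding.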
<BR> <BR>The Unicode Standard is a uniform encoding scheme for <BR>virtually all characters used in the world's major written languages. <BR>Tcl 8.1 treats Unicode as a fixed-width, 16-bit encoding for all text <BR>elements (the standard itself has since grown beyond 16 bits). These text <BR>elements include letters such as "w" or "M", characters such as those <BR>used in Japanese Hiragana to represent syllables, or ideographs such <BR>as those used in Chinese to represent full words or concepts. The <BR>Unicode Standard does not specify the visual representation of a <BR>character, which is known as a glyph. For more information on the <BR>Unicode Standard, visit the Unicode web site at <A HREF="http://www.unicode.org">http://www.unicode.org</A>. <BR> <BR>UTF-8 is a standard transformation format for Unicode characters. It <BR>transforms Unicode characters into a variable-length <BR>encoding of bytes; a single 16-bit Unicode character is <BR>represented by one, two, or three bytes. The advantage of UTF-8 <BR>is that Unicode characters corresponding to the standard ASCII set <BR>(values up to 0x7F in hexadecimal) have the same byte values in both <BR>UTF-8 and ASCII. In other words, an upper-case "A" character is <BR>represented by the single-byte value 0x41 in both UTF-8 and ASCII <BR>encoding. <BR> <BR>Beginning in Tcl 8.1, Tcl represents all strings internally as Unicode <BR>characters in UTF-8 format. Tcl 8.1 also ships with built-in support for <BR>approximately 30 common character encoding standards, and can convert <BR>strings from one encoding to another. The encoding names command <BR>returns a list of all known encodings. You can create additional <BR>encodings as described in the Tcl_GetEncoding.3 reference page. <BR> <BR>Tip: Because 7-bit ASCII characters have the same encoding in UTF-8 <BR>format, legacy Tcl scripts that use only 7-bit ASCII characters function <BR>the same in Tcl 8.1 as they did in Tcl 8.0. 
Furthermore, because the <BR>use of Unicode/UTF-8 encoding is internal to Tcl, most string handling <BR>in legacy Tcl scripts works the same in Tcl 8.1 as it did in Tcl 8.0. <BR>Most problems in converting from Tcl 8.0 to 8.1 occur in: 1) using <BR>non-Latin characters, 2) reading and writing strings from a channel, and <BR>3) writing code that assumes that each character in a string is a fixed <BR>byte width (for example, one byte per character). <BR> <BR>Character Encodings and the Operating System <BR>The system encoding is the character encoding used by the operating <BR>system for items such as file names and environment variables. Text <BR>files used by text editors and other applications are usually encoded in <BR>the system encoding as well, unless the application that produced <BR>them explicitly saves them in another format (for example, if you use <BR>a Shift-JIS text editor on an ISO 8859-1 system). <BR> <BR>Tcl automatically converts strings from UTF-8 format to the system <BR>encoding and vice versa whenever it communicates with the operating <BR>system. For example, Tcl automatically handles any encoding conversion <BR>needed if you execute commands such as: <BR> <BR>% glob * <BR>or <BR> <BR>% set fd [open "Español.txt" w] <BR>The Tcl source command also reads files using the system encoding, and <BR>strings passed to and from the Tcl exec command are converted to and <BR>from the system encoding. <BR> <BR>Tcl attempts to determine the system encoding during initialization <BR>based on the platform and locale settings. Tcl usually can determine a <BR>reasonable default system encoding based on these settings, but if for <BR>some reason it cannot, it uses ISO 8859-1 as the default system <BR>encoding. <BR> <BR>You can override the default system encoding with the encoding system <BR>command. Ajuba Solutions recommends that you avoid using this command if <BR>at all possible. 
If you set the default system encoding to anything <BR>other than the actual encoding used by your operating system, Tcl will <BR>likely find it impossible to communicate properly with your operating <BR>system. <BR> <BR>Note: For reading and writing files in an encoding other than the system <BR> encoding, you need to use the fconfigure -encoding command (not the <BR>encoding system command) as described in the "Channel Input/Output" <BR>