section of this document. Also see the "Sourcing Scripts in Different
Encodings" section of this document for special instructions for
sourcing files in formats other than the system encoding.

General String Manipulation
Beginning in Tcl 8.1, all Tcl string manipulation functions expect and
return Unicode strings encoded in UTF-8 format. Because the use of
Unicode/UTF-8 encoding is internal to Tcl, you should see no
difference between Tcl 8.0 and 8.1 string handling in your scripts.

The Tcl string functions properly handle multi-byte UTF-8 characters
as single characters. For example, in the following commands, Tcl
treats the string "Café" as a four-character string, even though the
internal representation in UTF-8 format requires five bytes. (As with
previous versions of Tcl, string indexes start with "0"; that is, the
first character is index "0", the second character is index "1", etc.)

% set unistr "Café"
Café
% string length $unistr
4
% string index $unistr 3
é

Furthermore, the new regular expression implementation introduced in
Tcl 8.1 handles the full range of Unicode characters.

The "\uxxxx" escape sequence allows you to specify a Unicode character
by its four-digit, hexadecimal Unicode code value. For example, the
following assigns to a variable two ideograph characters corresponding
to the Chinese transliteration of "Tcl" (TAI-KU):

set tclstr "\u592a\u9177"

Channel Input/Output
When reading and writing data on a channel, you need to ensure that
Tcl uses the proper character encoding for that channel. The default
encoding for newly opened channels (both files and sockets) is the
same as the platform- and locale-dependent system encoding used for
interfacing with the operating system. (See the "Character Encodings
and the Operating System" section of this document for more
information.) In most cases, you don't need to do anything special to
read or write data because most text files are created in the system
encoding. You need to take special steps only when accessing files in
an encoding other than the system encoding (for example, reading a
file encoded in Shift-JIS format when your system encoding is
ISO 8859-1).

The fconfigure -encoding option allows you to specify the encoding for
a channel. Thus, to read from a file encoded in Shift-JIS format, you
should execute the following commands:

set fd [open $file r]
fconfigure $fd -encoding shiftjis

Tcl then automatically converts any text you read from the file into
standard UTF-8 format.

Similarly, if you are writing to a channel, you can use fconfigure
-encoding to specify the target character encoding, and Tcl
automatically converts strings from UTF-8 to that encoding on output.

Note: The Tcl source command always reads files using the system
encoding. For a tip on sourcing files in different encodings, see the
"Sourcing Scripts in Different Encodings" section of this document.

Sourcing Scripts in Different Encodings
The Tcl source command always reads files using the system encoding.
Therefore, Ajuba Solutions recommends that, whenever possible, you
author scripts in the native system encoding.

A difficulty arises when distributing scripts internationally, as you
don't necessarily know what the system encoding will be. Fortunately,
most common character encodings include the standard 7-bit ASCII
characters as a subset. Therefore, you are usually safe if your script
contains only 7-bit ASCII characters.

If you need to use an extended character set for scripts that you
distribute, you can provide a small "bootstrap" script written in
7-bit ASCII. The bootstrap script can then load and execute scripts in
any encoding that you choose.

You can execute a script written in an encoding other than the system
encoding by opening the file, setting the proper encoding using the
fconfigure -encoding command, reading the file into a variable, and
then evaluating the string with the eval command. For example, the
following reads and executes a Tcl script encoded in EUC-JP:

set fd [open "app.tcl" r]
fconfigure $fd -encoding euc-jp
set jpscript [read $fd]
close $fd
eval $jpscript

Note: This technique works only if the file contains actual EUC-JP
encoded characters (for example, you created the file with an EUC-JP
text editor). This technique doesn't work if you build the EUC-JP
encoded characters using the "\x" or octal digit escape sequences.
Tcl 8.1 interprets each "\x" or octal digit escape sequence as a
single Unicode character with the upper bits set to 0. For example, if
the script app.tcl above contained the line:

set ha "\xA4\xCF"

then the variable ha would contain two characters, "¤Ï" (Unicode
characters "CURRENCY SIGN" and "LATIN CAPITAL LETTER I WITH
DIAERESIS"), not the Unicode HA character.

Converting Strings to Different Encodings
You can convert a string to a different encoding using the encoding
convertfrom and encoding convertto commands. The encoding convertfrom
command converts a string from a specified encoding into UTF-8 Unicode
characters; the encoding convertto command converts a string from
UTF-8 Unicode into a specified encoding. In either case, if you omit
the encoding argument, the command uses the current system encoding.

As an example, the following command converts a string representing
the Hiragana letter HA from EUC-JP encoding into a Unicode string:

set ha [encoding convertfrom euc-jp "\xA4\xCF"]

(In Tcl 8.1, the "\x" and octal digit escape sequences specify the
lower 8 bits of a Unicode character with the upper 8 bits set to 0.
Thus the string "\xA4\xCF" still specifies two characters in Tcl 8.1,
just as it did in Tcl 8.0; however, Tcl 8.1 stores those characters in
four bytes, whereas Tcl 8.0 stored them in two bytes.)

Fonts, Encodings, and Tk Widgets
Tk widgets that display text now require text strings in Unicode/UTF-8
encoding. Tk automatically handles any encoding conversion necessary
to display the characters in a particular font.
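
The channel and conversion sections above can be tied together in one short script. The following is a minimal sketch, not taken from the original document: the file name i18n_demo.txt and the variable names are assumptions chosen for illustration, and it assumes the euc-jp encoding is available in your Tcl installation. It converts the Hiragana letter HA to EUC-JP and back with the encoding command, then round-trips the same character through a file whose channel is configured with fconfigure -encoding.

```tcl
# Illustrative sketch only; names and the file path are assumptions.
set text "\u306f"                          ;# HIRAGANA LETTER HA as a Unicode string

# encoding convertto: Unicode string -> bytes in the target encoding.
set bytes [encoding convertto euc-jp $text]
binary scan $bytes H* hex                  ;# hex dump of the encoded bytes
puts "EUC-JP bytes: $hex"

# encoding convertfrom: bytes in the source encoding -> Unicode string.
set back [encoding convertfrom euc-jp "\xA4\xCF"]
puts "same string:  [expr {[string compare $back $text] == 0}]"
puts "characters:   [string length $back]"

# Channel round trip: Tcl converts on output and again on input.
set path [file join [pwd] i18n_demo.txt]   ;# hypothetical scratch file
set fd [open $path w]
fconfigure $fd -encoding euc-jp
puts -nonewline $fd $text
close $fd
set onDisk [file size $path]               ;# bytes actually written

set fd [open $path r]
fconfigure $fd -encoding euc-jp
set fromFile [read $fd]
close $fd
file delete $path

puts "bytes on disk: $onDisk"
puts "read back ok:  [expr {[string compare $fromFile $text] == 0}]"
```

The single character occupies two bytes in EUC-JP (A4 CF), which is why the "\xA4\xCF" form has to go through encoding convertfrom before it is usable as the HA character.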