                         X Window System, Version 11

                         Input Method Specifications

                     Public Review Draft - November 1990

                  (Send comments to i18n@expo.lcs.mit.edu)


                                Vania Joloboff
                           Open Software Foundation

                                 Bill McMahon
                            Hewlett Packard Company


                                   ABSTRACT

       This chapter addresses the portability and interoperability of
       programs in different countries. It describes specifications that
       provide clients of the X Window System, Version 11, with an
       interface for input handling of characters in various languages.
       The specifications make it possible to develop portable
       applications independent of a particular language or a particular
       encoding of characters. The specifications are consistent with
       related specifications from the X/Open Portability Guide,
       Release 3, and ANSI C. The reader is assumed to be familiar with
       those, particularly with the notion of locale in the C language,
       so they are not detailed here.


Copyright © 1990 by the Massachusetts Institute of Technology.

Permission to use, copy, modify, and distribute this documentation for any
purpose and without fee is hereby granted, provided that the above copyright
notice and this permission notice appear in all copies. MIT makes no
representations about the suitability for any purpose of the information in
this document. It is provided "as is" without express or implied warranty.
This document is only a draft standard of the MIT X Consortium and is
therefore subject to change.

XIM Public Review Draft

X Window System is a trademark of the Massachusetts Institute of Technology.


1.  Input Method Overview

The next paragraphs provide definitions for terms and concepts used in the
specification, and a brief overview of the intended use of the abstractions
developed for Xlib internationalization.

1.1.  What are Input Methods?

A large number of languages in the world rely on an alphabet, a small set of
symbols (letters) used to form words. To enter text into a computer in an
alphabetic language, a user usually has a keyboard on which there exist key
symbols corresponding to the alphabet. Sometimes, a few characters of an
alphabetic language are missing on the keyboard. Many computer users who
speak a language based on the Latin alphabet have only an English-based
keyboard. They need to hit a combination of keystrokes in order to enter a
character that does not exist directly on the keyboard. A number of
algorithms have been developed for entering such characters, known as
European input methods, compose input methods, or dead-key input methods.

In some alphabetic languages, the rendering of character strings is context
sensitive. When entering characters in those languages, a keystroke does not
systematically mean appending a new symbol at the end of the string. It may
modify the existing string. Both input and output methods may be used in
such languages.

An ideographic writing system does not take a small set of symbols and
combine them in different ways to create words. Instead, each word consists
of one unique symbol (or, occasionally, several symbols). The number of
symbols may be very large: 150 000 have been identified in Hanzi, the
Chinese ideographic system.

Two aspects of ideographic systems matter most for their use on computers.
First, the standard computer character sets in Japan, China, and Korea
include roughly 8 000 characters, while sets in Taiwan have between 15 000
and 30 000 characters, which makes it necessary to use more than one byte to
represent a character. Second, it is obviously impractical to have a
keyboard that includes all of a given language's ideographic symbols.
Therefore a specific mechanism is required for entering characters so that a
keyboard with a reasonable number of keys can be used. Those input methods
are usually based on the language's phonetics, but there also exist methods
based on the graphic properties of characters.

In addition to ideographic characters, a number of languages also include a
phonetic (alphabet-based) writing system. The phonetic signs are then
engraved on the keyboard, and the keystrokes are transformed to their
appropriate ideographic counterparts. Here is a brief description of the
Japanese and Korean phonetic systems:

o  Japanese: There are two phonetic symbol sets: katakana and hiragana. In
   general, you use katakana for words that are of foreign origin, and
   hiragana for writing native Japanese words. Collectively, the two systems
   are called kana. Each set consists of approximately 50 characters. You
   type either kana or English characters and define the region that you
   want to convert to kanji. Several kanji characters may have the same
   phonetic representation. If that is the case with your string, you get a
   menu of characters and choose the appropriate one. If no choice is
   necessary, the input method does the substitution directly. When Latin
   characters are converted to kana or kanji, it is called a romaji
   conversion.

o  Korean: Hangul is a writing system that actually straddles the line
   between phonetic and ideographic. It is phonetic in the sense that each
   of the roughly 25 characters represents a specific sound. But between two
   and five of the characters are combined to form syllables, and these
   syllables are the basic units on which text processing is done. For
   example, a delete operation works on a syllable rather than on the
   individual characters within it. Korean code sets include several
   thousand of these syllables.

   You type the hangul characters that make up the syllables of the words
   you are entering. The display changes as you enter each hangul letter.
   That is, when you enter the first letter, it fills the entire space that
   the final syllable will take up. When you enter the second, the first
   shrinks to about half its size to make room for the second. When you
   enter the third, the first two shrink again. And so on, up to the maximum
   of five letters in a syllable.

   It is usually acceptable to keep Korean text in hangul form, but some
   words are more commonly written in hanja. If you want to change hangul to
   hanja, you define the region to be converted, and follow the same basic
   method as described for Japanese.

Probably because there are well-accepted phonetic writing systems for
Japanese and Korean, computer input methods for those languages are fairly
standard. Keyboard keys have both English characters and the local
language's phonetic symbols engraved on them. You can then switch the
keyboard from English to local mode and vice versa.

The situation is different for Chinese. While there is a phonetic system
called Pinyin promoted by authorities, there is no consensus on how to enter
Chinese text. Some vendors use a phonetic decomposition (Pinyin or another),
others use an ideographic decomposition of Chinese words, with various
implementations and keyboard layouts. There are about 16 known methods, none
of which is a clear standard.

Also, there are actually two ideographic sets in use: Traditional Chinese
(the original written Chinese) and Simplified Chinese. Several years back,
the People's Republic of China launched a campaign to simplify some
ideographic characters and eliminate redundancies altogether. Under the
plan, characters would be streamlined every five years. Characters have been
revised several times now, resulting in the smaller, simpler set that makes
up Simplified Chinese.

1.1.1.  Input Method Architecture

As shown in the previous paragraphs, there are many different input methods
in use today, varying with language, culture, and history. A common feature
of many input methods is that the user may type multiple keystrokes in order
to compose a single character (or set of characters). The process of
composing characters from keystrokes is called pre-editing. It may require
complex algorithms and large dictionaries involving substantial computer
resources.

Input methods may require one or more areas in which to show the feedback of
the actual keystrokes, to propose disambiguation to the user, to list
dictionaries, and so on. The input method areas with which we are concerned
in this specification are as follows.

     The Status area is intended to be a logical extension of the LEDs that
     exist on the physical keyboard. It is an output-only window which is
     intended to present the internal state of the input method that is
     critical to the user. The status area may consist of text data and
     bitmaps or some combination.

     The PreEdit area is intended to display the intermediate text for those
     languages that compose text before the client handles the data.

     The Auxiliary area is used for pop-up menus and customizing dialogs
     that may be required for an input method. There may be multiple
     Auxiliary areas for any input method. Auxiliary areas are managed by
     the input method independent of the client. Auxiliary areas are assumed
     to be a separate dialog which is maintained by the input method.

There are various user interaction styles used for pre-editing. The ones
that this specification addresses are as follows.

     For on-the-spot input methods, pre-editing data will be displayed in
     the application window. Application data is moved to allow pre-edit
     data to be displayed at the point of insertion.

     Over-the-spot pre-editing means that the data is displayed in an input
     method window that is placed over the point of insertion.

     Off-the-spot pre-editing means that the pre-edit window is inside the
     client window, but not at the point of insertion. Often this type of
     window is placed at the bottom of the client window.

     Root-window pre-editing refers to input methods that use a pre-edit
     window that is the child of RootWindow.

It would require a lot of computing resources if portable applications had
to include input methods for all the languages in the world. To avoid this,
the goal of these specifications is to allow an application to communicate
with an input method placed in a separate process. Such a process is called
an input server. The server to which the application should connect depends
upon the environment when the application is started up: the user's language
and the actual encoding to be used for it. We will say that input method
connection is locale dependent. It is also user dependent: for a given
language, the user can choose to some extent the user interface style of the
input method (if a choice is possible among several).

Using an input server implies communication overhead, but applications can
be migrated without relinking. Specifications in this document have been
designed so input methods can be implemented either as a stub communicating
to an input server or as a local library.

An input method may be based on a front-end or a back-end architecture. In
front-end, there are two separate connections to the X server: keystrokes go
directly from the X server to the input method on one connection, other
events to the regular client connection. The input method is then acting as
a filter, and sends composed strings to the client. Front-end requires
synchronization
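
The filtering role described above, combined with the compose (dead-key)
input method of section 1.1, can be illustrated with a small sketch. It is
not part of this specification and does not use any Xlib interface: the
names COMPOSE_TABLE, ComposeFilter, and compose are hypothetical, and the
table is an arbitrary four-entry sample. The pending dead key plays the role
of the pre-edit text, and the value returned by feed is what would be sent
to the client.

```python
# Illustrative sketch only: a dead-key compose filter, not an Xlib or XIM API.
# COMPOSE_TABLE, ComposeFilter and compose are hypothetical names.

# (dead key, base key) -> composed character; a tiny sample, not a real layout.
COMPOSE_TABLE = {
    ("'", "e"): "é",
    ("`", "a"): "à",
    ("^", "o"): "ô",
    ('"', "u"): "ü",
}
DEAD_KEYS = {dead for dead, _ in COMPOSE_TABLE}


class ComposeFilter:
    """Consumes keystrokes and returns the text to commit to the client."""

    def __init__(self):
        self.preedit = ""  # pending dead key, i.e. the pre-edit text

    def feed(self, key):
        if self.preedit:
            dead, self.preedit = self.preedit, ""
            composed = COMPOSE_TABLE.get((dead, key))
            # Unknown sequence: commit both keystrokes unchanged.
            return composed if composed is not None else dead + key
        if key in DEAD_KEYS:
            self.preedit = key  # hold the keystroke; nothing is committed yet
            return ""
        return key  # ordinary keys pass straight through


def compose(keys):
    """Run a sequence of keystrokes through a fresh filter.

    A dead key left pending at the end of the sequence is not committed.
    """
    im = ComposeFilter()
    return "".join(im.feed(k) for k in keys)


print(compose("caf'e"))  # prints: café
```

In this sketch an unrecognized sequence such as the apostrophe in "it's" is
committed unchanged. That is one possible policy chosen for the example; the
text above does not prescribe one.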