
📄 0347-0349.html

📁 Presenting XML.rar: a detailed introduction to XML
<!DOCTYPE HTML PUBLIC "html.dtd">
<HTML>
<HEAD>
<TITLE>Presenting XML:Potential Applications of XML:EarthWeb Inc.-</TITLE>
<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">
<SCRIPT>
<!--
function displayWindow(url, width, height) {
        var Win = window.open(url,"displayWindow",'width=' + width +',height=' + height + ',resizable=1,scrollbars=yes');
}
//-->
</SCRIPT>
</HEAD>
<BODY  BGCOLOR="#FFFFFF" VLINK="#DD0000" TEXT="#000000" LINK="#DD0000" ALINK="#FF0000">
<TD WIDTH="540" VALIGN="TOP">
<!--  ISBN=1575213346 //-->
<!--  TITLE=Presenting XML//-->
<!--  AUTHOR=Richard Light//-->
<!--  PUBLISHER=Macmillan Computer Publishing//-->
<!--  IMPRINT=Sams//-->
<!--  CHAPTER=18 //-->
<!--  PAGES=0331-0356 //-->
<!--  UNASSIGNED1 //-->
<!--  UNASSIGNED2 //-->
<P><CENTER><A HREF="0344-0346.html">Previous</A> | <A HREF="../ewtoc.html">Table of Contents</A> | <A HREF="0350-0352.html">Next</A></CENTER></P>
<A NAME="PAGENUM-347"><P>Page 347</P></A>
<P>We are still in the early
days for CIDS. The specification is nowhere near complete, and the ECIX project has a long way to go before it can even start implementation, but it is seriously considering XML (and is even rumored to have made the choice). XML's powerful addressing mechanisms (combining many of the strengths of both the TEI and HTML), its powerful linking mechanisms (combining the power of HyTime while escaping much of its complexity), its closeness to &quot;full&quot; SGML, its extensibility, and its close fit with machine processing (without necessarily being committed to Java) will undoubtedly provide the means to turn this grand vision into reality. And we are going to see it happen!</P>
<H4><A NAME="ch18_ 10">OpenTag</A></H4>
<P>There was a time, not so long ago, when everyone had to admit to using predominantly American software, in the English language. Over the past eight or nine years, however, non-English-speaking users have been far more insistent on having a version in their own language. The translation industry and the software localization industry have therefore been experiencing golden years.</P>
<P>Translating manuals and localizing software is no trivial task; it is sometimes an even larger job than creating the original software and documentation. To make matters even more complicated, translation is never a matter of just one language; usually it's 10 or 11 languages at the same time! If that isn't enough, the professional translator then has to take into account a wide range of possible delivery formats, including FrameMaker MIF, Microsoft RTF, Interleaf IAF, and last but not least, plain ASCII. (ASCII is usually extracted from Windows resource files, or .rc files, which a programmer uses to define the text appearing in the windows, menus, and dialog boxes that make up the user interface for Microsoft Windows applications.)
For examples of these formats in the context of translation, take a look at the comparisons that the OpenTag Initiative has placed on its Web site (for the address, see Appendix B).</P>
<P>It isn't as though every software package is different, though. Just imagine how often (assuming you're an MS-Windows user) you've seen a File menu that contains entries such as Open, Save, Print, and Exit. It doesn't take much imagination to realize that these words must come up time and time again, and it didn't take long before specialized software packages came on the market that offered a sort of &quot;translator's memory.&quot; If a phrase came up a second </P>
<A NAME="PAGENUM-348"><P>Page 348</P></A>
time (either for the current job or from an earlier one), a database would be checked, and the previous translation for that language would be offered as a suggestion. (Not every software company uses the same terminology. Microsoft has a massive set of glossaries publicly available via the Internet for most of the leading world languages, and Microsoft is known to be very particular about its terminology.)</P>
<P>To handle the wide range of formats, many of the packages available on the market took a natural step forward and implemented a method of text extraction. Most of the formats mentioned earlier have a basic core of plain ASCII text in which you can, with patience, identify the actual text that you would see. Just as with HTML, if you can work around the codes, you can still find the text. However, a major difference from HTML is that the formats used by translators and localizers are extremely sensitive to mistakes. One slip in a Microsoft Windows resource file and the software quite possibly could never work again, forcing you to retrace your steps and, in the worst case, start over from scratch. (This happens far more often than a lot of translators would like to admit!)
The software packages in general professional use are able to scan the formats they are offered, place markers at the points where they find text that needs translating, extract that text (ignoring any internal codes), and offer it to the translator as simple text. The translator can then translate the texts or, more often, distribute them to a whole team of colleagues for translation. When the texts are finished, they can be submitted to the software package, which then plugs them back into the file where they came from and leaves the internal codes intact.</P>
<P>In theory, it's a wonderful scheme. In practice, it actually works pretty well. Unfortunately, the software packages involved are expensive (which is a problem because many translators work on a freelance basis or are self-employed), and the packages are not compatible with each other. (Often, the packages use an internal coding mechanism that is almost as complex as the format they are supposed to be assisting you with.) When you have committed to one software package, you are basically locked in to that software package forever.</P>
<P>In February 1997, an American translation and localization company started a movement called the OpenTag Initiative in an attempt to break open this closed loop by establishing an open data encoding method to support the localization process in general, and to permit the robust interchange of data between the parties involved.</P>
<A NAME="PAGENUM-349"><P>Page 349</P></A>
<P>At the center of the OpenTag Initiative is an XML DTD, demonstrating the inherent flexibility that XML has inherited from SGML. But this DTD has a real twist. The point is that for this application, even the HTML DTD (weak as it is) is simply too complex. The OpenTag XML DTD does not need to describe information or model a complex structure; all it has to do is identify information and its location.
The full XML DTD (version 5, May 14, 1997) easily fits on two sheets of paper (which is less than 200 lines including comments) and consists of a meager 26 elements, nine of which are empty (including the top, root element, which is an XML requirement even though it serves little real purpose) and a maximum of only six attributes.</P>
<P>In support of the DTD, an MS-DOS and a Java parser have already been developed (and are available free via the Internet), and customization files have been produced for most of the major software packages (IBM TranslationManager/2, Trados Translators Workbench, ILE LocaliX, and Atril D&eacute;j&agrave;Vu).</P>
<P>So how does the OpenTag Initiative think it can get away with so little? Quite simply, by keeping it simple! More than 75 percent of the information contained in a file (an RTF file from a Microsoft Windows Help file or even an HTML page) is concerned with the formatting of the information (the physical appearance). Of course, this is of absolutely no interest to the translator, who only wants to change the text.
The following is a simple example, a translation of a very basic HTML sentence from English into Dutch, French, and German:</P>
<!--  CODE //-->
<PRE>
HTML English: XML is &lt;STRONG&gt;undoubtably&lt;/STRONG&gt; the future.
OpenTag English: XML is &lt;G n=&quot;1&quot;&gt;undoubtably&lt;/G&gt; the future.
OpenTag Dutch: XML is &lt;G n=&quot;1&quot;&gt;zonder twijfel&lt;/G&gt; de toekomst.
OpenTag French: XML est &lt;G n=&quot;1&quot;&gt;sans doute &lt;/G&gt; l'avenir.
OpenTag German: XML ist &lt;G n=&quot;1&quot;&gt;ohne Zweifel&lt;/G&gt; die Zukunft.
</PRE>
<!--  END CODE //-->
<P>If someone decides to change the markup at a later stage from &lt;STRONG&gt; to &lt;EM&gt;, it won't make much difference because the OpenTag markup is really only positional, as you see here:</P>
<!--  CODE //-->
<PRE>
HTML English: XML is &lt;EM&gt;undoubtably&lt;/EM&gt; the future.
OpenTag English: XML is &lt;G n=&quot;1&quot;&gt;undoubtably&lt;/G&gt; the future.
OpenTag Dutch: XML is &lt;G n=&quot;1&quot;&gt;zonder twijfel&lt;/G&gt; de toekomst.
OpenTag French: XML est &lt;G n=&quot;1&quot;&gt;sans doute &lt;/G&gt; l'avenir.
OpenTag German: XML ist &lt;G n=&quot;1&quot;&gt;ohne Zweifel&lt;/G&gt; die Zukunft.
</PRE>
<!--  END CODE //-->
<P>Of course, this is a very trivial example, but it at least illustrates the intention. By using a very simple tagset, you can extract information from a document</P>
<P><CENTER><A HREF="0344-0346.html">Previous</A> | <A HREF="../ewtoc.html">Table of Contents</A> | <A HREF="0350-0352.html">Next</A></CENTER></P>
</TD></TR></TABLE></BODY></HTML>
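
The positional markup shown in the page above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the OpenTag parser: the function names to_positional and from_positional are hypothetical, only the G element and its n attribute come from the page's examples, and the regular expression handles just simple, non-nested paired inline tags.

```python
import re

# Matches a simple paired inline element without nesting, e.g. <STRONG>text</STRONG>.
INLINE = re.compile(r"<([A-Za-z][A-Za-z0-9]*)((?:\s[^<>]*)?)>(.*?)</\1>", re.S)
# Matches the positional placeholder used in the page's examples.
GROUP = re.compile(r'<G n="(\d+)">(.*?)</G>', re.S)


def to_positional(html):
    """Hypothetical helper: swap each inline element for <G n="i"> and remember the tag."""
    tags = {}

    def repl(match):
        n = len(tags) + 1
        tags[n] = (match.group(1), match.group(2))  # element name, raw attributes
        return '<G n="%d">%s</G>' % (n, match.group(3))

    return INLINE.sub(repl, html), tags


def from_positional(text, tags):
    """Hypothetical helper: put the remembered tags back around the translated text."""

    def repl(match):
        name, attrs = tags[int(match.group(1))]
        return "<%s%s>%s</%s>" % (name, attrs, match.group(2), name)

    return GROUP.sub(repl, text)


source = "XML is <STRONG>undoubtably</STRONG> the future."
neutral, tags = to_positional(source)
dutch = 'XML is <G n="1">zonder twijfel</G> de toekomst.'
restored = from_positional(dutch, tags)
print(neutral)
print(restored)
```

Because the translator only ever sees the numbered G placeholder, changing the source markup from STRONG to EM leaves the text sent for translation unchanged; only the remembered tag table differs.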
