\chapter{Introduction}

This chapter gives a short introduction to \ltext{XML} and \ltext{NanoXML}.

\section{About \ltext{XML}}

The extensible markup language,
\href{http://www.w3c.org/TR/REC-xml}{\ltext{XML}}, is a way to mark up text in
a structured document.
\ltext{XML} is a simplification of the complex \ltext{SGML} standard.
\ltext{SGML}, the Standard Generalized Markup Language, is an international
(\ltext{ISO}) standard for marking up text and graphics.
The best known application of \ltext{SGML} is \ltext{HTML}.

Although \ltext{SGML} data is very easy to write, it is very difficult to write a
generic \ltext{SGML} parser.
When designing \ltext{XML}, however, the authors removed much of the flexibility
of \ltext{SGML}, making it much easier to parse \ltext{XML} documents correctly.

\ltext{XML} data is structured as a tree of \term{entities}.
An entity is either a string of character data or an element, which can in turn
contain other entities.
Elements can optionally have a set of attributes.
Attributes are key/value pairs that set properties of an element.

The following example shows some \ltext{XML} data:

\begin{example}
$<$book$>$
~~$<$chapter id="my chapter"$>$
~~~~$<$title$>$The title$<$/title$>$
~~~~Some text.
~~$<$/chapter$>$
$<$/book$>$
\end{example}

At the root of the tree, you can find the element ``book''.
This element contains one child element: ``chapter''.
The chapter element has one attribute, which maps the key ``id'' to
``my chapter''.
The chapter element has two child entities: the element ``title'' and the
character data ``Some text.''.
Finally, the title element has one child, the string ``The title''.

\section{About \ltext{NanoXML}}

In April 2000, \ltext{NanoXML} was first released as a spin-off project of
\ltext{AUIT}, the Abstract User Interface Toolkit.
The intent of \ltext{NanoXML} was to be a small parser that was easy to use.
\ltext{SAX} and \ltext{DOM} were much too complex for what I needed, and the
mainstream parsers were either much too big or had a very restrictive license.

\ltext{NanoXML 1} has all the features I needed: it is very small (about 6K),
is reasonably fast for small \ltext{XML} documents, is very easy to use and is
free (\ltext{zlib/libpng} license).
As I never intended to use \ltext{NanoXML} to parse \ltext{DocBook} documents,
there was no support for mixed data or \ltext{DTD} parsing.

\ltext{NanoXML} was released as a \ltext{SourceForge} project and, because of the
very good response from its users, it matured into a small and stable parser.
The final version, release \ltext{1.6.8}, came out in May 2001.

Because of its small size, people started to use \ltext{NanoXML} for embedded
systems (\ltext{KVM}, \ltext{J2ME}) and kindly submitted patches to make
\ltext{NanoXML} work in such restricted environments.

\section{\ltext{NanoXML} 2}

In July 2001, \ltext{NanoXML} 2 was released.
Unlike \ltext{NanoXML 1}, speed and \ltext{XML} compliance were considered to be
very important when the new parser was designed.
\ltext{NanoXML 2} is also very modular: you can easily replace the different
components in the parser to customize it to your needs.
The modularity of \ltext{NanoXML 2} also benefits extensions such as
\ltext{SAX} support, which can now access the parser directly.
In \ltext{NanoXML 1}, the \ltext{SAX} adapter had to iterate over the data
structure built by the base product.

Although many features were added to \ltext{NanoXML}, the second release was
still very small.
The full parser with builder fits in a \ltext{JAR} file of about 32K.
This is still very tiny, especially when you compare it with the ``standard''
parsers, which are more than four times its size.

As there is still a need for a tiny parser like \ltext{NanoXML 1}, there is a
special branch of \ltext{NanoXML 2}: \ltext{NanoXML/Lite}.
This parser is source compatible with \ltext{NanoXML 1} but features a new
parsing algorithm which makes it more than twice as fast as the older version.
It is, however, more restrictive about the \ltext{XML} data it parses: the older
version allowed some data that is not well-formed to be parsed.

There are three branches of \ltext{NanoXML 2}:

\begin{itemize}
  \item[$\bullet$]
    \term{NanoXML/Lite} is the successor of \ltext{NanoXML 1}.
    It features an almost compatible parser which is extremely small.
  \item[$\bullet$]
    \term{NanoXML/Java} is the standard parser.
  \item[$\bullet$]
    \term{NanoXML/SAX} is the \ltext{SAX} adapter for \ltext{NanoXML/Java}.
\end{itemize}

The latest version of \ltext{NanoXML} is \ltext{NanoXML 2.2.1}, which was
released in February 2002.
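
To make the tree model from the section about \ltext{XML} concrete, the following minimal sketch parses the book example and turns it into nested entities: a string for character data, and a (name, attributes, children) tuple for an element.
It is written in Python and uses the standard library module \ltext{xml.dom.minidom} rather than \ltext{NanoXML}, whose programming interface is not covered in this chapter.
The tuple representation, the helper names and the choice to drop whitespace-only character data are assumptions made purely for illustration.

```python
from xml.dom import minidom

# The book example from the section about XML.
XML_DATA = """<book>
  <chapter id="my chapter">
    <title>The title</title>
    Some text.
  </chapter>
</book>"""


def to_tree(node):
    """Turn a DOM node into an entity: a string or (name, attributes, children)."""
    if node.nodeType == node.TEXT_NODE:
        # Character data; surrounding layout whitespace is stripped (assumption).
        return node.data.strip()
    attributes = dict(node.attributes.items())
    children = []
    for child in node.childNodes:
        if child.nodeType not in (child.ELEMENT_NODE, child.TEXT_NODE):
            continue  # ignore comments, processing instructions, etc.
        entity = to_tree(child)
        if entity != "":  # drop whitespace-only character data
            children.append(entity)
    return (node.tagName, attributes, children)


def parse(xml_text):
    """Parse XML text and return the root element as a nested entity."""
    return to_tree(minidom.parseString(xml_text).documentElement)


def describe(entity, depth=0):
    """Return one indented line of text per entity in the tree."""
    indent = "  " * depth
    if isinstance(entity, str):
        return [f'{indent}text: "{entity}"']
    name, attributes, children = entity
    lines = [f"{indent}element: {name} {attributes}"]
    for child in children:
        lines.extend(describe(child, depth + 1))
    return lines


if __name__ == "__main__":
    print("\n".join(describe(parse(XML_DATA))))
```

Running the sketch lists the root element ``book'', its child ``chapter'' with the attribute ``id'', and below that the ``title'' element and the two strings of character data, each indented by its depth in the tree.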
