📄 2a_echo.html

XML_JAVA Guide. Book language: Simplified Chinese. Book type: Programming. License: Freeware. Book size: 377 KB
    <P>Element attributes are listed all together on a single line. If your window isn't really wide, you won't see them all.</P>
  </LI>
  <LI>
    <P>The single-tag empty element you defined (<CODE>&lt;item/&gt;</CODE>) is treated exactly the same as a two-tag empty element (<CODE>&lt;item&gt;&lt;/item&gt;</CODE>). It is, for all intents and purposes, identical. (It's just easier to type and consumes less space.)</P>
  </LI>
</UL>
<H3><A NAME=identifying></A>Identifying the Events</H3>
<P>This version of the echo program might be useful for displaying an XML file, but it's not telling you much about what's going on in the parser. The next step is to modify the program so that you see where the spaces and vertical lines are coming from.</P>
<BLOCKQUOTE>
  <P><B>Note:</B> The code discussed in this section is in <A HREF=work/Echo02.java><CODE>Echo02.java</CODE></A>. The output it produces is contained in <A HREF=work/Echo02-01.log><CODE>Echo02-01.log</CODE></A>.</P>
</BLOCKQUOTE>
<P>Make the changes highlighted below to identify the events as they occur:</P>
<PRE>    public void startDocument ()
    throws SAXException
    {<NEW><B>
        nl();
        nl();
        emit ("START DOCUMENT");
        nl(); </B></NEW>
        emit ("&lt;?xml version='1.0' encoding='UTF-8'?&gt;");
        <OLD><STRIKE>nl();</STRIKE></OLD>
    }

    public void endDocument ()
    throws SAXException
    {<NEW><B>
        nl(); emit ("END DOCUMENT");</B></NEW>
        try {
        ...
    }

    public void startElement (String name, AttributeList attrs)
    throws SAXException
    {<NEW><B>
        nl(); emit ("ELEMENT: ");</B></NEW>
        emit ("&lt;"+name);
        if (attrs != null) {
            for (int i = 0; i &lt; attrs.getLength (); i++) {
                <OLD><STRIKE>emit (" ");</STRIKE></OLD>
                <OLD><STRIKE>emit (attrs.getName(i)+"=\""+attrs.getValue (i)+"\"");</STRIKE></OLD><NEW><B>
                nl();
                emit("   ATTR: ");
                emit (attrs.getName (i));
                emit ("\t\"");
                emit (attrs.getValue (i));
                emit ("\"");</B></NEW>
            }
        }<NEW><B>
        if (attrs != null &amp;&amp; attrs.getLength() &gt; 0) nl();</B></NEW>
        emit ("&gt;");
    }

    public void endElement (String name)
    throws SAXException
    {<NEW><B>
        nl();
        emit ("END_ELM: ");</B></NEW>
        emit ("&lt;/"+name+"&gt;");
    }

    public void characters (char buf [], int offset, int len)
    throws SAXException
    {<NEW><B>
        nl(); emit ("CHARS: |");</B></NEW>
        String s = new String(buf, offset, len);
        emit (s);<NEW><B>
        emit ("|");</B></NEW>
    }</PRE>
<P>Compile and run this version of the program to produce a more informative output listing. The attributes are now shown one per line, which is nice. But, more importantly, output lines like this one:</P>
<BLOCKQUOTE>
  <PRE>CHARS: |    |</PRE>
</BLOCKQUOTE>
<P>show that the <CODE>characters</CODE> method is responsible for echoing both the spaces that create the indentation and the multiple newlines that separate the attributes.</P>
<BLOCKQUOTE>
  <P><B><A NAME=lineEndings></A>Note: </B>The XML specification requires all input line separators to be normalized to a single newline. 
The newline character is written as <CODE>\n</CODE> in Java and C and is the line terminator on Unix systems. It is also known as &quot;linefeed&quot;; Windows systems pair it with a carriage return (<CODE>\r\n</CODE>).</P>
</BLOCKQUOTE>
<H3><A NAME=compressing></A>Compressing the Output</H3>
<P>To make the output more readable, modify the program so that it echoes character data only when it contains something other than whitespace.</P>
<BLOCKQUOTE>
  <P><B>Note:</B> The code discussed in this section is in <A HREF=work/Echo03.java><CODE>Echo03.java</CODE></A>.</P>
</BLOCKQUOTE>
<P>Make the changes shown below to suppress output of characters that are all whitespace:</P>
<PRE>    public void characters (char buf [], int offset, int len)
    throws SAXException
    {
        <OLD><STRIKE>nl(); emit ("CHARS: |");</STRIKE></OLD><NEW><B>
        nl(); emit ("CHARS:   ");</B></NEW>
        String s = new String(buf, offset, len);
        <OLD><STRIKE>emit (s);</STRIKE></OLD>
        <OLD><STRIKE>emit ("|");</STRIKE></OLD><NEW><B>
        if (!s.trim().equals("")) emit (s);</B></NEW>
    }</PRE>
<P>If you run the program now, you will see that you have eliminated the indentation as well, because the indent space is part of the whitespace that precedes the start of an element. Add the code highlighted below to manage the indentation:</P>
<PRE>    static private Writer out;<NEW><B>
    private String indentString = "    "; // Amount to indent
    private int indentLevel = 0;</B></NEW>
    ...
    public void startElement (String name, AttributeList attrs)
    throws SAXException
    {<NEW><B>
        indentLevel++;</B></NEW>
        nl(); emit ("ELEMENT: ");
        ...
    }

    public void endElement (String name)
    throws SAXException
    {
        nl();
        emit ("END_ELM: ");
        emit ("&lt;/"+name+"&gt;");<NEW><B>
        indentLevel--;</B></NEW>
    }
    ...
    private void nl ()
    throws SAXException
    {
        ...
        try {
            out.write (lineEnd);<NEW><B>
            for (int i=0; i &lt; indentLevel; i++) out.write(indentString);</B></NEW>
        } catch (IOException e) {
        ...
    }</PRE>
<P>This code sets up an indent string, keeps track of the current indent level, and outputs the indent string whenever the <CODE>nl</CODE> method is called. If you set the indent string to &quot;&quot;, the output will be un-indented. (Try it. You'll see why it's worth the work to add the indentation.)</P>
<P>You'll be happy to know that you have reached the end of the &quot;mechanical&quot; code you have to add to the Echo program. From here on, you'll be doing things that give you more insight into how the parser works. The steps you've taken so far, though, have given you a lot of insight into how the parser sees the XML data it processes. They've also given you a helpful debugging tool you can use to see what the parser sees.</P>
<H3><A NAME=inspecting></A>Inspecting the Output</H3>
<P>The complete output for this version of the program is contained in <A HREF=work/Echo03-01.log><CODE>Echo03-01.log</CODE></A>. Part of that output is shown here:</P>
<PRE>    ELEMENT: &lt;slideshow
    ...
    CHARS:   
    CHARS:   
        ELEMENT: &lt;slide
        ...
        END_ELM: &lt;/slide&gt;
    CHARS:   
    CHARS:   </PRE>
<P>Note that the <CODE>characters</CODE> method was invoked twice in a row. Inspecting the source file <A HREF=samples/slideSample01.xml><CODE>slideSample01.xml</CODE></A> shows that there is a comment before the first slide. The first call to <CODE>characters</CODE> comes before that comment. The second call comes after. (Later on, you'll see how to be notified when the parser encounters a comment, although in most cases you won't need such notifications.)</P>
<P>Note, too, that the <CODE>characters</CODE> method is invoked after the first slide element, as well as before. 
When you are thinking in terms of hierarchically structured data, that seems odd. After all, you intended for the <CODE>slideshow</CODE> element to contain <CODE>slide</CODE> elements, not text. Later on, you'll see how to restrict the <CODE>slideshow</CODE> element using a DTD. When you do that, the <CODE>characters</CODE> method will no longer be invoked for the whitespace between the <CODE>slide</CODE> elements.</P>
<P>In the absence of a DTD, though, the parser must assume that any element it sees contains text like that in the first item element of the overview slide:</P>
<BLOCKQUOTE>
  <PRE>&lt;item&gt;Why &lt;em&gt;WonderWidgets&lt;/em&gt; are great&lt;/item&gt;</PRE>
</BLOCKQUOTE>
<P>Here, the hierarchical structure looks like this:</P>
<BLOCKQUOTE>
  <PRE>ELEMENT: &lt;item&gt;
CHARS:   Why 
    ELEMENT: &lt;em&gt;
    CHARS:   WonderWidgets
    END_ELM: &lt;/em&gt;
CHARS:    are great
END_ELM: &lt;/item&gt;</PRE>
</BLOCKQUOTE>
<H3><A NAME=docsAndData></A>Documents and Data</H3>
<P>In this example, it's clear that there are characters intermixed with the hierarchical structure of the elements. The fact that text can surround elements (or be prevented from doing so with a DTD or schema) helps to explain why you sometimes hear talk about &quot;XML data&quot; and other times hear about &quot;XML documents&quot;. XML comfortably handles both structured data and text documents that include markup. The only difference between the two is whether or not text is allowed between the elements.</P>
<BLOCKQUOTE>
  <P><B>Note: </B><BR>
    In an upcoming section of this tutorial, you will work with the <CODE>ignorableWhitespace</CODE> method in the <CODE>DocumentHandler</CODE> interface. This method can only be invoked when a DTD is present. If a DTD specifies that <CODE>slideshow</CODE> does not contain text, then all of the whitespace surrounding the <CODE>slide</CODE> elements is by definition ignorable. 
On the other hand, if <CODE>slideshow</CODE> can contain text (which must be assumed to be true in the absence of a DTD), then the parser must assume that spaces and lines it sees between the <CODE>slide</CODE> elements are significant parts of the document.</P>
</BLOCKQUOTE>
</BODY></HTML>