      isn't really wide, you won't see them all.</p>
  </li>
  <li> 
    <p> The single-tag empty element you defined (<code>&lt;item/&gt;</code>) 
      is treated exactly the same as a two-tag empty element (<code>&lt;item&gt;&lt;/item&gt;</code>). 
      It is, for all intents and purposes, identical. (It's just easier to type 
      and consumes less space.) </p>
  </li>
</ul>
<h3><a name="identifying"></a>Identifying the Events</h3>
<p>This version of the echo program might be useful for displaying an XML file, 
  but it's not telling you much about what's going on in the parser. The next 
  step is to modify the program so that you see where the spaces and vertical 
  lines are coming from.</p>
<blockquote> 
  <p><b>Note:</b> The code discussed in this section is in <a href="work/Echo02.java"><code>Echo02.java</code></a>. 
    The output it produces is contained in <a href="work/Echo02-01.log"><code>Echo02-01.log</code></a>. 
  </p>
</blockquote>
<p> Make the changes highlighted below to identify the events as they occur:</p>
<pre>    public void startDocument ()
    throws SAXException
    {
<new><b>        nl();
        nl(); 
        emit ("START DOCUMENT");
        nl(); 
</b></new>        emit ("&lt;?xml version='1.0' encoding='UTF-8'?&gt;");
        <old><strike>nl();</strike></old>
    }

    public void endDocument ()
    throws SAXException
    {
<new><b>        nl(); emit ("END DOCUMENT");
</b></new>        try {
         ...
    }

    public void startElement (String name, AttributeList attrs)
    throws SAXException
    {
<new><b>        nl(); emit ("ELEMENT: ");
</b></new>        emit ("&lt;"+name);
        if (attrs != null) {
            for (int i = 0; i &lt; attrs.getLength (); i++) {
                <old><strike>emit (" ");</strike></old>
                <old><strike>emit (attrs.getName(i)+"=\""+attrs.getValue (i)+"\"");</strike></old>
<new><b>                nl(); 
                emit("   ATTR: ");
                emit (attrs.getName (i));
                emit ("\t\"");
                emit (attrs.getValue (i));
                emit ("\"");
</b></new>            }
        }
<new><b>        if (attrs != null &amp;&amp; attrs.getLength() &gt; 0) nl();
</b></new>        emit ("&gt;");
    }

    public void endElement (String name)
    throws SAXException
    {
<new><b>        nl(); 
        emit ("END_ELM: ");
</b></new>        emit ("&lt;/"+name+"&gt;");
    }

    public void characters (char buf [], int offset, int len)
    throws SAXException
    {   
<new><b>        nl(); emit ("CHARS: |");     </b></new>
        String s = new String(buf, offset, len);
        emit (s);
<new><b>        emit ("|");
</b></new>    }
</pre>
<p>Compile and run this version of the program to produce a more informative output 
  listing. The attributes are now shown one per line, which is nice. But, more 
  importantly, output lines like this one:</p>
<blockquote> 
  <pre>CHARS: |



    |
</pre>
</blockquote>
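<p>show that the <code>characters</code> method is responsible for echoing both 
  the spaces that create the indentation and the newlines that separate one 
  line of markup from the next. (Whitespace inside a start tag, such as the 
  newlines between attributes, is never passed to <code>characters</code>.)</p>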
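<p>The effect is easy to reproduce in isolation. The following is a minimal sketch, 
  not the Echo program itself: it assumes the SAX2 <code>DefaultHandler</code> class 
  bundled with the JDK instead of the <code>DocumentHandler</code> interface used in 
  this tutorial, and the class name <code>WhitespaceEvents</code> and the sample 
  document are made up for illustration.</p>

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of the tutorial's Echo program.
class WhitespaceEvents {
    /** Collects everything passed to characters() and makes the whitespace visible. */
    static String visibleText(String xml) {
        StringBuilder text = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void characters(char[] buf, int offset, int len) {
                text.append(buf, offset, len);
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        // Show each newline as \n and each space as a dot.
        return text.toString().replace("\n", "\\n").replace(" ", ".");
    }

    public static void main(String[] args) {
        String xml = "<slideshow>\n\n    <slide/>\n</slideshow>";
        // prints: CHARS: |\n\n....\n|
        System.out.println("CHARS: |" + visibleText(xml) + "|");
    }
}
```

<p>The document contains no text at all, yet the handler still receives the 
  newlines and the four-space indent that surround the <code>slide</code> element.</p>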
<blockquote> 
  <p><b><a name="lineEndings"></a>Note: </b>The XML specification requires all 
    input line separators to be normalized to a single newline. The newline character 
    is the linefeed, written <code>\n</code> in Java and C, so a Windows-style 
    carriage return/linefeed pair (<code>\r\n</code>) reaches your application 
    as a single <code>\n</code>.</p>
</blockquote>
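<p>As a quick check of that normalization, the sketch below parses a string that 
  mixes <code>\r\n</code> and a lone <code>\r</code> and compares what the handler 
  receives. It again assumes the JDK's SAX2 <code>DefaultHandler</code>; the class 
  name <code>LineEndingDemo</code> is hypothetical.</p>

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of the tutorial's Echo program.
class LineEndingDemo {
    /** Returns the concatenated character data reported by the parser. */
    static String parsedText(String xml) {
        StringBuilder text = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void characters(char[] buf, int offset, int len) {
                text.append(buf, offset, len);
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        // Input uses a CR LF pair and a lone CR as line separators.
        String text = parsedText("<note>one\r\ntwo\rthree</note>");
        System.out.println(text.equals("one\ntwo\nthree")); // true
        System.out.println(text.indexOf('\r'));             // -1: no carriage returns survive
    }
}
```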
<h3><a name="compressing"></a>Compressing the Output</h3>
<p>To make the output more readable, modify the program so that it only outputs 
  characters containing something other than whitespace.</p>
<blockquote> 
  <p><b>Note:</b> The code discussed in this section is in <a href="work/Echo03.java"><code>Echo03.java</code></a>. 
  </p>
</blockquote>
<p>Make the changes shown below to suppress output of characters that are all 
  whitespace:</p>
<pre>    public void characters (char buf [], int offset, int len)
    throws SAXException
    {
        <old><strike>nl(); emit ("CHARS: |");</strike></old>
<new><b>        nl(); emit ("CHARS:   ");
</b></new>        String s = new String(buf, offset, len);
        <old><strike>emit (s);</strike></old>
        <old><strike>emit ("|");</strike></old>
<new><b>        if (!s.trim().equals("")) emit (s);
</b></new>    }
</pre>
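<p>The test itself can be tried on its own. In this small sketch the class and 
  method names are made up; the only thing taken from the listing above is the 
  <code>trim()</code> check. It relies on the fact that <code>String.trim()</code> 
  strips every character up to and including the space, which covers the space, 
  tab, carriage return, and linefeed that XML treats as whitespace.</p>

```java
// Hypothetical helper illustrating the trim() test used in characters().
class WhitespaceFilter {
    /** True when the slice holds nothing but characters that String.trim() strips. */
    static boolean isAllWhitespace(char[] buf, int offset, int len) {
        return new String(buf, offset, len).trim().isEmpty();
    }

    public static void main(String[] args) {
        char[] indent = "\n\n    ".toCharArray();
        char[] text = "  Why ".toCharArray();
        System.out.println(isAllWhitespace(indent, 0, indent.length)); // true
        System.out.println(isAllWhitespace(text, 0, text.length));     // false
    }
}
```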
<p>If you run the program now, you will see that you have eliminated the indentation 
  as well, because the indent space is part of the whitespace that precedes the 
  start of an element. Add the code highlighted below to manage the indentation:</p>
<pre>
    static private Writer	out;
    <new><b>
    private String indentString = "    "; // Amount to indent
    private int indentLevel = 0;
</b></new>
    ...

    public void startElement (String name, AttributeList attrs)
    throws SAXException
    {
<new><b>        indentLevel++;
</b></new>        nl(); emit ("ELEMENT: ");
        ...
    }

    public void endElement (String name)
    throws SAXException
    {
        nl(); 
        emit ("END_ELM: ");
        emit ("&lt;/"+name+"&gt;");
<new><b>        indentLevel--;
</b></new>    }
    ...
    private void nl ()
    throws SAXException
    {
        ...
        try {
            out.write (lineEnd);
<new><b>            for (int i=0; i &lt; indentLevel; i++) out.write(indentString);
          </b></new>
        } catch (IOException e) {
        ... 
    }
</pre>
<p>This code sets up an indent string, keeps track of the current indent level, 
  and outputs the indent string whenever the <code>nl</code> method is called. 
  If you set the indent string to &quot;&quot;, the output will be un-indented. 
  (Try it. You'll see why it's worth the work to add the indentation.)</p>
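<p>To see the bookkeeping without a parser in the way, here is a stripped-down 
  sketch of the same idea. It writes to a <code>StringBuilder</code> rather than 
  the program's <code>Writer</code>, and the class name <code>IndentingWriter</code> 
  and the <code>render</code> method are assumptions made for this example.</p>

```java
// Hypothetical stand-alone model of the nl()/indentLevel logic.
class IndentingWriter {
    private final StringBuilder out = new StringBuilder();
    private final String indentString; // Amount to indent per level
    private int indentLevel = 0;

    IndentingWriter(String indentString) {
        this.indentString = indentString;
    }

    /** Starts a new line and indents it to the current level. */
    void nl() {
        out.append('\n');
        for (int i = 0; i < indentLevel; i++) out.append(indentString);
    }

    void emit(String s) {
        out.append(s);
    }

    void startElement(String name) {
        indentLevel++;                 // go one level deeper before printing
        nl();
        emit("ELEMENT: <" + name + ">");
    }

    void endElement(String name) {
        nl();
        emit("END_ELM: </" + name + ">");
        indentLevel--;                 // back out after printing
    }

    /** Renders a fixed slideshow/slide nesting with the given indent string. */
    static String render(String indentString) {
        IndentingWriter w = new IndentingWriter(indentString);
        w.startElement("slideshow");
        w.startElement("slide");
        w.endElement("slide");
        w.endElement("slideshow");
        return w.out.toString();
    }

    public static void main(String[] args) {
        System.out.println(render("    ")); // indented
        System.out.println(render(""));     // flat, for comparison
    }
}
```

<p>Because the level is incremented before the start tag is printed and decremented 
  after the end tag, an element's start and end lines always share the same indent.</p>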
<p>You'll be happy to know that you have reached the end of the &quot;mechanical&quot; 
  code you have to add to the Echo program. From here on, you'll be doing things 
  that give you more insight into how the parser works. The steps you've taken 
  so far, though, have given you a lot of insight into how the parser sees the 
  XML data it processes. They have also given you a helpful debugging tool you 
  can use to see what the parser sees.</p>
<h3><a name="inspecting"></a>Inspecting the Output</h3>
<p>The complete output for this version of the program is contained in <a href="work/Echo03-01.log"><code>Echo03-01.log</code></a>. 
  Part of that output is shown here:</p>
<pre>    ELEMENT: &lt;slideshow
    ...
    CHARS:   
    CHARS:   
        ELEMENT: &lt;slide
        ...  
        END_ELM: &lt;/slide&gt;
    CHARS:   
    CHARS:   
</pre>
<p>Note that the <code>characters</code> method was invoked twice in a row. Inspecting 
  the source file <a href="samples/slideSample01.xml"><code>slideSample01.xml</code></a> 
  shows that there is a comment before the first slide. The first call to <code>characters</code> 
  comes before that comment. The second call comes after. (Later on, you'll see 
  how to be notified when the parser encounters a comment, although in most cases 
  you won't need such notifications.)</p>
<p>Note, too, that the <code>characters</code> method is invoked after the first 
  slide element, as well as before. When you are thinking in terms of hierarchically 
  structured data, that seems odd. After all, you intended for the <code>slideshow</code> 
  element to contain <code>slide</code> elements, not text. Later on, you'll see 
  how to restrict the <code>slideshow</code> element using a DTD. When you do 
  that, the parser can recognize this whitespace as ignorable, and the 
  <code>characters</code> method will no longer be invoked for it. </p>
<p>In the absence of a DTD, though, the parser must assume that any element it 
  sees contains text like that in the first item element of the overview slide:</p>
<blockquote>
  <pre>&lt;item&gt;Why &lt;em&gt;WonderWidgets&lt;/em&gt; are great&lt;/item&gt;</pre>
</blockquote>
<p>Here, the hierarchical structure looks like this:</p>
<blockquote> 
  <pre>ELEMENT: &lt;item&gt;
CHARS:   Why 
    ELEMENT: &lt;em&gt;
    CHARS:   WonderWidgets
    END_ELM: &lt;/em&gt;
CHARS:    are great
END_ELM: &lt;/item&gt;
</pre>
</blockquote>
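<p>The same event sequence can be captured programmatically. This is a sketch 
  under a few assumptions: it uses the JDK's SAX2 <code>DefaultHandler</code> 
  (whose <code>startElement</code> takes a qualified name and an 
  <code>Attributes</code> object) rather than the <code>DocumentHandler</code> 
  interface shown in this tutorial, it omits the indentation, and it buffers text 
  because a parser is allowed to deliver one run of text in several 
  <code>characters</code> calls. The class name <code>MixedContentEvents</code> is 
  hypothetical.</p>

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of the tutorial's Echo program.
class MixedContentEvents {
    /** Returns one labeled line per parser event, in document order. */
    static List<String> events(String xml) {
        List<String> events = new ArrayList<>();
        StringBuilder pending = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            // Emit any buffered text as a single CHARS event.
            private void flush() {
                if (pending.length() > 0) {
                    events.add("CHARS: |" + pending + "|");
                    pending.setLength(0);
                }
            }
            @Override
            public void startElement(String uri, String local, String qName, Attributes attrs) {
                flush();
                events.add("ELEMENT: <" + qName + ">");
            }
            @Override
            public void endElement(String uri, String local, String qName) {
                flush();
                events.add("END_ELM: </" + qName + ">");
            }
            @Override
            public void characters(char[] buf, int offset, int len) {
                pending.append(buf, offset, len);
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return events;
    }

    public static void main(String[] args) {
        String xml = "<item>Why <em>WonderWidgets</em> are great</item>";
        for (String event : events(xml)) System.out.println(event);
    }
}
```

<p>The vertical bars make it visible that the trailing space of <code>Why </code> 
  and the leading space of <code> are great</code> belong to the text of 
  <code>item</code>, not to <code>em</code>.</p>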
<h3><a name="docsAndData"></a>Documents and Data</h3>
<p>In this example, it's clear that there are characters intermixed with the hierarchical 
  structure of the elements. The fact that text can surround elements (or be prevented 
  from doing so with a DTD or schema) helps to explain why you sometimes hear 
  talk about &quot;XML data&quot; and other times hear about &quot;XML documents&quot;. 
  XML comfortably handles both structured data and text documents that include 
  markup. The only difference between the two is whether or not text is allowed 
  between the elements.</p>
<blockquote> 
  <p><b>Note: </b><br>
    In an upcoming section of this tutorial, you will work with the <code>ignorableWhitespace</code> 
    method in the <code>DocumentHandler</code> interface. This method can only 
    be invoked when a DTD is present. If a DTD specifies that <code>slideshow</code> 
    does not contain text, then all of the whitespace surrounding the <code>slide</code> 
    elements is by definition ignorable. On the other hand, if <code>slideshow</code> 
    can contain text (which must be assumed to be true in the absence of a DTD), 
    then the parser must assume that spaces and lines it sees between the <code>slide</code> 
    elements are significant parts of the document. </p>
</blockquote>
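<p>The difference can be measured with a short experiment. The sketch below is 
  illustrative only: it assumes the JDK's SAX2 <code>DefaultHandler</code>, turns 
  validation on when a DTD is supplied so that the parser is certain to consult 
  it, and uses a made-up two-line DTD in the internal subset. The class name 
  <code>IgnorableWhitespaceDemo</code> is hypothetical.</p>

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of the tutorial's Echo program.
class IgnorableWhitespaceDemo {
    /** Returns {chars seen by characters(), chars seen by ignorableWhitespace()}. */
    static int[] counts(String xml, boolean validating) {
        int[] counts = new int[2];
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void characters(char[] buf, int offset, int len) {
                counts[0] += len;
            }
            @Override
            public void ignorableWhitespace(char[] buf, int offset, int len) {
                counts[1] += len;
            }
        };
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setValidating(validating); // assumption: validate only when a DTD is given
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return counts;
    }

    public static void main(String[] args) {
        String body = "<slideshow>\n  <slide>one</slide>\n</slideshow>";
        // Made-up DTD: slideshow may contain only slide elements, no text.
        String dtd = "<!DOCTYPE slideshow [\n"
                   + "  <!ELEMENT slideshow (slide*)>\n"
                   + "  <!ELEMENT slide (#PCDATA)>\n"
                   + "]>\n";
        int[] without = counts(body, false);
        int[] with = counts(dtd + body, true);
        System.out.println("no DTD:   characters=" + without[0] + " ignorable=" + without[1]);
        System.out.println("with DTD: characters=" + with[0] + " ignorable=" + with[1]);
    }
}
```

<p>Without the DTD, all seven characters (the word <code>one</code> plus the 
  surrounding newlines and indent) arrive through <code>characters</code>. With 
  it, only the three characters of <code>one</code> do, and the four whitespace 
  characters inside <code>slideshow</code> are routed to 
  <code>ignorableWhitespace</code>.</p>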
<blockquote>
<hr size=4>
</blockquote>
<p>
<p> 
<table width="100%">
<tr>
    <td align=left> <a href="1_write.html"><img src="../images/PreviousArrow.gif" width=26 height=26 align=top border=0 alt="Previous | "></a><a
href="2b_echo.html"><img src="../images/NextArrow.gif" width=26 height=26 align=top border=0 alt="Next | "></a><a href="../alphaIndex.html"><img src="../images/xml_IDX.gif" width=26 height=26 align=top border=0 alt="Index | "></a><a href="../TOC.html"><img
src="../images/xml_TOC.gif" width=26 height=26 align=top border=0 alt="TOC | "></a><a href="../index.html"><img
src="../images/xml_Top.gif" width=26 height=26 align=top border=0 alt="Top | "></a> 
    </td>
<td align=right><strong><em><a href="index.html">Top</a></em></strong> <a href="../TOC.html#intro"><strong><em>Contents</em></strong></a> 
      <a href="../alphaIndex.html"><strong><em>Index</em></strong></a> <a href="../glossary.html"><strong><em>Glossary</em></strong></a>
</td>
</tr>
</table>
</body>
</html>