<?xml version="1.0" encoding="ISO-8859-1"?><!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> <head> <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" /> <meta http-equiv="Content-Style-Type" content="text/css" /> <title>Echoing an XML File with the SAX Parser</title> <link rel="StyleSheet" href="document.css" type="text/css" media="all" /> <link rel="StyleSheet" href="catalog.css" type="text/css" media="all" /> <link rel="Table of Contents" href="J2EETutorialTOC.html" /> <link rel="Previous" href="JAXPSAX2.html" /> <link rel="Next" href="JAXPSAX4.html" /> <link rel="Index" href="J2EETutorialIX.html" /> </head> <body>
<img src="images/blueline.gif" width="550" height="8" ALIGN="BOTTOM" NATURALSIZEFLAG="3" ALT="Divider"> <blockquote><a name="wp64190"> </a><h2 class="pHeading1">Echoing an XML File with the SAX Parser</h2><a name="wp64191"> </a><p class="pBody">In real life, you are going to have little need to echo an XML file with a SAX parser. Usually, you'll want to process the data in some way in order to do something useful with it. (If you want to echo it, it's easier to build a DOM tree and use that for output.) But echoing an XML structure is a great way to see the SAX parser in action, and it can be useful for debugging. </p><a name="wp64192"> </a><p class="pBody">In this exercise, you'll echo SAX parser events to <code class="cCode">System.out</code>. Consider it the "Hello World" version of an XML-processing program. It shows you how to use the SAX parser to get at the data, and then echoes it to show you what you've got. </p><hr><a name="wp64193"> </a><p class="pNote">Note: The code discussed in this section is in <code class="cCode"><a href="../examples/jaxp/sax/samples/Echo01.java" target="_blank">Echo01.java</a></code>. The file it operates on is <code class="cCode"><a href="../examples/xml/samples/slideSample01.xml" target="_blank">slideSample01.xml</a></code>, as described in <a href="IntroXML4.html#wp67589">Writing a Simple XML File</a>. (The browsable version is <code class="cCode"><a href="../examples/xml/samples/slideSample01-xml.html" target="_blank">slideSample01-xml.html</a></code>.) 
</p><hr><a name="wp64195"> </a><h3 class="pHeading2">Creating the Skeleton</h3><a name="wp64196"> </a><p class="pBody">Start by creating a file named<span style="font-weight: bold"> </span><code class="cCode">Echo.java</code> and enter the skeleton for the application:</p><div class="pPreformattedRelative"><pre class="pPreformattedRelative"><code class="cCode">public class Echo</code>
{
  public static void main(String argv[])
  {<a name="wp64197"> </a>
  }<a name="wp64198"> </a>
}<a name="wp64199"> </a></pre></div><a name="wp64200"> </a><p class="pBody">Since we're going to run it standalone, we need a main method. And we need command-line arguments so we can tell the application which file to echo.</p><a name="wp64202"> </a><h3 class="pHeading2">Importing Classes</h3><a name="wp64203"> </a><p class="pBody">Next, add the import statements for the classes the application will use:</p><div class="pPreformattedRelative"><pre class="pPreformattedRelative">import java.io.*;
import org.xml.sax.*;
import org.xml.sax.helpers.DefaultHandler;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;<a name="wp64204"> </a>

public class Echo
{
  ...<a name="wp64205"> </a></pre></div><a name="wp64206"> </a><p class="pBody">The classes in <code class="cCode">java.io</code>, of course, are needed to do output. The <code class="cCode">org.xml.sax</code> package defines all the interfaces we use for the SAX parser. The <code class="cCode">SAXParserFactory</code> class creates the instance we use. It throws a <code class="cCode">ParserConfigurationException</code> if it is unable to produce a parser that matches the specified configuration of options. (You'll see more about the configuration options later.) The <code class="cCode">SAXParser</code> is what the factory returns for parsing, and the <code class="cCode">DefaultHandler</code> defines the class that will handle the SAX events that the parser generates. 
</p><a name="wp64208"> </a><h3 class="pHeading2">Setting up for I/O</h3><a name="wp64209"> </a><p class="pBody">The first order of business is to process the command line argument, get the name of the file to echo, and set up the output stream. Add the text highlighted below to take care of those tasks and do a bit of additional housekeeping: </p><div class="pPreformattedRelative"><pre class="pPreformattedRelative">public static void main(String argv[])
{<code class="cCodeBold">
  if (argv.length != 1) {
    System.err.println("Usage: cmd filename");
    System.exit(1);
  }
  try {
    // Set up output stream
    out = new OutputStreamWriter(System.out, "UTF8");
  } catch (Throwable t) {
    t.printStackTrace();
  }
  System.exit(0);</code>
}

<code class="cCodeBold">static private Writer out;</code><a name="wp64210"> </a></pre></div><a name="wp64213"> </a><p class="pBody">When we create the output stream writer, we are selecting the UTF-8 character encoding. We could also have chosen US-ASCII, or UTF-16, which the Java platform also supports. For more information on these character sets, see <a href="Encodings.html#wp64176">Java Encoding Schemes</a>.</p><a name="wp64218"> </a><h3 class="pHeading2">Implementing the ContentHandler Interface</h3><a name="wp64219"> </a><p class="pBody">The most important interface for our current purposes is the <code class="cCode">ContentHandler</code> interface. That interface requires a number of methods that the SAX parser invokes in response to different parsing events. The major event handling methods are: <code class="cCode">startDocument</code>, <code class="cCode">endDocument</code>, <code class="cCode">startElement</code>, <code class="cCode">endElement</code>, and <code class="cCode">characters</code>. </p><a name="wp64220"> </a><p class="pBody">The easiest way to implement that interface is to extend the <code class="cCode">DefaultHandler</code> class, defined in the <code class="cCode">org.xml.sax.helpers</code> package. 
That class provides do-nothing methods for all of the <code class="cCode">ContentHandler</code> events. Enter the code highlighted below to extend that class:</p><div class="pPreformattedRelative"><pre class="pPreformattedRelative">public class Echo <code class="cCodeBold">extends DefaultHandler</code>
{
  ...
}<a name="wp67414"> </a></pre></div><hr><a name="wp67415"> </a><p class="pNote">Note: <code class="cCode">DefaultHandler</code> also defines do-nothing methods for the other major events, defined in the <code class="cCode">DTDHandler</code>, <code class="cCode">EntityResolver</code>, and <code class="cCode">ErrorHandler</code> interfaces. You'll learn more about those methods as we go along.</p><hr><a name="wp64224"> </a><p class="pBody">Each of the <code class="cCode">ContentHandler</code> methods is declared by the interface to throw a <code class="cCode">SAXException</code>, so your implementation may throw one. An exception thrown here is sent back to the parser, which sends it on to the code that invoked the parser. In the current program, that means it winds up back at the <code class="cCode">Throwable</code> exception handler at the bottom of the <code class="cCode">main</code> method. </p><a name="wp64225"> </a><p class="pBody">When a start tag or end tag is encountered, the name of the tag is passed as a <code class="cCode">String</code> to the <code class="cCode">startElement</code> or <code class="cCode">endElement</code> method, as appropriate. When a start tag is encountered, any attributes it defines are also passed in an <code class="cCode">Attributes</code> list. Characters found within the element are passed as an array of characters, along with the number of characters (<code class="cCode">length</code>) and an offset into the array that points to the first character.</p><a name="wp64227"> </a><h3 class="pHeading2">Setting up the Parser</h3><a name="wp64228"> </a><p class="pBody">Now (at last) you're ready to set up the parser. 
Add the text highlighted below to set it up and get it started:</p><div class="pPreformattedRelative"><pre class="pPreformattedRelative">public static void main(String argv[])
{
  if (argv.length != 1) {
    System.err.println("Usage: cmd filename");
    System.exit(1);
  }<a name="wp64229"> </a>
<code class="cCodeBold">  // Use an instance of ourselves as the SAX event handler
  DefaultHandler handler = new Echo();</code><a name="wp64230"> </a>
<code class="cCodeBold">  // Use the default (non-validating) parser
  SAXParserFactory factory = SAXParserFactory.newInstance();</code>
  try {
    // Set up output stream
    out = new OutputStreamWriter(System.out, "UTF8");<a name="wp64231"> </a>
<code class="cCodeBold">    // Parse the input
    SAXParser saxParser = factory.newSAXParser();
    saxParser.parse( new File(argv[0]), handler );</code><a name="wp64232"> </a>
  } catch (Throwable t) {
    t.printStackTrace();
  }
  System.exit(0);
}<a name="wp64233"> </a></pre></div><a name="wp64234"> </a><p class="pBody">With these lines of code, you created a <code class="cCode">SAXParserFactory</code> instance, as determined by the setting of the <code class="cCode">javax.xml.parsers.SAXParserFactory</code> system property. You then got a parser from the factory and gave the parser an instance of this class to handle the parsing events, telling it which input file to process.</p><hr><a name="wp64235"> </a><p class="pNote">Note: The <code class="cCode">javax.xml.parsers.SAXParser</code> class is a wrapper that defines a number of convenience methods. It wraps the (somewhat less friendly) <code class="cCode">org.xml.sax.Parser</code> object. If needed, you can obtain that parser using the <code class="cCode">SAXParser</code>'s <code class="cCode">getParser()</code> method.</p><hr><a name="wp64236"> </a><p class="pBody">For now, you are simply catching any exception that the parser might throw. 
You'll learn more about error processing in a later section of the tutorial, <a href="JAXPSAX5.html#wp64579">Handling Errors with the Nonvalidating Parser</a>.</p><a name="wp64241"> </a><h3 class="pHeading2">Writing the Output</h3><a name="wp64242"> </a><p class="pBody">The <code class="cCode">ContentHandler</code> methods throw <code class="cCode">SAXException</code>s but not <code class="cCode">IOException</code>s, which can occur while writing. The <code class="cCode">SAXException</code> can wrap another exception, though, so it makes sense to do the output in a method that takes care of the exception-handling details. Add the code highlighted below to define an <code class="cCode">emit</code> method that does that:</p><div class="pPreformattedRelative"><pre class="pPreformattedRelative"><code class="cCode">static private Writer out;</code><a name="wp64243"> </a>

<code class="cCodeBold">private void emit(String s)
throws SAXException
{
  try {
    out.write(s);
    out.flush();
  } catch (IOException e) {
    throw new SAXException("I/O error", e);
  }
}</code>
...<a name="wp64244"> </a></pre></div><a name="wp64245"> </a><p class="pBody">When <code class="cCode">emit</code> is called, any I/O error is wrapped in a <code class="cCode">SAXException</code> along with a message that identifies it. That exception is then thrown back to the SAX parser. You'll learn more about SAX exceptions later on. For now, keep in mind that <code class="cCode">emit</code> is a small method that handles the string output. (You'll see it called a lot in the code ahead.)</p><a name="wp64247"> </a><h3 class="pHeading2">Spacing the Output</h3><a name="wp64248"> </a><p class="pBody">Here is another bit of infrastructure we need before doing some real processing. 
Add the code highlighted below to define a <code class="cCode">nl()</code> method that writes the kind of line-ending character used by the current system:</p><div class="pPreformattedRelative"><pre class="pPreformattedRelative">private void emit(String s)
  ...
}<a name="wp64249"> </a>

<code class="cCodeBold">private void nl()
throws SAXException
{
  String lineEnd = System.getProperty("line.separator");
  try {
    out.write(lineEnd);
  } catch (IOException e) {
    throw new SAXException("I/O error", e);
  }</code>
}<a name="wp64250"> </a></pre></div><hr><a name="wp64251"> </a><p class="pNote">Note: Although it seems like a bit of a nuisance, you will be invoking <code class="cCode">nl()</code> many times in the code ahead. Defining it now will simplify the code later on. It also provides a place to indent the output when we get to that section of the tutorial.</p><hr><a name="wp64253"> </a><h3 class="pHeading2">Handling Content Events</h3><a name="wp64254"> </a><p class="pBody">Finally, let's write some code that actually processes the <code class="cCode">ContentHandler</code> events. </p><a name="wp71040"> </a><h4 class="pHeading3">Document Events</h4><a name="wp71041"> </a><p class="pBody">Add the code highlighted below to handle the start-document and end-document events:</p><div class="pPreformattedRelative"><pre class="pPreformattedRelative">static private Writer out;

<code class="cCodeBold">public void startDocument()
throws SAXException
{
  emit("&lt;?xml version='1.0' encoding='UTF-8'?&gt;");
  nl();
}

public void endDocument()
throws SAXException
{