package org.webharvest.utils;
//XMLWriter.java - serialize an XML document.
//Written by David Megginson, david@megginson.com
//NO WARRANTY! This class is in the public domain.
//Modified by John Cowan and Leigh Klotz for the TagSoup project. Still in the public domain.
//New features:
// it is a LexicalHandler
// it prints a comment if the LexicalHandler#comment method is called
// it supports certain XSLT output properties using get/setOutputProperty
//$Id: XMLWriter.java,v 1.1 2004/01/28 05:35:43 joe Exp $
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.util.Enumeration;
import java.util.Hashtable;
import java.util.Properties;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.NamespaceSupport;
import org.xml.sax.helpers.XMLFilterImpl;
import org.xml.sax.ext.LexicalHandler;
/**
* Filter to write an XML document from a SAX event stream.
*
* <p>
* This class can be used by itself or as part of a SAX event stream: it takes
* as input a series of SAX2 ContentHandler events and uses the information in
* those events to write an XML document. Since this class is a filter, it can
* also pass the events on down a filter chain for further processing (you can
* use the XMLWriter to take a snapshot of the current state at any point in a
* filter chain), and it can be used directly as a ContentHandler for a SAX2
* XMLReader.
* </p>
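*
* <p>
* As a minimal sketch of the filter usage (the file name <var>input.xml</var>
* is a placeholder, and the parser is assumed to come from
* <code>org.xml.sax.helpers.XMLReaderFactory</code>), the writer can wrap an
* XMLReader and echo the parsed document to standard output:
* </p>
*
* <pre>
* XMLReader reader = XMLReaderFactory.createXMLReader();
* XMLWriter w = new XMLWriter(reader);
* w.parse("input.xml");
* </pre>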
*
* <p>
* The client creates a document by invoking the methods for standard SAX2
* events, always beginning with the {@link #startDocument startDocument} method
* and ending with the {@link #endDocument endDocument} method. There are
* convenience methods provided so that clients do not have to create empty
* attribute lists or provide empty strings as parameters; for example, the
* method invocation
* </p>
*
* <pre>
* w.startElement("foo");
* </pre>
*
* <p>
* is equivalent to the regular SAX2 ContentHandler method
* </p>
*
* <pre>
* w.startElement("", "foo", "", new AttributesImpl());
* </pre>
*
* <p>
* except that it is more efficient because it does not allocate a new empty
* attribute list each time. The following code will send a simple XML document
* to standard output:
* </p>
*
* <pre>
* XMLWriter w = new XMLWriter();
*
* w.startDocument();
* w.startElement("greeting");
* w.characters("Hello, world!");
* w.endElement("greeting");
* w.endDocument();
* </pre>
*
* <p>
* The resulting document will look like this:
* </p>
*
* <pre>
* <?xml version="1.0" standalone="yes"?>
*
* <greeting>Hello, world!</greeting>
* </pre>
*
* <p>
* In fact, there is an even simpler convenience method, <var>dataElement</var>,
* designed for writing elements that contain only character data, so the code
* to generate the document could be shortened to
* </p>
*
* <pre>
* XMLWriter w = new XMLWriter();
*
* w.startDocument();
* w.dataElement("greeting", "Hello, world!");
* w.endDocument();
* </pre>
*
* <h2>Whitespace</h2>
*
* <p>
* According to the XML Recommendation, <em>all</em> whitespace in an XML
* document is potentially significant to an application, so this class never
* adds newlines or indentation. If you insert three elements in a row, as in
* </p>
*
* <pre>
* w.dataElement("item", "1");
* w.dataElement("item", "2");
* w.dataElement("item", "3");
* </pre>
*
* <p>
* you will end up with
* </p>
*
* <pre>
* <item>1</item><item>2</item><item>3</item>
* </pre>
*
* <p>
* You need to invoke one of the <var>characters</var> methods explicitly to
* add newlines or indentation. Alternatively, you can use
* {@link com.megginson.sax.DataWriter DataWriter}, which is derived from this
* class -- it is optimized for writing purely data-oriented (or field-oriented)
* XML, and does automatic linebreaks and indentation (but does not support
* mixed content properly).
* </p>
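*
* <p>
* For instance, a sketch that puts a line break between two items by writing
* the newline explicitly as character data:
* </p>
*
* <pre>
* w.dataElement("item", "1");
* w.characters("\n");
* w.dataElement("item", "2");
* </pre>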
*
*
* <h2>Namespace Support</h2>
*
* <p>
* The writer contains extensive support for XML Namespaces, so that a client
* application does not have to keep track of prefixes and supply <var>xmlns</var>
* attributes. By default, the XML writer will generate Namespace declarations
* using prefixes of the form _NS1, _NS2, etc., wherever they are needed, as in
* the following example:
* </p>
*
* <pre>
* w.startDocument();
* w.emptyElement("http://www.foo.com/ns/", "foo");
* w.endDocument();
* </pre>
*
* <p>
* The resulting document will look like this:
* </p>
*
* <pre>
* <?xml version="1.0" standalone="yes"?>
*
* <_NS1:foo xmlns:_NS1="http://www.foo.com/ns/"/>
* </pre>
*
* <p>
* In many cases, document authors will prefer to choose their own prefixes
* rather than using the (ugly) default names. The XML writer allows two methods
* for selecting prefixes:
* </p>
*
* <ol>
* <li>the qualified name</li>
* <li>the {@link #setPrefix setPrefix} method.</li>
* </ol>
*
* <p>
* Whenever the XML writer finds a new Namespace URI, it checks to see if a
* qualified (prefixed) name is also available; if so it attempts to use the
* name's prefix (as long as the prefix is not already in use for another
* Namespace URI).
* </p>
*
* <p>
* Before writing a document, the client can also pre-map a prefix to a
* Namespace URI with the setPrefix method:
* </p>
*
* <pre>
* w.setPrefix("http://www.foo.com/ns/", "foo");
* w.startDocument();
* w.emptyElement("http://www.foo.com/ns/", "foo");
* w.endDocument();
* </pre>
*
* <p>
* The resulting document will look like this:
* </p>
*
* <pre>
* <?xml version="1.0" standalone="yes"?>
*
* <foo:foo xmlns:foo="http://www.foo.com/ns/"/>
* </pre>
*
* <p>
* The default Namespace simply uses an empty string as the prefix:
* </p>
*
* <pre>
* w.setPrefix("http://www.foo.com/ns/", "");
* w.startDocument();
* w.emptyElement("http://www.foo.com/ns/", "foo");
* w.endDocument();
* </pre>
*
* <p>
* The resulting document will look like this:
* </p>
*
* <pre>
* <?xml version="1.0" standalone="yes"?>
*
* <foo xmlns="http://www.foo.com/ns/"/>
* </pre>
*
* <p>
* By default, the XML writer will not declare a Namespace until it is actually
* used. Sometimes, this approach will create a large number of Namespace
* declarations, as in the following example:
* </p>
*
* <pre>
* <?xml version="1.0" standalone="yes"?>
*
* <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
* <rdf:Description about="http://www.foo.com/ids/books/12345">
* <dc:title xmlns:dc="http://www.purl.org/dc/">A Dark Night</dc:title>
* <dc:creator xmlns:dc="http://www.purl.org/dc/">Jane Smith</dc:creator>
* <dc:date xmlns:dc="http://www.purl.org/dc/">2000-09-09</dc:date>
* </rdf:Description>
* </rdf:RDF>
* </pre>
*
* <p>
* The "rdf" prefix is declared only once, because the RDF Namespace is used by
* the root element and can be inherited by all of its descendants; the "dc"
* prefix, on the other hand, is declared three times, because no higher element
* uses the Namespace. To solve this problem, you can instruct the XML writer to
* predeclare Namespaces on the root element even if they are not used there:
* </p>
*
* <pre>
* w.forceNSDecl("http://www.purl.org/dc/");
* </pre>
*
* <p>
* Now, the "dc" prefix will be declared on the root element even though it's
* not needed there, and can be inherited by its descendants:
* </p>
*
* <pre>
* <?xml version="1.0" standalone="yes"?>
*
* <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
* xmlns:dc="http://www.purl.org/dc/">
* <rdf:Description about="http://www.foo.com/ids/books/12345">
* <dc:title>A Dark Night</dc:title>
* <dc:creator>Jane Smith</dc:creator>
* <dc:date>2000-09-09</dc:date>
* </rdf:Description>
* </rdf:RDF>
* </pre>
*
* <p>
* This approach is also useful for declaring Namespace prefixes that will be used by
* qualified names appearing in attribute values or character data.
* </p>
*
* @author David Megginson, david@megginson.com
* @version 0.2
* @see org.xml.sax.XMLFilter
* @see org.xml.sax.ContentHandler
*/
public class XMLWriter extends XMLFilterImpl implements LexicalHandler {
// //////////////////////////////////////////////////////////////////
// Constructors.
// //////////////////////////////////////////////////////////////////
/**
* Create a new XML writer.
*
* <p>
* Write to standard output.
* </p>
*/
public XMLWriter() {
    init(null);
}
/**
* Create a new XML writer.
*
* <p>
* Write to the writer provided.
* </p>
*
* @param writer
* The output destination, or null to use standard output.
*/
public XMLWriter(Writer writer) {
    init(writer);
}
/**
* Create a new XML writer.
*
* <p>
* Use the specified XML reader as the parent.
* </p>
*
* @param xmlreader
* The parent in the filter chain, or null for no parent.
*/
public XMLWriter(XMLReader xmlreader) {
    super(xmlreader);
    init(null);
}
/**
* Create a new XML writer.
*
* <p>
* Use the specified XML reader as the parent, and write to the specified
* writer.
* </p>
*
* @param xmlreader
* The parent in the filter chain, or null for no parent.
* @param writer
* The output destination, or null to use standard output.
*/
public XMLWriter(XMLReader xmlreader, Writer writer) {
    super(xmlreader);
    init(writer);
}
/**
* Internal initialization method.
*
* <p>
* All of the public constructors invoke this method.
* </p>
*
* @param writer
* The output destination, or null to use standard output.
*/
private void init(Writer writer) {
    setOutput(writer);
    nsSupport = new NamespaceSupport();
    prefixTable = new Hashtable();
    forcedDeclTable = new Hashtable();
    doneDeclTable = new Hashtable();
    outputProperties = new Properties();
}
// //////////////////////////////////////////////////////////////////
// Public methods.
// //////////////////////////////////////////////////////////////////
/**
* Reset the writer.
*
* <p>
* This method is especially useful if the writer throws an exception before
* it is finished, and you want to reuse the writer for a new document. It
* is usually a good idea to invoke {@link #flush flush} before resetting
* the writer, to make sure that no output is lost.
* </p>
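*
* <p>
* A minimal sketch of that sequence, assuming the caller handles the
* IOException that <var>flush</var> may throw:
* </p>
*
* <pre>
* w.flush();
* w.reset();
* </pre>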
*
* <p>
* This method is invoked automatically by the
* {@link #startDocument startDocument} method before writing a new
* document.
* </p>
*
* <p>
* <strong>Note:</strong> this method will <em>not</em> clear the prefix
* or URI information in the writer or the selected output writer.
* </p>
*
* @see #flush
*/
public void reset() {
    elementLevel = 0;
    prefixCounter = 0;
    nsSupport.reset();
}
/**
* Flush the output.
*
* <p>
* This method flushes the output stream. It is especially useful when you
* need to make certain that the entire document has been written to output
* but do not want to close the output stream.
* </p>
*
* <p>
* This method is invoked automatically by the
* {@link #endDocument endDocument} method after writing a document.
* </p>
*
* @see #reset
*/
public void flush() throws IOException {
    output.flush();
}
/**
* Set a new output destination for the document.
*
* @param writer
* The output destination, or null to use standard output.
* @see #flush
*/
public void setOutput(Writer writer) {
    if (writer == null) {
        output = new OutputStreamWriter(System.out);
    } else {
        output = writer;
    }
}
/**
* Specify a preferred prefix for a Namespace URI.
*
* <p>
* Note that this method does not actually force the Namespace to be
* declared; to do that, use the {@link #forceNSDecl(java.lang.String)
* forceNSDecl} method as well.
* </p>
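*
* <p>
* As a sketch, choosing the prefix "dc" for a Namespace and also forcing its
* declaration on the root element takes both calls (the URI is the one used
* in the class-level example):
* </p>
*
* <pre>
* w.setPrefix("http://www.purl.org/dc/", "dc");
* w.forceNSDecl("http://www.purl.org/dc/");
* </pre>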
*
* @param uri
* The Namespace URI.
* @param prefix
* The preferred prefix, or "" to select the default Namespace.
* @see #getPrefix
* @see #forceNSDecl(java.lang.String)
* @see #forceNSDecl(java.lang.String,java.lang.String)
*/
public void setPrefix(String uri, String prefix) {
    prefixTable.put(uri, prefix);
}