<!-- -*-html-*-
$Source: /usr/local/cvsroot/BigC++/29/index.html,v $
$Revision: 1.2 $
Big C++, chptr 29
editor: cols=80, tabstop=2
Kurt Schmidt, kschmidt@cs.drexel.edu
NOTES
- 3 spaces are used for each indent in examples
REVISIONS:
$Log: index.html,v $
Revision 1.2 2004/04/20 22:03:07 kurt
Made slides smaller (view %150, on 1024x768 display)
Revision 1.1 2004/03/01 06:24:23 kurt
Creation
-->
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<script language="JavaScript" src="./config.js"></script>
<script language="JavaScript" src="./pageFormat.js"></script>
<script><!-- // Set title on page
title()
//--></script>
</head>
<body>
<hr><h2><font color="#009999" size="+3">Chapter 29 - XML</font></h2>
<font size="+1">
<script><!--
image( "cover.png" )
//--></script>
</font>
<hr><h2><font color="#009999" size="+2">Chapter Goals</font></h2>
<hr noshade size="4" color="#009999">
<font size="+1">
<ul>
<li>To understand XML elements and attributes</li>
<li>To understand the concept of an XML parser</li>
<li>To write programs that load and save XML documents</li>
<li>To design Document Type Definitions for XML documents</li>
</ul>
</font>
<hr><h2><font color="#009999">Chapter 29 - XML</font></h2>
<font size="+1">
<ul>
<li>Extensible Markup Language (XML)</li>
<li>Describes structure, not appearance</li>
<li>For transmitting data</li>
<li>Encode complex data</li>
<li>Easy to decode</li>
<li>Parsers widely available</li>
</ul>
</font>
<hr><h2><font color="#009999">29.1.1 Advantages of XML</font></h2>
<font size="+1">
<ul>
<li>E.g., naive encoding of names and salaries:
<blockquote><tt>
Mary Miller 64500<br>
Jim J. Jones Jr 42000
</tt></blockquote>
</li>
<li>In XML:
<blockquote>
<pre>&lt;employee&gt;
   &lt;name&gt;Mary Miller&lt;/name&gt;
   &lt;salary&gt;64500&lt;/salary&gt;
&lt;/employee&gt;
&lt;employee&gt;
   &lt;name&gt;Jim J. Jones Jr&lt;/name&gt;
   &lt;salary&gt;42000&lt;/salary&gt;
&lt;/employee&gt;</pre>
</blockquote>
</li>
</ul>
</font>
<hr><h2><font color="#009999">29.1.1 Advantages of XML (cont.)</font></h2>
<font size="+1">
<ul>
<li>Not ambiguous</li>
<li>Easily read by humans</li>
<li>Resilient to change</li>
<li>E.g., add hire date:
<blockquote><tt>
Mary Miller 64500 1982
</tt></blockquote>
<ul>
<li>Old code will choke</li>
<li>New code won't parse old data</li>
</ul>
</li>
</ul>
</font>
<hr><h2><font color="#009999">29.1.1 Advantages of XML (cont.)</font></h2>
<font size="+1">
<ul>
<li>To add to XML:
<blockquote>
<pre>&lt;employee&gt;
   &lt;name&gt;Mary Miller&lt;/name&gt;
   &lt;salary&gt;64500&lt;/salary&gt;
   &lt;year&gt;1982&lt;/year&gt;
&lt;/employee&gt;</pre>
</blockquote>
</li>
</ul>
</font>
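<font size="+1">
<p>A minimal sketch of why the tagged format tolerates a new field, using only
the standard library. The helper <tt>element_text</tt> is hypothetical and
finds tags by plain string matching; it is an illustration, not a substitute
for a real XML parser.</p>

```cpp
#include <string>

// Hypothetical helper: returns the text between <tag> and </tag>,
// or an empty string if the element is not present.
// A real program would use an XML parser instead of string searching.
std::string element_text(const std::string& xml, const std::string& tag)
{
   const std::string open = "<" + tag + ">";
   const std::string close = "</" + tag + ">";

   std::string::size_type start = xml.find(open);
   if (start == std::string::npos) return "";
   start += open.length();

   std::string::size_type end = xml.find(close, start);
   if (end == std::string::npos) return "";

   return xml.substr(start, end - start);
}
```

<p>Code that asks only for <tt>salary</tt> reads records with or without the
<tt>year</tt> element, because each value is located by its tag and not by its
position on the line.</p>
</font>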
<hr><h2><font color="#009999">Quality Tip 29.1</font></h2>
<font size="+1">
<hr color="#00ffff" size="6">
<p><font color="#009999">XML is Stricter Than HTML</font></p>
<ul>
<li>Tags are case sensitive</li>
<li>Every tag must have a matching close tag
<ul>
<li>Or end with <tt>/&gt;</tt>
<blockquote><tt>
&lt;img src="hamster.jpeg"/&gt;
</tt></blockquote>
</li>
</ul>
</li>
<li>Attribute values <b>must</b> be enclosed in quotes</li>
</ul>
</font>
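<font size="+1">
<p>The tag-matching rules above can be sketched as a small checker. The
function <tt>tags_match</tt> is hypothetical and deliberately simplified: it
assumes no <tt>&gt;</tt> appears inside attribute values, it does not handle
comments or CDATA sections, and it does not check that attribute values are
quoted.</p>

```cpp
#include <string>
#include <vector>

// Hypothetical checker: true if every start tag has a matching,
// case-sensitive end tag or is self-closing (ends with "/>").
// Simplified: no comments, no CDATA, no '>' inside attribute values.
bool tags_match(const std::string& xml)
{
   std::vector<std::string> open_tags;
   std::string::size_type pos = 0;
   while ((pos = xml.find('<', pos)) != std::string::npos)
   {
      std::string::size_type end = xml.find('>', pos);
      if (end == std::string::npos) return false;
      std::string tag = xml.substr(pos + 1, end - pos - 1);
      pos = end + 1;

      if (tag.empty()) return false;
      if (tag[0] == '?') continue;                 // header such as <?xml ...?>
      if (tag[tag.length() - 1] == '/') continue;  // self-closing element

      if (tag[0] == '/')
      {
         // End tag: must equal the most recently opened tag name
         if (open_tags.empty() || open_tags.back() != tag.substr(1))
            return false;
         open_tags.pop_back();
      }
      else
      {
         // Start tag: the name ends at the first space, if any
         open_tags.push_back(tag.substr(0, tag.find(' ')));
      }
   }
   return open_tags.empty();
}
```

<p>Because names are compared exactly, a start tag <tt>&lt;p&gt;</tt> closed by
<tt>&lt;/P&gt;</tt> is rejected, and an <tt>img</tt> tag that neither closes
nor ends with <tt>/&gt;</tt> is rejected as well.</p>
</font>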
<hr><h2><font color="#009999">29.2.1 The Structure of an XML Document
</font></h2>
<font size="+1">
<ul>
<li><i>Document</i> - XML data set</li>
<li>Should start with header (<b>not</b> an XML element):
<blockquote><tt>
&lt;?xml version="1.0"?&gt;
</tt></blockquote>
</li>
<li>Actual data contained in a <i>root element</i>:
<blockquote>
<pre>&lt;?xml version="1.0"?&gt;
&lt;staff&gt;
   <i>more data</i>
&lt;/staff&gt;</pre>
</blockquote>
</li>
</ul>
</font>
<hr><h2><font color="#009999">29.2.1 The Structure of an XML Document (cont.)
</font></h2>
<font size="+1">
<ul>
<li>An element may have content:
<blockquote><tt>
&lt;<i>elementTag optional attributes</i>&gt; <i>content</i>
&lt;/<i>elementTag</i>&gt;
</tt></blockquote>
<ul>
<li>Content may be:
<ul>
<li>Other elements (<i>element content</i>)</li>
<li>Text</li>
<li>Both (<i>mixed content</i>)</li>
</ul>
</li>
</ul>
</li>
<li>Avoid mixed content</li>
</ul>
</font>
<hr><h2><font color="#009999">29.2.1 The Structure of an XML Document (cont.)
</font></h2>
<font size="+1">
<ul>
<li>An element may have no content:
<blockquote><tt>
&lt;img src="hamster.jpeg"/&gt;
</tt></blockquote>
</li>
<li>May have attributes
<ul>
<li>An attribute has:
<ul>
<li>name (e.g., <tt>src</tt>)</li>
<li>value, in single or double quotes (<tt>"hamster.jpeg"</tt>)</li>
</ul>
</li>
</ul>
</li>
</ul>
</font>
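<font size="+1">
<p>To make the name/value structure of an attribute concrete, here is a small
standard-library sketch. The helper <tt>attribute_value</tt> is hypothetical;
it accepts either quote style, assumes there is no white space around the
<tt>=</tt> sign, and works on the text of a single start tag only.</p>

```cpp
#include <string>

// Hypothetical helper: finds name="value" or name='value' inside the
// text of a single start tag and returns the value ("" if absent).
// Assumes no white space around '='.
std::string attribute_value(const std::string& tag, const std::string& name)
{
   std::string::size_type pos = tag.find(" " + name + "=");
   if (pos == std::string::npos) return "";
   pos += name.length() + 2;   // skip the space, the name, and '='
   if (pos >= tag.length()) return "";

   char quote = tag[pos];
   if (quote != '"' && quote != '\'') return "";   // unquoted: not valid XML

   std::string::size_type end = tag.find(quote, pos + 1);
   if (end == std::string::npos) return "";
   return tag.substr(pos + 1, end - pos - 1);
}
```

<p>The value ends at the next occurrence of the same quote character that
opened it, so a double-quoted value may contain single quotes and vice
versa.</p>
</font>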
<hr><h2><font color="#009999">29.2 Parsing XML Documents</font></h2>
<font size="+1">
<ul>
<li><i>Parser</i> - reads and analyzes an XML document</li>
<li>Available in C++, Java, and many other languages</li>
<li>Two common flavors:
<dl>
<dt>DOM (Document Object Model)</dt>
<dd>Builds parse tree out of entire document</dd>
<dd>Easier to use</dd>
<dd>Complete overview of the data</dd>
<dt>SAX (Simple API for XML)</dt>
<dd><i>Event-driven</i>; calls user-provided functions on certain
events</dd>
<dd>More efficient for large documents. Gives client information
in bits and pieces</dd>
</dl>
</li>
</ul>
</font>
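<font size="+1">
<p>The event-driven style can be sketched without any XML library. The names
<tt>Handlers</tt> and <tt>scan</tt> below are hypothetical stand-ins, not the
real SAX interfaces, and the scanner assumes simple input with no attributes,
comments, or header. All three handlers must be set before calling
<tt>scan</tt>.</p>

```cpp
#include <functional>
#include <string>

// Hypothetical event handlers, standing in for the user-provided
// functions that an event-driven parser would call.
struct Handlers
{
   std::function<void(const std::string&)> start_element;
   std::function<void(const std::string&)> end_element;
   std::function<void(const std::string&)> characters;
};

// Scans simple XML and reports each piece to the handlers as soon as
// it is seen. No tree is built and nothing is kept in memory.
void scan(const std::string& xml, const Handlers& h)
{
   std::string::size_type pos = 0;
   while (pos < xml.length())
   {
      if (xml[pos] == '<')
      {
         std::string::size_type end = xml.find('>', pos);
         if (end == std::string::npos) return;   // malformed input: stop
         std::string tag = xml.substr(pos + 1, end - pos - 1);
         if (!tag.empty() && tag[0] == '/') h.end_element(tag.substr(1));
         else h.start_element(tag);
         pos = end + 1;
      }
      else
      {
         // Text runs up to the next tag or the end of the input
         std::string::size_type end = xml.find('<', pos);
         if (end == std::string::npos) end = xml.length();
         h.characters(xml.substr(pos, end - pos));
         pos = end;
      }
   }
}
```

<p>The client sees the document in bits and pieces, in document order. A DOM
parser would instead collect the same pieces into a tree and hand back the
whole structure at the end.</p>
</font>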
<hr><h2><font color="#009999">29.2 Parsing XML Documents (cont.)</font></h2>
<font size="+1">
<ul>
<li>This chapter describes a DOM parser</li>
<li>Xerces library (from Apache) - parser API
<ul>
<li>Classes to process XML input</li>
<li>Classes to represent tree structure of document</li>
</ul>
</li>
<li>Download and instructions at <a target='bigc' href=
'http://xml.apache.org/xerces-c/'>http://xml.apache.org/xerces-c/</a></li>
</ul>
</font>
<hr><h2><font color="#009999">29.2 Parsing XML Documents (cont.)</font></h2>
<font size="+1">
<ul>
<li>To read a particular type of XML document:
<ul>
<li>Write C++ program, translate XML input into C++ objects
<ul>
<li>Use Xerces library to parse input</li>
<li>Examine resulting tree, translate into objects</li>
</ul>
</li>
</ul>
</li>
<li>Separate program for each type of document</li>
<li>Parser performs tasks common to all such programs</li>
</ul>
</font>
<hr><h2><font color="#009999">29.2 Parsing XML Documents (cont.)</font></h2>
<font size="+1">
<ul>
<li>Xerces DOM class names begin with the <tt>DOM</tt> prefix</li>
<li>Objects are accessed through pointers
<ul>
<li>Do <b>not</b> <tt>delete</tt></li>
<li>Call <tt>release</tt> method on the builder object when done</li>
</ul>
</li>
</ul>
</font>
<hr><h2><font color="#009999">29.2 Parsing XML Documents (cont.)</font></h2>
<font size="+1">
<ul>
<li>Use <tt>DOMBuilder</tt> object to read XML document from file:
<blockquote><tt>
DOMImplementation* implementation<br>
= DOMImplementation::getImplementation();<br>
DOMBuilder* parser = implementation->createDOMBuilder(<br>
DOMImplementation::MODE_SYNCHRONOUS, NULL);
</tt></blockquote>
</li>
<li>1<sup>st</sup> call retrieves the default object
<ul>
<li>A factory for classes for reading and writing documents</li>
</ul>
</li>
<li>2<sup>nd</sup> call gets builder object from implementation object</li>
</ul>
</font>
<hr><h2><font color="#009999">29.2 Parsing XML Documents (cont.)</font></h2>
<font size="+1">
<ul>
<li>Use builder to parse document from file:
<blockquote><tt>
DOMDocument* doc = parser->parseURI("items.xml");
</tt></blockquote>
</li>
<li><tt>DOMDocument</tt> describes the tree structure of the document</li>
<li>To inspect the root element:
<blockquote><tt>
DOMNode* root = doc->getDocumentElement();
</tt></blockquote>
</li>
<li><tt>getDocumentElement</tt> returns a <tt>DOMElement*</tt></li>
<li><tt>DOMElement</tt>, <tt>DOMText</tt>, and other node classes are derived
from <tt>DOMNode</tt></li>
</ul>
</font>
<hr><h2><font color="#009999">29.2 Parsing XML Documents (cont.)</font></h2>
<font size="+1">
<script><!--
image( "fig05.png" )
//--></script>
</font>
<hr><h2><font color="#009999">29.2 Parsing XML Documents (cont.)</font></h2>
<font size="+1">
<ul>
<li>Use <tt>getFirstChild</tt> or <tt>getLastChild</tt> to retrieve a single
child
<ul>
<li>Each returns <tt>NULL</tt> if the node has no children</li>
</ul>
</li>
<li>For more children, use <tt>DOMNode::getChildNodes</tt>
<ul>
<li>Returns <tt>DOMNodeList</tt></li>
</ul>
</li>
<li>E.g., item list:</li>
</ul>
</font>
<hr><h2><font color="#009999">29.2 Parsing XML Documents (cont.)</font></h2>
<font size="+1">
<blockquote>
<pre>&lt;?xml version="1.0"?&gt;
&lt;items&gt;
   &lt;item&gt;
      &lt;product&gt;
         &lt;description&gt;Ink Jet Refill Kit&lt;/description&gt;
         &lt;price&gt;29.95&lt;/price&gt;
      &lt;/product&gt;
      &lt;quantity&gt;8&lt;/quantity&gt;
   &lt;/item&gt;
   &lt;item&gt;
      &lt;product&gt;
         &lt;description&gt;4-port Mini Hub&lt;/description&gt;
         &lt;price&gt;19.95&lt;/price&gt;
      &lt;/product&gt;
      &lt;quantity&gt;4&lt;/quantity&gt;
   &lt;/item&gt;
&lt;/items&gt;</pre>
</blockquote>
</font>
<hr><h2><font color="#009999">29.2 Parsing XML Documents (cont.)</font></h2>
<font size="+1">
<ul>
<li>Document is a tree</li>
<li><tt>items</tt> element is root</li>
<script><!--
image( "fig07.png" )
//--></script>
</ul>
</font>
<hr><h2><font color="#009999">29.2 Parsing XML Documents (cont.)</font></h2>
<font size="+1">
<ul>
<li><tt>getChildNodes</tt> on root produces list of two <tt>item</tt>
elements</li>
<li>To traverse <tt>DOMNodeList</tt>:
<ul>
<li><tt>getLength</tt> - number of nodes in list</li>
<li><tt>item</tt> with an index to get an item</li>
</ul>
</li>
<li>So,
<blockquote><tt>
DOMNodeList* nodes = root->getChildNodes();<br>
int i = . . .; <font color="#0000cc"><br>
// A number between 0 and nodes->getLength() - 1
</font><br>
DOMNode* child_node = nodes->item(i);
</tt></blockquote>
</li>
</ul>
</font>
<hr><h2><font color="#009999">29.2 Parsing XML Documents (cont.)</font></h2>
<font size="+1">
<ul>
<li>Parser keeps all white space, by default</li>
<li>White space is stored in text nodes</li>
<li>If you didn't mix content, it is easy to ignore:
<ul>
<li>If current element contains elements, skip all non-element child
nodes:
<blockquote>
<pre>for (int i = 0; i &lt; nodes->getLength(); i++)
{
   DOMNode* child_node = nodes->item(i);
   DOMElement* child_element
      = dynamic_cast&lt;DOMElement*&gt;(child_node);
   if (child_element != NULL)
   {
      <font color="#0000cc">// Do something with child_element</font>
      . . .
   }
}</pre>
</blockquote>
</li>
</ul>
</li>
</ul>
</font>
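<font size="+1">
<p>The <tt>dynamic_cast</tt> filter can be tried without Xerces. In this
sketch the classes <tt>Node</tt>, <tt>Element</tt>, and <tt>Text</tt> are
hypothetical stand-ins for <tt>DOMNode</tt>, <tt>DOMElement</tt>, and
<tt>DOMText</tt>; they mimic only the inheritance relationship, not the
library's API.</p>

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-ins for DOMNode, DOMElement, and DOMText.
class Node
{
public:
   virtual ~Node() {}   // a virtual function makes dynamic_cast possible
};

class Element : public Node
{
public:
   explicit Element(const std::string& n) : name(n) {}
   std::string name;
};

class Text : public Node
{
public:
   explicit Text(const std::string& t) : data(t) {}
   std::string data;
};

// Collects the names of all element children, skipping text nodes
// such as the white space between tags.
std::vector<std::string> element_names(
   const std::vector<std::unique_ptr<Node>>& nodes)
{
   std::vector<std::string> names;
   for (std::size_t i = 0; i < nodes.size(); i++)
   {
      // dynamic_cast yields a null pointer when the node is not an Element
      const Element* child_element
         = dynamic_cast<const Element*>(nodes[i].get());
      if (child_element != nullptr)
         names.push_back(child_element->name);
   }
   return names;
}
```

<p>For a child list that alternates white-space text and <tt>item</tt>
elements, only the elements survive the filter, which is why unmixed content
makes white space easy to ignore.</p>
</font>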
<hr><h2><font color="#009999">29.2 Parsing XML Documents (cont.)</font></h2>
<font size="+1">
<ul>
<li>Given an element, get the name (e.g., <tt>price</tt>):
<blockquote><tt>
DOMElement* price_element = . . .;<br>
const XMLCh* name = price_element->getTagName();<br>
<font color="#0000cc">
// Returns a tag name, such as price</font>
</tt></blockquote>
<ul>
<li>Returns Unicode text as an <tt>XMLCh</tt> array, not a <tt>wstring</tt></li>
</ul>
</li>
</ul>
</font>
<hr><h2><font color="#009999">29.2 Parsing XML Documents (cont.)</font></h2>
<font size="+1">
<ul>
<li>If ASCII suffices, use following function to convert:
<blockquote>
<pre>string XMLCh_to_string(const XMLCh* in)
{
char* s = XMLString::transcode(in);
string r(s);
XMLString::release(&s);
return r;
}</pre>
</blockquote>
<ul>
<li><tt>XMLString::transcode</tt> converts <tt>XMLCh</tt> array into
<tt>char</tt> array</li>
<li><tt>XMLString::release</tt> recycles memory</li>
</ul>
</li>
<li>Localize library idiosyncrasies with helper functions</li>
</ul>
</font>
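<font size="+1">
<p>To show what "ASCII suffices" means in practice, here is a standard-library
sketch of a narrowing conversion. The function <tt>narrow_ascii</tt> is
hypothetical and is not the Xerces routine; it assumes the input is a
null-terminated array of 16-bit code units, using <tt>char16_t</tt> in place
of <tt>XMLCh</tt>.</p>

```cpp
#include <string>

// Hypothetical stand-in for the conversion step: copies a
// null-terminated array of 16-bit code units into a std::string,
// replacing every code unit outside the ASCII range with '?'.
std::string narrow_ascii(const char16_t* in)
{
   std::string r;
   if (in == nullptr) return r;
   for (; *in != 0; ++in)
   {
      if (*in < 0x80) r += static_cast<char>(*in);
      else r += '?';
   }
   return r;
}
```

<p>Keeping such a conversion behind one helper function means the rest of the
program works with ordinary <tt>string</tt> objects and never touches the
library's character type.</p>
</font>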
<hr><h2><font color="#009999">29.2 Parsing XML Documents (cont.)</font></h2>
<font size="+1">
<ul>
<li>To retrieve attributes by name:
<ul>