<html><head><title>DTD Handlers (Perl and XML)</title><link rel="stylesheet" type="text/css" href="../style/style1.css" /><meta name="DC.Creator" content="Erik T. Ray and Jason McIntosh" /><meta name="DC.Format" content="text/xml" scheme="MIME" /><meta name="DC.Language" content="en-US" /><meta name="DC.Publisher" content="O'Reilly & Associates, Inc." /><meta name="DC.Source" scheme="ISBN" content="059600205XL" /><meta name="DC.Subject.Keyword" content="stuff" /><meta name="DC.Title" content="Perl and XML" /><meta name="DC.Type" content="Text.Monograph" /></head><body bgcolor="#ffffff"><img alt="Book Home" border="0" src="gifs/smbanner.gif" usemap="#banner-map" /><map name="banner-map"><area shape="rect" coords="1,-2,616,66" href="index.htm" alt="Perl & XML" /><area shape="rect" coords="629,-11,726,25" href="jobjects/fsearch.htm" alt="Search this book" /></map><div class="navbar"><table width="684" border="0"><tr><td align="left" valign="top" width="228"><a href="ch05_01.htm"><img alt="Previous" border="0" src="../gifs/txtpreva.gif" /></a></td><td align="center" valign="top" width="228" /><td align="right" valign="top" width="228"><a href="ch05_03.htm"><img alt="Next" border="0" src="../gifs/txtnexta.gif" /></a></td></tr></table></div><h2 class="sect1">5.2. DTD Handlers</h2><p><tt class="literal">XML::Parser::PerlSAX</tt><a name="INDEX-380" /> <a name="INDEX-381" /><a name="INDEX-382" />supports another group of handlers used to process DTD events<a name="INDEX-383" /><a name="INDEX-384" />. These handlers take care of anything that appears before the root element, such as the XML declaration, doctype declaration, and the internal subset of entity and element declarations, which are collectively called the <em class="emphasis">document prolog</em><a name="INDEX-385" />. If you want to output the document literally as you read it (e.g., in a filter program), you need to define some of these handlers to reproduce the document prolog. 
Defining these handlers is just what we needed in the previous example.</p><p>You can use these handlers for other purposes. For example, you may need to pre-load entity definitions for special processing rather than rely on the parser to do its default substitution for you. These handlers are listed in <a href="ch05_02.htm#perlxml-CHP-5-TABLE-2">Table 5-2</a>.</p><a name="perlxml-CHP-5-TABLE-2" /><h4 class="objtitle">Table 5-2. PerlSAX DTD handlers </h4><table border="1"><tr><th><p>Method name</p></th><th><p>Event</p></th><th><p>Properties</p></th></tr><tr><td><p><tt class="literal">entity_decl</tt><a name="INDEX-386" /></p></td><td><p>The parser sees an entity declaration (internal or external, parsed or unparsed).</p></td><td><p><tt class="literal">Name, Value, PublicId, SystemId, Notation</tt></p></td></tr><tr><td><p><tt class="literal">notation_decl</tt><a name="INDEX-387" /></p></td><td><p>The parser found a notation declaration. </p></td><td><p><tt class="literal">Name, PublicId, SystemId, Base</tt></p></td></tr><tr><td><p><tt class="literal">unparsed_entity_decl</tt><a name="INDEX-388" /></p></td><td><p>The parser found a declaration for an unparsed entity (e.g., a binary data entity).</p></td><td><p><tt class="literal">Name, PublicId, SystemId, Base</tt></p></td></tr><tr><td><p><tt class="literal">element_decl</tt><a name="INDEX-389" /></p></td><td><p>An element declaration was found. </p></td><td><p><tt class="literal">Name, Model</tt></p></td></tr><tr><td><p><tt class="literal">attlist_decl</tt><a name="INDEX-390" /></p></td><td><p>An element's attribute list declaration was encountered.</p></td><td><p><tt class="literal">ElementName, AttributeName, Type, Fixed</tt></p></td></tr><tr><td><p><tt class="literal">doctype_decl</tt><a name="INDEX-391" /></p></td><td><p>The parser found the document type declaration. 
</p></td><td><p><tt class="literal">Name, SystemId, PublicId, Internal</tt></p></td></tr><tr><td><p><tt class="literal">xml_decl</tt><a name="INDEX-392" /></p></td><td><p>The XML declaration was encountered. </p></td><td><p><tt class="literal">Version, Encoding, Standalone</tt></p></td></tr></table><p><p>The <tt class="literal">entity_decl( )</tt> handler is called for all kinds of entity declarations unless a more specific handler is defined. Thus, unparsed entity declarations trigger the <tt class="literal">entity_decl( )</tt> handler unless you've defined an <tt class="literal">unparsed_entity_decl( )</tt> handler, which will take precedence.</p><p><tt class="literal">entity_decl( )</tt>'s parameters vary depending on the entity type. The <tt class="literal">Value</tt> parameter is set for internal entities, but not external ones. Likewise, <tt class="literal">PublicId</tt> and <tt class="literal">SystemId</tt>, parameters that tell an XML processor where to find the file containing the entity's value, are not set for internal entities, only external ones. <tt class="literal">Base</tt> tells the processor what to use for a base URL if the <tt class="literal">SystemId</tt> contains a relative location.</p><p>Notation declarations are a special feature of DTDs that allow you to assign a special type identifier to an entity. For example, you could declare an entity to be of type "date" to tell the XML processor that the entity should be treated as that kind of data. Notations are not used very often in XML, so we won't go into them further.</p><p>The <tt class="literal">Model</tt> property of the <tt class="literal">element_decl( )</tt> handler contains the content model, or grammar, for an element. This property describes what is allowed to go inside an element according to the DTD.</p><p>An attribute list declaration in a DTD can contain more than one attribute description. 
Fortunately, the parser breaks these descriptions up into individual calls to the <tt class="literal">attlist_decl( )</tt> handler, one for each attribute.</p><p>The document type declaration is an optional part of the document that sits at the top, just under the XML declaration. The parameter <tt class="literal">Name</tt> is the name of the root element in your document. <tt class="literal">PublicId</tt> and <tt class="literal">SystemId</tt> tell the processor where to find the external DTD. Finally, the <tt class="literal">Internal</tt> parameter contains the whole internal subset as a string, in case you want to skip the individual entity and element declaration handling.</p><p>As an example, let's say you wanted to extend the filter example code to output the document prolog exactly as it was encountered by the parser. You'd need to define handlers like those in <a href="ch05_02.htm#perlxml-CHP-5-EX-4">Example 5-4</a>.</p><a name="perlxml-CHP-5-EX-4" /><div class="example"><h4 class="objtitle">Example 5-4. A better filter </h4><blockquote><pre class="code">#
# handle xml declaration
#
sub xml_decl {
    my( $self, $properties ) = @_;
    output( "&lt;?xml version=\"" . $properties-&gt;{'Version'} . "\"" );
    # encoding and standalone are optional pseudo-attributes
    my $encoding = $properties-&gt;{'Encoding'};
    output( " encoding=\"$encoding\"" ) if( $encoding );
    my $standalone = $properties-&gt;{'Standalone'};
    output( " standalone=\"$standalone\"" ) if( $standalone );
    output( "?&gt;\n" );
}

#
# handle doctype declaration:
# try to duplicate the original
#
sub doctype_decl {
    my( $self, $properties ) = @_;
    output( "\n&lt;!DOCTYPE " . $properties-&gt;{'Name'} . "\n" );
    my $pubid = $properties-&gt;{'PublicId'};
    my $sysid = $properties-&gt;{'SystemId'};
    if( $pubid ) {
        output( " PUBLIC \"$pubid\"\n" );
        output( " \"$sysid\"\n" );
    } elsif( $sysid ) {
        # skip the SYSTEM line when there is no external DTD at all
        output( " SYSTEM \"$sysid\"\n" );
    }
    my $intset = $properties-&gt;{'Internal'};
    if( $intset ) {
        # $in_intset is a flag shared with the rest of the filter
        # (assumed to be declared elsewhere); it records that the
        # internal subset is open and still needs its closing "]&gt;"
        $in_intset = 1;
        output( "[\n" );
    } else {
        output( "&gt;\n" );
    }
}

#
# handle entity declaration in internal subset:
# recreate the original declaration as it was
#
sub entity_decl {
    my( $self, $properties ) = @_;
    my $name = $properties-&gt;{'Name'};
    output( "&lt;!ENTITY $name " );
    my $pubid = $properties-&gt;{'PublicId'};
    my $sysid = $properties-&gt;{'SystemId'};
    if( $pubid ) {
        # external entity with a public identifier
        output( "PUBLIC \"$pubid\" \"$sysid\"" );
    } elsif( $sysid ) {
        # external entity with only a system identifier
        output( "SYSTEM \"$sysid\"" );
    } else {
        # internal entity: the replacement text is in Value
        output( "\"" . $properties-&gt;{'Value'} . "\"" );
    }
    output( "&gt;\n" );
}</pre></blockquote></div><p>Now let's see how the output from our filter looks. The result is in <a href="ch05_02.htm#perlxml-CHP-5-EX-5">Example 5-5</a>.</p><a name="perlxml-CHP-5-EX-5" /><div class="example"><h4 class="objtitle">Example 5-5. Output from the filter </h4><blockquote><pre class="code">&lt;?xml version="1.0"?&gt;

&lt;!DOCTYPE book
 SYSTEM "/usr/local/prod/sgml/db.dtd"
[
&lt;!ENTITY thingy "hoo hah blah blah"&gt;
]&gt;
&lt;book id="mybook"&gt;
 &lt;title&gt;GRXL in a Nutshell&lt;/title&gt;
 &lt;chapter id="intro"&gt;
 &lt;title&gt;What is GRXL?&lt;/title&gt;
&lt;comment&gt; need a better title &lt;/comment&gt;
 &lt;para&gt;
Yet another acronym. That was our attitude at first, but then we saw
the amazing uses of this new technology called
&lt;literal&gt;GRXL&lt;/literal&gt;. Consider the following program:
 &lt;/para&gt;
 &lt;programlisting&gt;AH aof -- %%%%
{{{{{{ let x = 0 }}}}}}
 print! &lt;lineannotation&gt;wow&lt;/lineannotation&gt;
or not!&lt;/programlisting&gt;
&lt;comment&gt; what font should we use? &lt;/comment&gt;
 &lt;para&gt;
What does it do? Who cares? It's just lovely to look at. In fact,
I'd have to say, "&amp;thingy;".
 &lt;/para&gt;
 &lt;/chapter&gt;
&lt;/book&gt;</pre></blockquote></div><p>That's much better. Now we have a complete filter program. The basic handlers take care of elements and everything inside them. 
The DTD handlers deal with whatever happens <a name="INDEX-393" />outside<a name="INDEX-394" /> <a name="INDEX-395" /> of the<a name="INDEX-396" /> <a name="INDEX-397" /> root element.</p><hr width="684" align="left" /><div class="navbar"><table width="684" border="0"><tr><td align="left" valign="top" width="228"><a href="ch05_01.htm"><img alt="Previous" border="0" src="../gifs/txtpreva.gif" /></a></td><td align="center" valign="top" width="228"><a href="index.htm"><img alt="Home" border="0" src="../gifs/txthome.gif" /></a></td><td align="right" valign="top" width="228"><a href="ch05_03.htm"><img alt="Next" border="0" src="../gifs/txtnexta.gif" /></a></td></tr><tr><td align="left" valign="top" width="228">5. SAX</td><td align="center" valign="top" width="228"><a href="index/index.htm"><img alt="Book Index" border="0" src="../gifs/index.gif" /></a></td><td align="right" valign="top" width="228">5.3. External Entity Resolution</td></tr></table></div><hr width="684" align="left" /><img alt="Library Navigation Links" border="0" src="../gifs/navbar.gif" usemap="#library-map" /><p><p><font size="-1"><a href="copyrght.htm">Copyright © 2002</a> O'Reilly & Associates. All rights reserved.</font></p><map name="library-map"><area shape="rect" coords="1,0,85,94" href="../index.htm"><area shape="rect" coords="86,1,178,103" href="../lwp/index.htm"><area shape="rect" coords="180,0,265,103" href="../lperl/index.htm"><area shape="rect" coords="267,0,353,105" href="../perlnut/index.htm"><area shape="rect" coords="354,1,446,115" href="../prog/index.htm"><area shape="rect" coords="448,0,526,132" href="../tk/index.htm"><area shape="rect" coords="528,1,615,119" href="../cookbook/index.htm"><area shape="rect" coords="617,0,690,135" href="../pxml/index.htm"></map></body></html>