dtdscanner.cpp
From "Source code of Xerces, IBM's XML parsing tool" · C++ code · 1,936 lines total · page 1 of 5
/*
 * Copyright 1999-2001,2004 The Apache Software Foundation.
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *      http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

/*
 * $Log: DTDScanner.cpp,v $
 * Revision 1.35  2004/09/30 13:14:27  amassari
 * Fix jira#1280 - Borland leaks memory if break or continue are used inside a catch block
 *
 * Revision 1.34  2004/09/08 13:56:50  peiyongz
 * Apache License Version 2.0
 *
 * Revision 1.33  2004/07/06 15:57:55  amassari
 * Fix for jira#1226: when a 32 bit entity is encountered, reset the secondCh variable after using it
 *
 * Revision 1.32  2004/01/29 11:52:30  cargilld
 * Code cleanup changes to get rid of various compiler diagnostic messages.
 *
 * Revision 1.31  2003/12/31 15:40:00  cargilld
 * Release memory when an error is encountered.
 *
 * Revision 1.30  2003/12/17 00:18:40  cargilld
 * Update to memory management so that the static memory manager (one used to call Initialize) is only for static data.
 *
 * Revision 1.29  2003/10/01 16:32:41  neilg
 * improve handling of out of memory conditions, bug #23415. Thanks to David Cargill.
 *
 * Revision 1.28  2003/07/10 19:50:12  peiyongz
 * Stateless Grammar: create grammar components with grammarPool's memory Manager
 *
 * Revision 1.27  2003/05/18 14:02:06  knoaman
 * Memory manager implementation: pass per instance manager.
 *
 * Revision 1.26  2003/05/16 21:43:19  knoaman
 * Memory manager implementation: Modify constructors to pass in the memory manager.
 *
 * Revision 1.25  2003/05/15 18:54:50  knoaman
 * Partial implementation of the configurable memory manager.
 *
 * Revision 1.24  2003/03/10 15:28:07  tng
 * XML1.0 Errata E38
 *
 * Revision 1.23  2003/02/05 22:07:09  tng
 * [Bug 3111] Problem with LexicalHandler::startDTD() and LexicalHandler::endDTD().
 *
 * Revision 1.22  2003/01/20 22:01:38  tng
 * Need to check text decl when expanding PE
 *
 * Revision 1.21  2003/01/16 21:30:14  tng
 * [Bug 16151] Memory leak in DTDScanner with ill-formed DTD declaration.  Fix by David Bertoni.
 *
 * Revision 1.20  2002/12/24 16:12:19  tng
 * For performance reason, move the character check to scancharref.
 *
 * Revision 1.19  2002/12/20 22:10:47  tng
 * XML 1.1
 *
 * Revision 1.18  2002/12/18 14:17:55  gareth
 * Fix to bug #13438. When you want a vector that calls delete[] on its members you should use RefArrayVectorOf.
 *
 * Revision 1.17  2002/12/04 02:47:25  knoaman
 * scanner re-organization.
 *
 * Revision 1.16  2002/11/14 22:34:11  tng
 * [Bug 14265] Access violation with Null systemId/publicId in DTDScanner
 *
 * Revision 1.15  2002/11/05 21:40:36  tng
 * Oasis test fix:
 * 1. Should check if content model allow character for CDataSection case
 * 2. Should check partial markup in entity for INCLUDE and IGNORE scenario
 * 3. If standalone is yes, reference to entity where its declaration is external is a well-formedness fatal error (XML 1.0 Section 4.1)
 *    If standalone is yes, reference to parameter entity where its declaration is external is a validity constraint (XML 1.0 Section 2.9)
 * 4. XML 1.0 Section 2.8 Partial markup in parameter entity reference.
 *    If it is a complete declaration, partial markup is a fatal error.
 *
 * Revision 1.14  2002/11/04 14:50:40  tng
 * C++ Namespace Support.
 *
 * Revision 1.13  2002/09/24 20:10:30  tng
 * Performance: use XMLString::equals instead of XMLString::compareString
 *
 * Revision 1.12  2002/08/22 21:05:29  tng
 * [Bug 7475] Xerces-C++ reports validation error with Docbook.
 *
 * Revision 1.11  2002/08/22 20:26:01  tng
 * [Bug 7512] Wrong error message created.
 *
 * Revision 1.10  2002/08/22 19:29:13  tng
 * [Bug 11448] DomCount has problems with XHTML1.1 DTD.
 *
 * Revision 1.9  2002/08/19 14:40:31  tng
 * Fix: public id / system id in entity decl should be null if empty
 *
 * Revision 1.8  2002/07/26 13:33:44  knoaman
 * Public/System id for notations should be stored as NULL if missing.
 *
 * Revision 1.7  2002/07/11 18:39:48  knoaman
 * Access entities through the DTDGrammar instead of the scanner.
 *
 * Revision 1.6  2002/06/06 20:36:33  tng
 * Fix: Valid encoding name is not checked in scanning Text Decl
 *
 * Revision 1.5  2002/05/30 16:17:19  tng
 * Add feature to optionally ignore external DTD.
 *
 * Revision 1.4  2002/05/03 14:51:16  peiyongz
 * Bug#8769: UMR detected by memory tool - patch from Kenneth Palsson
 *
 * Revision 1.3  2002/02/28 22:34:36  peiyongz
 * Bug#2717: patch to Unterminated INCLUDE section causes infinite loop with setExitOnFirstFatalError(false)
 *
 * Revision 1.2  2002/02/26 21:06:53  knoaman
 * Create ZeroOrOne node only if needed.
 *
 * Revision 1.1.1.1  2002/02/01 22:22:44  peiyongz
 * sane_include
 *
 * Revision 1.25  2002/01/24 16:30:50  tng
 * [Bug 3111] Problem with LexicalHandler::startDTD() and LexicalHandler::endDTD().
 *
 * Revision 1.24  2001/12/17 15:39:14  knoaman
 * Fix for surrogate pair support.
 *
 * Revision 1.23  2001/12/14 20:21:37  knoaman
 * Add surrogate support to comments and processing instructions.
 *
 * Revision 1.22  2001/12/06 17:51:18  tng
 * Performance Enhancement.  The ContentSpecNode constructor always copied the QName
 * that was passed to it.  Added a second constructor that allows the QName to be just assigned, not copied.
 * That was because there are some cases in which a temporary QName was constructed, passed to ContentSpecNode, and then deleted.
 * There were examples of that in TraverseSchema and DTDScanner.
 * By Henry Zongaro.
 *
 * Revision 1.21  2001/11/13 13:27:28  tng
 * Move root element check to XMLScanner.
 *
 * Revision 1.20  2001/09/05 20:49:10  knoaman
 * Fix for complexTypes with mixed content model.
 *
 * Revision 1.19  2001/08/02 16:54:39  tng
 * Reset some Scanner flags in scanReset().
 *
 * Revision 1.18  2001/07/13 16:57:11  tng
 * ScanId fix.
 *
 * Revision 1.17  2001/07/12 20:10:18  tng
 * Partial Markup in Parameter Entity is validity constraint and thus should be just error, not fatal error.
 *
 * Revision 1.16  2001/07/10 21:09:39  tng
 * Give proper error message when scanning external id.
 *
 * Revision 1.15  2001/07/10 20:56:17  tng
 * Should check the first char of PI Target Name.
 *
 * Revision 1.14  2001/07/09 13:42:20  tng
 * Partial Markup in Parameter Entity is validity constraint and thus should be just error, not fatal error.
 *
 * Revision 1.13  2001/07/05 14:05:29  tng
 * Encoding String must present for external entity text decl.
 *
 * Revision 1.12  2001/07/05 13:12:19  tng
 * Standalone checking is validity constraint and thus should be just error, not fatal error:
 *
 * Revision 1.11  2001/06/25 14:39:54  knoaman
 * Fix bug #965 - submitted by Matt Lovett
 *
 * Revision 1.10  2001/06/22 12:42:33  tng
 * [Bug 2257] 1.5 thinks a <?xml-stylesheet ...> tag is a <?xml ...> tag
 *
 * Revision 1.9  2001/06/21 14:25:53  knoaman
 * Fix for bug 1946
 *
 * Revision 1.8  2001/06/04 13:25:50  tng
 * the start tag "<?xml" could be followed by (#x20 | #x9 | #xD | #xA)+.  Fixed by Pei Yong Zhang.
 *
 * Revision 1.7  2001/05/28 20:54:06  tng
 * Schema: allocate a fDTDValidator, fSchemaValidator explicitly to avoid wrong cast
 *
 * Revision 1.6  2001/05/11 13:27:09  tng
 * Copyright update.
 *
 * Revision 1.5  2001/05/03 20:34:36  tng
 * Schema: SchemaValidator update
 *
 * Revision 1.4  2001/04/23 18:54:35  tng
 * Reuse grammar should allow users to use any stored element decl as root.  Fixed by Erik Rydgren.
 *
 * Revision 1.3  2001/04/19 18:17:21  tng
 * Schema: SchemaValidator update, and use QName in Content Model
 *
 * Revision 1.2  2001/03/30 16:35:17  tng
 * Schema: Whitespace normalization.
 *
 * Revision 1.1  2001/03/21 21:56:20  tng
 * Schema: Add Schema Grammar, Schema Validator, and split the DTDValidator into DTDValidator, DTDScanner, and DTDGrammar.
 *
 */

// ---------------------------------------------------------------------------
//  Includes
// ---------------------------------------------------------------------------
#include <xercesc/util/BinMemInputStream.hpp>
#include <xercesc/util/FlagJanitor.hpp>
#include <xercesc/util/Janitor.hpp>
#include <xercesc/util/XMLUniDefs.hpp>
#include <xercesc/util/UnexpectedEOFException.hpp>
#include <xercesc/sax/InputSource.hpp>
#include <xercesc/framework/XMLDocumentHandler.hpp>
#include <xercesc/framework/XMLEntityHandler.hpp>
#include <xercesc/framework/XMLValidator.hpp>
#include <xercesc/internal/EndOfEntityException.hpp>
#include <xercesc/internal/XMLScanner.hpp>
#include <xercesc/validators/common/ContentSpecNode.hpp>
#include <xercesc/validators/common/MixedContentModel.hpp>
#include <xercesc/validators/DTD/DTDEntityDecl.hpp>
#include <xercesc/validators/DTD/DocTypeHandler.hpp>
#include <xercesc/validators/DTD/DTDScanner.hpp>
#include <xercesc/util/OutOfMemoryException.hpp>

XERCES_CPP_NAMESPACE_BEGIN

// ---------------------------------------------------------------------------
//  Local methods
// ---------------------------------------------------------------------------
//
//  This method automates the grunt work of looking at a char and seeing if
//  it's a repetition suffix. If so, it creates a new correct rep node and
//  wraps the passed node in it. Otherwise, it returns the previous node.
//
static ContentSpecNode* makeRepNode(const XMLCh testCh,
                                    ContentSpecNode* const prevNode,
                                    MemoryManager* const manager)
{
    if (testCh == chQuestion)
    {
        return new (manager) ContentSpecNode
        (
            ContentSpecNode::ZeroOrOne
            , prevNode
            , 0
            , true
            , true
            , manager
        );
    }
    else if (testCh == chPlus)
    {
        return new (manager) ContentSpecNode
        (
            ContentSpecNode::OneOrMore
            , prevNode
            , 0
            , true
            , true
            , manager
        );
    }
    else if (testCh == chAsterisk)
    {
        return new (manager) ContentSpecNode
        (
            ContentSpecNode::ZeroOrMore
            , prevNode
            , 0
            , true
            , true
            , manager
        );
    }

    // Just return the incoming node
    return prevNode;
}

// ---------------------------------------------------------------------------
//  DTDScanner: Constructors and Destructor
// ---------------------------------------------------------------------------
DTDScanner::DTDScanner( DTDGrammar*           dtdGrammar
                      , DocTypeHandler* const docTypeHandler
                      , MemoryManager* const  grammarPoolMemoryManager
                      , MemoryManager* const  manager) :
    fMemoryManager(manager)
    , fGrammarPoolMemoryManager(grammarPoolMemoryManager)
    , fDocTypeHandler(docTypeHandler)
    , fDumAttDef(0)
    , fDumElemDecl(0)
    , fDumEntityDecl(0)
    , fInternalSubset(false)
    , fNextAttrId(1)
    , fDTDGrammar(dtdGrammar)
    , fPEntityDeclPool(0)
    , fDocTypeReaderId(0)
    , fBufMgr(0)
    , fReaderMgr(0)
    , fScanner(0)
    , fEmptyNamespaceId(0)
{
    fPEntityDeclPool = new (fMemoryManager) NameIdPool<DTDEntityDecl>(109, 128, fMemoryManager);
}

DTDScanner::~DTDScanner()
{
    delete fDumAttDef;
    delete fDumElemDecl;
    delete fDumEntityDecl;
    delete fPEntityDeclPool;
}

// -----------------------------------------------------------------------
//  Setter methods
// -----------------------------------------------------------------------
void DTDScanner::setScannerInfo(XMLScanner* const   owningScanner
                              , ReaderMgr* const    readerMgr
                              , XMLBufferMgr* const bufMgr)
{
    // We don't own any of these, we just reference them
    fScanner = owningScanner;
    fReaderMgr = readerMgr;
    fBufMgr = bufMgr;

    if (fScanner->getDoNamespaces())
        fEmptyNamespaceId = fScanner->getEmptyNamespaceId();
    else
        fEmptyNamespaceId = 0;

    fDocTypeReaderId = fReaderMgr->getCurrentReaderNum();
}

// ---------------------------------------------------------------------------
//  DTDScanner: Private scanning methods
// ---------------------------------------------------------------------------
bool DTDScanner::checkForPERef( const bool inLiteral
                              , const bool inMarkup)
{
    bool gotSpace = false;

    //
    //  See if we have any spaces up front. If so, then skip them and set
    //  the gotSpace flag.
    //
    if (fReaderMgr->skippedSpace())
    {
        fReaderMgr->skipPastSpaces();
        gotSpace = true;
    }

    // If the next char is not a percent, there is no PERef to expand
    if (!fReaderMgr->skippedChar(chPercent))
        return gotSpace;

    while (true)
    {
        if (!expandPERef(false, inLiteral, inMarkup, false))
            fScanner->emitError(XMLErrs::ExpectedEntityRefName);

        // And skip any more spaces in the expanded value
        if (fReaderMgr->skippedSpace())
        {
            fReaderMgr->skipPastSpaces();
            gotSpace = true;
        }

        // Keep expanding as long as another PERef follows
        if (!fReaderMgr->skippedChar(chPercent))
            break;
    }
    return gotSpace;
}

bool DTDScanner::expandPERef( const bool scanExternal
                            , const bool inLiteral
                            , const bool inMarkup
                            , const bool throwEndOfExt)
{
    fScanner->setHasNoDTD(false);
    XMLBufBid bbName(fBufMgr);