dtdscanner.cpp

From "Source code of Xerces, IBM's XML parsing tool" · C++ code · 1,936 lines total · page 1 of 5

/*
 * Copyright 1999-2001,2004 The Apache Software Foundation.
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *      http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

/*
 * $Log: DTDScanner.cpp,v $
 * Revision 1.35  2004/09/30 13:14:27  amassari
 * Fix jira#1280 - Borland leaks memory if break or continue are used inside a catch block
 *
 * Revision 1.34  2004/09/08 13:56:50  peiyongz
 * Apache License Version 2.0
 *
 * Revision 1.33  2004/07/06 15:57:55  amassari
 * Fix for jira#1226: when a 32 bit entity is encountered, reset the secondCh variable after using it
 *
 * Revision 1.32  2004/01/29 11:52:30  cargilld
 * Code cleanup changes to get rid of various compiler diagnostic messages.
 *
 * Revision 1.31  2003/12/31 15:40:00  cargilld
 * Release memory when an error is encountered.
 *
 * Revision 1.30  2003/12/17 00:18:40  cargilld
 * Update to memory management so that the static memory manager (one used to call Initialize) is only for static data.
 *
 * Revision 1.29  2003/10/01 16:32:41  neilg
 * improve handling of out of memory conditions, bug #23415.  Thanks to David Cargill.
 *
 * Revision 1.28  2003/07/10 19:50:12  peiyongz
 * Stateless Grammar: create grammar components with grammarPool's memory Manager
 *
 * Revision 1.27  2003/05/18 14:02:06  knoaman
 * Memory manager implementation: pass per instance manager.
 *
 * Revision 1.26  2003/05/16 21:43:19  knoaman
 * Memory manager implementation: Modify constructors to pass in the memory manager.
 *
 * Revision 1.25  2003/05/15 18:54:50  knoaman
 * Partial implementation of the configurable memory manager.
 *
 * Revision 1.24  2003/03/10 15:28:07  tng
 * XML1.0 Errata E38
 *
 * Revision 1.23  2003/02/05 22:07:09  tng
 * [Bug 3111] Problem with LexicalHandler::startDTD() and LexicalHandler::endDTD().
 *
 * Revision 1.22  2003/01/20 22:01:38  tng
 * Need to check text decl when expanding PE
 *
 * Revision 1.21  2003/01/16 21:30:14  tng
 * [Bug 16151] Memory leak in DTDScanner with ill-formed DTD declaration.  Fix by David Bertoni.
 *
 * Revision 1.20  2002/12/24 16:12:19  tng
 * For performance reason, move the character check to scancharref.
 *
 * Revision 1.19  2002/12/20 22:10:47  tng
 * XML 1.1
 *
 * Revision 1.18  2002/12/18 14:17:55  gareth
 * Fix to bug #13438. When you eant a vector that calls delete[] on its members you should use RefArrayVectorOf.
 *
 * Revision 1.17  2002/12/04 02:47:25  knoaman
 * scanner re-organization.
 *
 * Revision 1.16  2002/11/14 22:34:11  tng
 * [Bug 14265] Access violation with Null systemId/publicId in DTDScanner
 *
 * Revision 1.15  2002/11/05 21:40:36  tng
 * Oasis test fix:
 * 1.  Should check if content model allow character for CDataSection case
 * 2. Should check partial markup in entity for INCLUDE and IGNORE scenario
 * 3. If standalone is yes, reference to entity where its declaration is external is a well-formness fatal error (XML 1.0 Section 4.1)
 * If standalone is yes, reference to parameter entity where is declaration is external is a validity constraint (XML 1.0 Section 2.9)
 * 4.  XML 1.0 Section 2.8 Partial markup in parameter entity reference.
 * If it is a complete declaration, partial markup is a fatal error.
 *
 * Revision 1.14  2002/11/04 14:50:40  tng
 * C++ Namespace Support.
 *
 * Revision 1.13  2002/09/24 20:10:30  tng
 * Performance: use XMLString::equals instead of XMLString::compareString
 *
 * Revision 1.12  2002/08/22 21:05:29  tng
 * [Bug 7475] Xerces-C++ reports validation error with Docbook.
 *
 * Revision 1.11  2002/08/22 20:26:01  tng
 * [Bug 7512] Wrong error message created .
 *
 * Revision 1.10  2002/08/22 19:29:13  tng
 * [Bug 11448] DomCount has problems with XHTML1.1 DTD.
 *
 * Revision 1.9  2002/08/19 14:40:31  tng
 * Fix: public id / system id in entity decl should be null if empty
 *
 * Revision 1.8  2002/07/26 13:33:44  knoaman
 * Public/System id for notations should be stored as NULL if missing.
 *
 * Revision 1.7  2002/07/11 18:39:48  knoaman
 * Access entities through the DTDGrammar instead of the scanner.
 *
 * Revision 1.6  2002/06/06 20:36:33  tng
 * Fix: Valid encoding name is not checked in scanning Text Decl
 *
 * Revision 1.5  2002/05/30 16:17:19  tng
 * Add feature to optionally ignore external DTD.
 *
 * Revision 1.4  2002/05/03 14:51:16  peiyongz
 * Bug#8769: UMR detected by memory tool - patch from Kenneth Palsson
 *
 * Revision 1.3  2002/02/28 22:34:36  peiyongz
 * Bug#2717: patch to Unterminated INCLUDE section causes infinite loop with setExitOnFirstFatalError(false)
 *
 * Revision 1.2  2002/02/26 21:06:53  knoaman
 * Create ZeroOrOne node only if needed.
 *
 * Revision 1.1.1.1  2002/02/01 22:22:44  peiyongz
 * sane_include
 *
 * Revision 1.25  2002/01/24 16:30:50  tng
 * [Bug 3111] Problem with LexicalHandler::startDTD() and LexicalHandler::endDTD() .
 *
 * Revision 1.24  2001/12/17 15:39:14  knoaman
 * Fix for surrogate pair support.
 *
 * Revision 1.23  2001/12/14 20:21:37  knoaman
 * Add surrogate support to comments and processing instrunctions.
 *
 * Revision 1.22  2001/12/06 17:51:18  tng
 * Performance Enhancement. The ContentSpecNode constructor always copied the QName
 * that was passed to it.  Added a second constructor that allows the QName to be just assigned, not copied.
 * That was because there are some cases in which a temporary QName was constructed, passed to ContentSpecNode, and then deleted.
 * There were examples of that in TraverseSchema and DTDScanner.
 * By Henry Zongaro.
 *
 * Revision 1.21  2001/11/13 13:27:28  tng
 * Move root element check to XMLScanner.
 *
 * Revision 1.20  2001/09/05 20:49:10  knoaman
 * Fix for complexTypes with mixed content model.
 *
 * Revision 1.19  2001/08/02 16:54:39  tng
 * Reset some Scanner flags in scanReset().
 *
 * Revision 1.18  2001/07/13 16:57:11  tng
 * ScanId fix.
 *
 * Revision 1.17  2001/07/12 20:10:18  tng
 * Partial Markup in Parameter Entity is validity constraint and thus should be just error, not fatal error.
 *
 * Revision 1.16  2001/07/10 21:09:39  tng
 * Give proper error messsage when scanning external id.
 *
 * Revision 1.15  2001/07/10 20:56:17  tng
 * Should check the first char of PI Target Name.
 *
 * Revision 1.14  2001/07/09 13:42:20  tng
 * Partial Markup in Parameter Entity is validity constraint and thus should be just error, not fatal error.
 *
 * Revision 1.13  2001/07/05 14:05:29  tng
 * Encoding String must present for external entity text decl.
 *
 * Revision 1.12  2001/07/05 13:12:19  tng
 * Standalone checking is validity constraint and thus should be just error, not fatal error:
 *
 * Revision 1.11  2001/06/25 14:39:54  knoaman
 * Fix bug #965 - submitted by Matt Lovett
 *
 * Revision 1.10  2001/06/22 12:42:33  tng
 * [Bug 2257] 1.5 thinks a <?xml-stylesheet ...> tag is a <?xml ...> tag
 *
 * Revision 1.9  2001/06/21 14:25:53  knoaman
 * Fix for bug 1946
 *
 * Revision 1.8  2001/06/04 13:25:50  tng
 * the start tag "<?xml" could be followed by (#x20 | #x9 | #xD | #xA)+.  Fixed by Pei Yong Zhang.
 *
 * Revision 1.7  2001/05/28 20:54:06  tng
 * Schema: allocate a fDTDValidator, fSchemaValidator explicitly to avoid wrong cast
 *
 * Revision 1.6  2001/05/11 13:27:09  tng
 * Copyright update.
 *
 * Revision 1.5  2001/05/03 20:34:36  tng
 * Schema: SchemaValidator update
 *
 * Revision 1.4  2001/04/23 18:54:35  tng
 * Reuse grammar should allow users to use any stored element decl as root.  Fixed by Erik Rydgren.
 *
 * Revision 1.3  2001/04/19 18:17:21  tng
 * Schema: SchemaValidator update, and use QName in Content Model
 *
 * Revision 1.2  2001/03/30 16:35:17  tng
 * Schema: Whitespace normalization.
 *
 * Revision 1.1  2001/03/21 21:56:20  tng
 * Schema: Add Schema Grammar, Schema Validator, and split the DTDValidator into DTDValidator, DTDScanner, and DTDGrammar.
 *
 */

// ---------------------------------------------------------------------------
//  Includes
// ---------------------------------------------------------------------------
#include <xercesc/util/BinMemInputStream.hpp>
#include <xercesc/util/FlagJanitor.hpp>
#include <xercesc/util/Janitor.hpp>
#include <xercesc/util/XMLUniDefs.hpp>
#include <xercesc/util/UnexpectedEOFException.hpp>
#include <xercesc/sax/InputSource.hpp>
#include <xercesc/framework/XMLDocumentHandler.hpp>
#include <xercesc/framework/XMLEntityHandler.hpp>
#include <xercesc/framework/XMLValidator.hpp>
#include <xercesc/internal/EndOfEntityException.hpp>
#include <xercesc/internal/XMLScanner.hpp>
#include <xercesc/validators/common/ContentSpecNode.hpp>
#include <xercesc/validators/common/MixedContentModel.hpp>
#include <xercesc/validators/DTD/DTDEntityDecl.hpp>
#include <xercesc/validators/DTD/DocTypeHandler.hpp>
#include <xercesc/validators/DTD/DTDScanner.hpp>
#include <xercesc/util/OutOfMemoryException.hpp>

XERCES_CPP_NAMESPACE_BEGIN

// ---------------------------------------------------------------------------
//  Local methods
// ---------------------------------------------------------------------------
//
//  This method automates the grunt work of looking at a char and see if its
//  a repetition suffix. If so, it creates a new correct rep node and wraps
//  the pass node in it. Otherwise, it returns the previous node.
//
static ContentSpecNode* makeRepNode(const XMLCh testCh,
                                    ContentSpecNode* const prevNode,
                                    MemoryManager* const manager)
{
    if (testCh == chQuestion)
    {
        return new (manager) ContentSpecNode
        (
            ContentSpecNode::ZeroOrOne
            , prevNode
            , 0
            , true
            , true
            , manager
        );
    }
    else if (testCh == chPlus)
    {
        return new (manager) ContentSpecNode
        (
            ContentSpecNode::OneOrMore
            , prevNode
            , 0
            , true
            , true
            , manager
        );
    }
    else if (testCh == chAsterisk)
    {
        return new (manager) ContentSpecNode
        (
            ContentSpecNode::ZeroOrMore
            , prevNode
            , 0
            , true
            , true
            , manager
        );
    }

    // Just return the incoming node
    return prevNode;
}

// ---------------------------------------------------------------------------
//  DTDValidator: Constructors and Destructor
// ---------------------------------------------------------------------------
DTDScanner::DTDScanner( DTDGrammar*           dtdGrammar
                      , DocTypeHandler* const docTypeHandler
                      , MemoryManager* const  grammarPoolMemoryManager
                      , MemoryManager* const  manager) :
    fMemoryManager(manager)
    , fGrammarPoolMemoryManager(grammarPoolMemoryManager)
    , fDocTypeHandler(docTypeHandler)
    , fDumAttDef(0)
    , fDumElemDecl(0)
    , fDumEntityDecl(0)
    , fInternalSubset(false)
    , fNextAttrId(1)
    , fDTDGrammar(dtdGrammar)
    , fPEntityDeclPool(0)
    , fDocTypeReaderId(0)
    , fBufMgr(0)
    , fReaderMgr(0)
    , fScanner(0)
    , fEmptyNamespaceId(0)
{
    fPEntityDeclPool = new (fMemoryManager) NameIdPool<DTDEntityDecl>(109, 128, fMemoryManager);
}

DTDScanner::~DTDScanner()
{
    delete fDumAttDef;
    delete fDumElemDecl;
    delete fDumEntityDecl;
    delete fPEntityDeclPool;
}

// -----------------------------------------------------------------------
//  Setter methods
// -----------------------------------------------------------------------
void DTDScanner::setScannerInfo(XMLScanner* const      owningScanner
                            , ReaderMgr* const      readerMgr
                            , XMLBufferMgr* const   bufMgr)
{
    // We don't own any of these, we just reference them
    fScanner = owningScanner;
    fReaderMgr = readerMgr;
    fBufMgr = bufMgr;

    if (fScanner->getDoNamespaces())
        fEmptyNamespaceId = fScanner->getEmptyNamespaceId();
    else
        fEmptyNamespaceId = 0;

    fDocTypeReaderId = fReaderMgr->getCurrentReaderNum();
}

// ---------------------------------------------------------------------------
//  DTDScanner: Private scanning methods
// ---------------------------------------------------------------------------
bool DTDScanner::checkForPERef(   const bool    inLiteral
                                , const bool    inMarkup)
{
    bool gotSpace = false;

    //
    //  See if we have any spaces up front. If so, then skip them and set
    //  the gotSpaces flag.
    //
    if (fReaderMgr->skippedSpace())
    {
        fReaderMgr->skipPastSpaces();
        gotSpace = true;
    }

    // If the next char is a percent, then expand the PERef
    if (!fReaderMgr->skippedChar(chPercent))
       return gotSpace;

    while (true)
    {
       if (!expandPERef(false, inLiteral, inMarkup, false))
          fScanner->emitError(XMLErrs::ExpectedEntityRefName);

       // And skip any more spaces in the expanded value
       if (fReaderMgr->skippedSpace())
       {
          fReaderMgr->skipPastSpaces();
          gotSpace = true;
       }

       if (!fReaderMgr->skippedChar(chPercent))
          break;
    }
    return gotSpace;
}

bool DTDScanner::expandPERef( const   bool    scanExternal
                                , const bool    inLiteral
                                , const bool    inMarkup
                                , const bool    throwEndOfExt)
{
    fScanner->setHasNoDTD(false);
    XMLBufBid bbName(fBufMgr);
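
The repetition-suffix handling in makeRepNode above can be tried outside Xerces. What follows is a minimal, self-contained sketch and not the library's code: `Node`, `NodeType`, and `wrapRepetition` are hypothetical stand-ins for `ContentSpecNode` and `makeRepNode`, plain `char` replaces `XMLCh`, and `std::unique_ptr` ownership replaces the `MemoryManager`-based allocation.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical stand-in for ContentSpecNode::NodeTypes (illustration only).
enum class NodeType { Leaf, ZeroOrOne, OneOrMore, ZeroOrMore };

// Hypothetical stand-in for ContentSpecNode: a type plus one owned operand.
struct Node {
    NodeType type;
    std::unique_ptr<Node> child;  // null for a leaf
};

// If `ch` is a repetition suffix ('?', '+' or '*'), wrap `prev` in a new
// node of the matching type. Otherwise hand `prev` back unchanged.
std::unique_ptr<Node> wrapRepetition(char ch, std::unique_ptr<Node> prev)
{
    NodeType type;
    switch (ch) {
        case '?': type = NodeType::ZeroOrOne;  break;
        case '+': type = NodeType::OneOrMore;  break;
        case '*': type = NodeType::ZeroOrMore; break;
        default:  return prev;  // not a suffix: return the incoming node
    }
    return std::unique_ptr<Node>(new Node{type, std::move(prev)});
}
```

Calling `wrapRepetition('*', std::move(leaf))` yields a `ZeroOrMore` node that owns the leaf, while any other character, such as `','`, returns the original node untouched, mirroring the fall-through `return prevNode;` in makeRepNode.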
