saxparser.cpp

From "Source code of Xerces, IBM's XML parsing tool" · C++ code · 1,546 lines in total · page 1 of 3

/*
 * Copyright 1999-2004 The Apache Software Foundation.
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *      http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

/*
 * $Log: SAXParser.cpp,v $
 * Revision 1.36  2004/09/29 19:00:29  peiyongz
 * [jira1207] --patch from Dan Rosen
 *
 * Revision 1.35  2004/09/28 02:14:14  cargilld
 * Add support for validating annotations.
 *
 * Revision 1.34  2004/09/23 01:09:55  cargilld
 * Add support for generating synthetic XSAnnotations.  When a schema
 * component has non-schema attributes and no child attributes create a
 * synthetic XSAnnotation (under feature control) so the non-schema
 * attributes can be recovered under PSVI.
 *
 * Revision 1.33  2004/09/08 13:56:17  peiyongz
 * Apache License Version 2.0
 *
 * Revision 1.32  2004/04/13 16:53:26  peiyongz
 * get/setIdentityConstraintChecking
 *
 * Revision 1.31  2004/01/29 11:46:32  cargilld
 * Code cleanup changes to get rid of various compiler diagnostic messages.
 *
 * Revision 1.30  2003/12/17 00:18:35  cargilld
 * Update to memory management so that the static memory manager (one used
 * to call Initialize) is only for static data.
 *
 * Revision 1.29  2003/11/21 22:38:50  neilg
 * Enable grammar pools and grammar resolvers to manufacture
 * XSModels.  This also cleans up handling in the
 * parser classes by eliminating the need to tell
 * the grammar pool that schema components need to be produced.
 * Thanks to David Cargill.
 *
 * Revision 1.28  2003/11/06 15:30:07  neilg
 * first part of PSVI/schema component model implementation, thanks to
 * David Cargill.  This covers setting the PSVIHandler on parser objects,
 * as well as implementing XSNotation, XSSimpleTypeDefinition,
 * XSIDCDefinition, and most of XSWildcard, XSComplexTypeDefinition,
 * XSElementDeclaration, XSAttributeDeclaration and XSAttributeUse.
 *
 * Revision 1.27  2003/10/30 21:37:31  knoaman
 * Enhanced Entity Resolver Support. Thanks to David Cargill.
 *
 * Revision 1.26  2003/10/01 16:32:38  neilg
 * improve handling of out of memory conditions, bug #23415.  Thanks to David Cargill.
 *
 * Revision 1.25  2003/09/16 18:30:54  neilg
 * make Grammar pool be responsible for creating and owning URI string
 * pools.  This is one more step towards having grammars be independent of
 * the parsers involved in their creation
 *
 * Revision 1.24  2003/08/13 15:43:24  knoaman
 * Use memory manager when creating SAX exceptions.
 *
 * Revision 1.23  2003/07/31 17:05:48  peiyongz
 * using getGrammar(URI)
 *
 * Revision 1.22  2003/07/10 19:48:24  peiyongz
 * Stateless Grammar: Initialize scanner with grammarResolver,
 *
 * Revision 1.21  2003/06/25 22:36:46  peiyongz
 * to use new GrammarResolver::getGrammar()
 *
 * Revision 1.20  2003/06/20 18:55:54  peiyongz
 * Stateless Grammar Pool :: Part I
 *
 * Revision 1.19  2003/05/18 14:02:05  knoaman
 * Memory manager implementation: pass per instance manager.
 *
 * Revision 1.18  2003/05/16 21:36:59  knoaman
 * Memory manager implementation: Modify constructors to pass in the memory manager.
 *
 * Revision 1.17  2003/05/15 18:26:50  knoaman
 * Partial implementation of the configurable memory manager.
 *
 * Revision 1.16  2003/04/17 21:58:50  neilg
 * Adding a new property,
 * http://apache.org/xml/properties/security-manager, with
 * appropriate getSecurityManager/setSecurityManager methods on DOM
 * and SAX parsers.  Also adding a new SecurityManager class.
 *
 * The purpose of these modifications is to permit applications a
 * means to have the parser reject documents whose processing would
 * otherwise consume large amounts of system resources.  Malicious
 * use of such documents could be used to launch a denial-of-service
 * attack against a system running the parser.  Initially, the
 * SecurityManager only knows about attacks that can result from
 * exponential entity expansion; this is the only known attack that
 * involves processing a single XML document.  Other, similar attacks
 * can be launched if arbitrary schemas may be parsed; there already
 * exist means (via use of the EntityResolver interface) by which
 * applications can deny processing of untrusted schemas.  In future,
 * the SecurityManager will be expanded to take these other exploits
 * into account.
 *
 * add security manager
 *
 * Revision 1.15  2003/02/04 19:27:43  knoaman
 * Performance: use global buffer to eliminate repetitive memory creation/deletion.
 *
 * Revision 1.14  2003/01/09 19:07:08  tng
 * [Bug 15802] Add "const" qualifier to getURIText.
 *
 * Revision 1.13  2003/01/03 20:09:36  tng
 * New feature StandardUriConformant to force strict standard uri conformance.
 *
 * Revision 1.12  2002/12/27 16:16:51  knoaman
 * Set scanner options and handlers.
 *
 * Revision 1.11  2002/12/23 15:23:18  knoaman
 * Added a public api to various parsers to return the src offset within the input
 * source.
 *
 * Revision 1.10  2002/12/04 01:57:09  knoaman
 * Scanner re-organization.
 *
 * Revision 1.9  2002/11/04 14:57:03  tng
 * C++ Namespace Support.
 *
 * Revision 1.8  2002/08/14 15:20:38  knoaman
 * [Bug 3111] Problem with LexicalHandler::startDTD() and LexicalHandler::endDTD().
 *
 * Revision 1.7  2002/07/11 18:27:04  knoaman
 * Grammar caching/preparsing - initial implementation.
 *
 * Revision 1.6  2002/05/30 16:20:09  tng
 * Add feature to optionally ignore external DTD.
 *
 * Revision 1.5  2002/05/29 21:37:47  knoaman
 * Add baseURI to resolveEntity to support DOMInputSource.
 *
 * Revision 1.4  2002/05/28 20:44:14  tng
 * [Bug 9104] prefixes disappearing when schema validation turned on.
 *
 * Revision 1.3  2002/05/27 18:39:21  tng
 * To get ready for 64 bit large file, use XMLSSize_t to represent line and column number.
 *
 * Revision 1.2  2002/05/22 20:53:41  knoaman
 * Prepare for DOM L3 :
 * - Make use of the XMLEntityHandler/XMLErrorReporter interfaces, instead of using
 * EntityHandler/ErrorHandler directly.
 * - Add 'AbstractDOMParser' class to provide common functionality for XercesDOMParser
 * and DOMBuilder.
 *
 * Revision 1.1.1.1  2002/02/01 22:22:07  peiyongz
 * sane_include
 *
 * Revision 1.23  2001/11/20 18:51:44  tng
 * Schema: schemaLocation and noNamespaceSchemaLocation to be specified
 * outside the instance document.  New methods setExternalSchemaLocation and
 * setExternalNoNamespaceSchemaLocation are added (for SAX2, two new
 * properties are added).
 *
 * Revision 1.22  2001/10/25 19:46:15  tng
 * Comment outside root element should also be reported.
 *
 * Revision 1.21  2001/08/01 19:11:02  tng
 * Add full schema constraint checking flag to the samples and the parser.
 *
 * Revision 1.20  2001/06/03 19:26:20  jberry
 * Add support for querying error count following parse; enables simple
 * parse without requiring error handler.
 *
 * Revision 1.19  2001/05/11 13:26:22  tng
 * Copyright update.
 *
 * Revision 1.18  2001/05/03 19:09:23  knoaman
 * Support Warning/Error/FatalError messaging.
 * Validity constraints errors are treated as errors, with the ability by user to set
 * validity constraints as fatal errors.
 *
 * Revision 1.17  2001/03/30 16:46:57  tng
 * Schema: Use setDoSchema instead of setSchemaValidation which makes more sense.
 *
 * Revision 1.16  2001/03/21 21:56:08  tng
 * Schema: Add Schema Grammar, Schema Validator, and split the DTDValidator
 * into DTDValidator, DTDScanner, and DTDGrammar.
 *
 * Revision 1.15  2001/02/15 15:56:29  tng
 * Schema: Add setSchemaValidation and getSchemaValidation for DOMParser and SAXParser.
 * Add feature "http://apache.org/xml/features/validation/schema" for SAX2XMLReader.
 * New data field  fSchemaValidation in XMLScanner as the flag.
 *
 * Revision 1.14  2000/09/05 23:38:26  andyh
 * Added advanced callback support for XMLDecl()
 *
 * Revision 1.13  2000/06/19 18:12:56  rahulj
 * Suppress the comments, characters, ignoreableWhitespaces before
 * root element. Only allow the PI's to get through. Still need to come
 * to a consensus on this.
 *
 * Revision 1.12  2000/06/17 02:00:55  rahulj
 * Also pass any PI's, comment's, character's occurring before root
 * element to the registered document Handler. Defect identified
 * by John Smirl and Rich Taylor.
 *
 * Revision 1.11  2000/05/15 22:31:18  andyh
 * Replace #include<memory.h> with <string.h> everywhere.
 *
 * Revision 1.10  2000/04/12 22:58:30  roddey
 * Added support for 'auto validate' mode.
 *
 * Revision 1.9  2000/04/11 19:17:58  roddey
 * If a SAX error handler is installed, then the resetErrors() event handler
 * should call the one on the installed SAX error handler.
 *
 * Revision 1.8  2000/04/05 18:56:17  roddey
 * Init the fDTDHandler member. Enable installation of DTDHandler
 * on SAX parser.
 *
 * Revision 1.7  2000/03/03 01:29:34  roddey
 * Added a scanReset()/parseReset() method to the scanner and
 * parsers, to allow for reset after early exit from a progressive parse.
 * Added calls to new Terminate() call to all of the samples. Improved
 * documentation in SAX and DOM parsers.
 *
 * Revision 1.6  2000/03/02 19:54:33  roddey
 * This checkin includes many changes done while waiting for the
 * 1.1.0 code to be finished. I can't list them all here, but a list is
 * available elsewhere.
 *
 * Revision 1.5  2000/02/17 03:54:26  rahulj
 * Added some new getters to query the parser state and
 * clarified the documentation.
 *
 * Revision 1.4  2000/02/06 07:47:56  rahulj
 * Year 2K copyright swat.
 *
 * Revision 1.3  2000/01/12 00:15:22  roddey
 * Changes to deal with multiply nested, relative pathed, entities and to deal
 * with the new URL class changes.
 *
 * Revision 1.2  1999/12/15 19:57:48  roddey
 * Got rid of redundant 'const' on boolean return value. Some compilers choke
 * on this and it's useless.
 *
 * Revision 1.1.1.1  1999/11/09 01:07:50  twl
 * Initial checkin
 *
 * Revision 1.6  1999/11/08 20:44:53  rahul
 * Swat for adding in Product name and CVS comment log variable.
 *
 */

// ---------------------------------------------------------------------------
//  Includes
// ---------------------------------------------------------------------------
#include <xercesc/parsers/SAXParser.hpp>
#include <xercesc/internal/XMLScannerResolver.hpp>
#include <xercesc/framework/XMLValidator.hpp>
#include <xercesc/util/IOException.hpp>
#include <xercesc/sax/DocumentHandler.hpp>
#include <xercesc/sax/DTDHandler.hpp>
#include <xercesc/sax/ErrorHandler.hpp>
#include <xercesc/sax/EntityResolver.hpp>
#include <xercesc/sax/SAXParseException.hpp>
#include <xercesc/validators/common/GrammarResolver.hpp>
#include <xercesc/framework/XMLGrammarPool.hpp>
#include <xercesc/framework/XMLSchemaDescription.hpp>
#include <xercesc/util/Janitor.hpp>
#include <xercesc/util/OutOfMemoryException.hpp>
#include <xercesc/util/XMLEntityResolver.hpp>
#include <string.h>

XERCES_CPP_NAMESPACE_BEGIN

// ---------------------------------------------------------------------------
//  SAXParser: Constructors and Destructor
// ---------------------------------------------------------------------------
SAXParser::SAXParser( XMLValidator* const   valToAdopt
                    , MemoryManager* const  manager
                    , XMLGrammarPool* const gramPool):

    fParseInProgress(false)
    , fElemDepth(0)
    , fAdvDHCount(0)
    , fAdvDHListSize(32)
    , fDocHandler(0)
    , fDTDHandler(0)
    , fEntityResolver(0)
    , fXMLEntityResolver(0)
    , fErrorHandler(0)
    , fPSVIHandler(0)
    , fAdvDHList(0)
    , fScanner(0)
    , fGrammarResolver(0)
    , fURIStringPool(0)
    , fValidator(valToAdopt)
    , fMemoryManager(manager)
    , fGrammarPool(gramPool)
    , fElemQNameBuf(1023, manager)
{
    try
    {
        initialize();
    }
    catch(const OutOfMemoryException&)
    {
        throw;
    }
    catch(...)
    {
        cleanUp();
        throw;
    }
}

SAXParser::~SAXParser()
{
    cleanUp();
}

// ---------------------------------------------------------------------------
//  SAXParser: Initialize/CleanUp methods
// ---------------------------------------------------------------------------
void SAXParser::initialize()
{
    // Create grammar resolver and string pool to pass to scanner
    fGrammarResolver = new (fMemoryManager) GrammarResolver(fGrammarPool, fMemoryManager);
    fURIStringPool = fGrammarResolver->getStringPool();

    // Create our scanner and tell it what validator to use
    fScanner = XMLScannerResolver::getDefaultScanner(fValidator, fGrammarResolver, fMemoryManager);
    fScanner->setURIStringPool(fURIStringPool);

    // Create the initial advanced handler list array and zero it out
    fAdvDHList = (XMLDocumentHandler**) fMemoryManager->allocate
    (
        fAdvDHListSize * sizeof(XMLDocumentHandler*)
    );//new XMLDocumentHandler*[fAdvDHListSize];
    memset(fAdvDHList, 0, sizeof(void*) * fAdvDHListSize);
}

void SAXParser::cleanUp()
{
    fMemoryManager->deallocate(fAdvDHList);//delete [] fAdvDHList;
    delete fScanner;
    delete fGrammarResolver;
    // grammar pool must do this
    //delete fURIStringPool;

    if (fValidator)
        delete fValidator;
}

// ---------------------------------------------------------------------------
//  SAXParser: Advanced document handler list maintenance methods
// ---------------------------------------------------------------------------
void SAXParser::installAdvDocHandler(XMLDocumentHandler* const toInstall)
{
    // See if we need to expand and do so now if needed
    if (fAdvDHCount == fAdvDHListSize)
    {
        // Calc a new size and allocate the new temp buffer
        const unsigned int newSize = (unsigned int)(fAdvDHListSize * 1.5);
        XMLDocumentHandler** newList = (XMLDocumentHandler**) fMemoryManager->allocate
        (
            newSize * sizeof(XMLDocumentHandler*)
        );//new XMLDocumentHandler*[newSize];

        // Copy over the old data to the new list and zero out the rest
        memcpy(newList, fAdvDHList, sizeof(void*) * fAdvDHListSize);
        memset
        (
            &newList[fAdvDHListSize]
            , 0
            , sizeof(void*) * (newSize - fAdvDHListSize)
        );

        // And now clean up the old array and store the new stuff
        fMemoryManager->deallocate(fAdvDHList);//delete [] fAdvDHList;
        fAdvDHList = newList;
        fAdvDHListSize = newSize;
    }

    // Add this new guy into the empty slot
    fAdvDHList[fAdvDHCount++] = toInstall;

    //
    //  Install ourself as the document handler with the scanner. We might
    //  already be, but it's not worth checking, just do it.
    //
    fScanner->setDocHandler(this);
}

bool SAXParser::removeAdvDocHandler(XMLDocumentHandler* const toRemove)
{
    // If our count is zero, can't be any installed
    if (!fAdvDHCount)
        return false;

    //
    //  Search the array until we find this handler. If we find a null entry
    //  first, we can stop there because the list is kept contiguous.
    //
    unsigned int index;
    for (index = 0; index < fAdvDHCount; index++)
    {
        //
        //  We found it. We have to keep the list contiguous, so we have to
        //  copy down any used elements after this one.
        //
        if (fAdvDHList[index] == toRemove)
        {
            //
            //  Optimize if only one entry (pretty common). Otherwise, we
            //  have to copy them down to compact them.
            //
            if (fAdvDHCount > 1)
            {
                // Shift each later entry down one slot, advancing as we go
                index++;
                while (index < fAdvDHCount)
                {
                    fAdvDHList[index - 1] = fAdvDHList[index];
                    index++;
                }
            }

            // Bump down the count and zero out the last one
            fAdvDHCount--;
            fAdvDHList[fAdvDHCount] = 0;

            //
            //  If this leaves us with no advanced handlers and there is
            //  no SAX doc handler installed on us, then remove us from the
            //  scanner as the document handler.
            //
            if (!fAdvDHCount && !fDocHandler)
                fScanner->setDocHandler(0);

            return true;
        }
    }

    // Never found it
    return false;
}

// ---------------------------------------------------------------------------
//  SAXParser: Getter methods
// ---------------------------------------------------------------------------
const XMLValidator& SAXParser::getValidator() const
{
    return *fScanner->getValidator();
}

bool SAXParser::getDoNamespaces() const
{
    return fScanner->getDoNamespaces();
}

bool SAXParser::getGenerateSyntheticAnnotations() const
{
    return fScanner->getGenerateSyntheticAnnotations();
}

bool SAXParser::getValidateAnnotations() const
{
    return fScanner->getValidateAnnotations();
}

bool SAXParser::getExitOnFirstFatalError() const
{
    return fScanner->getExitOnFirstFatal();
}

bool SAXParser::getValidationConstraintFatal() const
{
    return fScanner->getValidationConstraintFatal();
}

SAXParser::ValSchemes SAXParser::getValidationScheme() const
{
    const XMLScanner::ValSchemes scheme = fScanner->getValidationScheme();

    if (scheme == XMLScanner::Val_Always)
        return Val_Always;
    else if (scheme == XMLScanner::Val_Never)
        return Val_Never;

    return Val_Auto;
}

bool SAXParser::getDoSchema() const
{
    return fScanner->getDoSchema();
}

bool SAXParser::getValidationSchemaFullChecking() const
{
    return fScanner->getValidationSchemaFullChecking();
}

bool SAXParser::getIdentityConstraintChecking() const
{
    return fScanner->getIdentityConstraintChecking();
}

int SAXParser::getErrorCount() const
{
    return fScanner->getErrorCount();
}

XMLCh* SAXParser::getExternalSchemaLocation() const
{
    return fScanner->getExternalSchemaLocation();
}

XMLCh* SAXParser::getExternalNoNamespaceSchemaLocation() const
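
Side note on the advanced handler list: installAdvDocHandler and removeAdvDocHandler above keep a contiguous array of handler pointers that grows when full and is compacted on removal. The sketch below illustrates that technique on its own, without any Xerces dependency. The names `Handler` and `HandlerList` are hypothetical stand-ins and not part of Xerces, and the growth formula adds 1 so that a capacity of 1 still grows, which is an assumption of this sketch rather than what the listing does.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>

// Hypothetical stand-in for XMLDocumentHandler; only its address matters here.
struct Handler {};

// Minimal sketch of a contiguous, growable list of non-owned handler pointers.
class HandlerList
{
public:
    explicit HandlerList(std::size_t capacity = 2)
        : fCount(0)
        , fCapacity(capacity > 0 ? capacity : 1)
        , fSlots(new Handler*[fCapacity]())   // value-initialized: all null
    {
    }

    // Append a handler, growing the array by roughly 1.5x when it is full.
    void install(Handler* toInstall)
    {
        if (fCount == fCapacity)
        {
            // The +1 is this sketch's choice; it guarantees progress at capacity 1
            const std::size_t newCapacity = fCapacity + fCapacity / 2 + 1;
            std::unique_ptr<Handler*[]> grown(new Handler*[newCapacity]());
            for (std::size_t i = 0; i < fCount; ++i)
                grown[i] = fSlots[i];
            fSlots = std::move(grown);
            fCapacity = newCapacity;
        }
        fSlots[fCount++] = toInstall;
    }

    // Remove the first matching handler and shift later entries down one slot.
    bool remove(Handler* toRemove)
    {
        for (std::size_t i = 0; i < fCount; ++i)
        {
            if (fSlots[i] == toRemove)
            {
                for (std::size_t j = i + 1; j < fCount; ++j)
                    fSlots[j - 1] = fSlots[j];
                fSlots[--fCount] = nullptr;   // clear the vacated last slot
                return true;
            }
        }
        return false;   // never found it
    }

    std::size_t count() const { return fCount; }
    std::size_t capacity() const { return fCapacity; }

    Handler* at(std::size_t index) const
    {
        assert(index < fCount);
        return fSlots[index];
    }

private:
    std::size_t                 fCount;
    std::size_t                 fCapacity;
    std::unique_ptr<Handler*[]> fSlots;
};
```

The inner loop in `remove` advances its index on every copy, so the shift always terminates. A caller would typically install a few handlers, remove one, and rely on the remaining entries staying in their original relative order.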
