sax2xmlreaderimpl.cpp
From "Source code of Xerces, IBM's XML parsing tool" · C++ code · 1,817 lines total · page 1 of 4
/*
 * Copyright 1999-2004 The Apache Software Foundation.
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *      http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

/*
 * $Log: SAX2XMLReaderImpl.cpp,v $
 * Revision 1.38 2004/09/30 14:07:23 peiyongz
 * setInputBufferSize
 *
 * Revision 1.37 2004/09/28 02:14:14 cargilld
 * Add support for validating annotations.
 *
 * Revision 1.36 2004/09/23 01:09:55 cargilld
 * Add support for generating synthetic XSAnnotations. When a schema component has non-schema attributes and no child attributes create a synthetic XSAnnotation (under feature control) so the non-schema attributes can be recovered under PSVI.
 *
 * Revision 1.35 2004/09/08 13:56:17 peiyongz
 * Apache License Version 2.0
 *
 * Revision 1.34 2004/04/13 16:53:26 peiyongz
 * get/setIdentityConstraintChecking
 *
 * Revision 1.33 2004/01/29 11:46:32 cargilld
 * Code cleanup changes to get rid of various compiler diagnostic messages.
 *
 * Revision 1.32 2003/12/17 00:18:35 cargilld
 * Update to memory management so that the static memory manager (one used to call Initialize) is only for static data.
 *
 * Revision 1.31 2003/11/21 22:38:50 neilg
 * Enable grammar pools and grammar resolvers to manufacture
 * XSModels. This also cleans up handling in the
 * parser classes by eliminating the need to tell
 * the grammar pool that schema compoments need to be produced.
 * Thanks to David Cargill.
 *
 * Revision 1.30 2003/11/06 15:30:07 neilg
 * first part of PSVI/schema component model implementation, thanks to David Cargill. This covers setting the PSVIHandler on parser objects, as well as implementing XSNotation, XSSimpleTypeDefinition, XSIDCDefinition, and most of XSWildcard, XSComplexTypeDefinition, XSElementDeclaration, XSAttributeDeclaration and XSAttributeUse.
 *
 * Revision 1.29 2003/10/30 21:37:31 knoaman
 * Enhanced Entity Resolver Support. Thanks to David Cargill.
 *
 * Revision 1.28 2003/10/01 16:32:38 neilg
 * improve handling of out of memory conditions, bug #23415. Thanks to David Cargill.
 *
 * Revision 1.27 2003/09/16 18:30:54 neilg
 * make Grammar pool be responsible for creating and owning URI string pools. This is one more step towards having grammars be independent of the parsers involved in their creation
 *
 * Revision 1.26 2003/08/13 15:43:24 knoaman
 * Use memory manager when creating SAX exceptions.
 *
 * Revision 1.25 2003/07/31 17:05:48 peiyongz
 * using getGrammar(URI)
 *
 * Revision 1.24 2003/07/10 19:48:24 peiyongz
 * Stateless Grammar: Initialize scanner with grammarResolver,
 *
 * Revision 1.23 2003/06/25 22:36:46 peiyongz
 * to use new GrammarResolver::getGrammar()
 *
 * Revision 1.22 2003/06/20 18:55:54 peiyongz
 * Stateless Grammar Pool :: Part I
 *
 * Revision 1.21 2003/05/18 14:02:05 knoaman
 * Memory manager implementation: pass per instance manager.
 *
 * Revision 1.20 2003/05/16 21:36:59 knoaman
 * Memory manager implementation: Modify constructors to pass in the memory manager.
 *
 * Revision 1.19 2003/05/16 06:01:52 knoaman
 * Partial implementation of the configurable memory manager.
 *
 * Revision 1.18 2003/05/15 18:26:50 knoaman
 * Partial implementation of the configurable memory manager.
 *
 * Revision 1.17 2003/04/17 21:58:50 neilg
 * Adding a new property,
 * http://apache.org/xml/properties/security-manager, with
 * appropriate getSecurityManager/setSecurityManager methods on DOM
 * and SAX parsers. Also adding a new SecurityManager class.
 *
 * The purpose of these modifications is to permit applications a
 * means to have the parser reject documents whose processing would
 * otherwise consume large amounts of system resources. Malicious
 * use of such documents could be used to launch a denial-of-service
 * attack against a system running the parser. Initially, the
 * SecurityManager only knows about attacks that can result from
 * exponential entity expansion; this is the only known attack that
 * involves processing a single XML document. Other, simlar attacks
 * can be launched if arbitrary schemas may be parsed; there already
 * exist means (via use of the EntityResolver interface) by which
 * applications can deny processing of untrusted schemas. In future,
 * the SecurityManager will be expanded to take these other exploits
 * into account.
 *
 * Adding SecurityManager support
 *
 * Revision 1.16 2003/01/03 20:09:36 tng
 * New feature StandardUriConformant to force strict standard uri conformance.
 *
 * Revision 1.15 2002/12/27 16:16:51 knoaman
 * Set scanner options and handlers.
 *
 * Revision 1.14 2002/12/11 22:14:54 knoaman
 * Performance: no need to use temporary buffer to hold namespace value.
 *
 * Revision 1.13 2002/12/04 01:57:09 knoaman
 * Scanner re-organization.
 *
 * Revision 1.12 2002/11/04 14:57:03 tng
 * C++ Namespace Support.
 *
 * Revision 1.11 2002/09/24 20:00:32 tng
 * Performance: use XMLString::equals instead of XMLString::compareString
 *
 * Revision 1.10 2002/08/14 15:20:38 knoaman
 * [Bug 3111] Problem with LexicalHandler::startDTD() and LexicalHandler::endDTD().
 *
 * Revision 1.9 2002/07/11 18:27:20 knoaman
 * Grammar caching/preparsing - initial implementation.
 *
 * Revision 1.8 2002/06/17 15:41:15 tng
 * To be consistent, SAX2 is updated with:
 * 1. the progressive parse methods should use the fReuseGrammar flag set from setFeature instead of using parameter
 * 2. add feature "http://apache.org/xml/features/continue-after-fatal-error", and users should use setFeature instead of setExitOnFirstFatalError
 * 3. add feature "http://apache.org/xml/features/validation-error-as-fatal", and users should use setFeature instead of setValidationConstraintFatal
 *
 * Revision 1.7 2002/05/30 16:20:09 tng
 * Add feature to optionally ignore external DTD.
 *
 * Revision 1.6 2002/05/29 21:37:47 knoaman
 * Add baseURI to resolveEntity to support DOMInputSource.
 *
 * Revision 1.5 2002/05/28 20:44:14 tng
 * [Bug 9104] prefixes dissapearing when schema validation turned on.
 *
 * Revision 1.4 2002/05/27 18:39:21 tng
 * To get ready for 64 bit large file, use XMLSSize_t to represent line and column number.
 *
 * Revision 1.3 2002/05/22 20:53:41 knoaman
 * Prepare for DOM L3 :
 * - Make use of the XMLEntityHandler/XMLErrorReporter interfaces, instead of using
 * EntityHandler/ErrorHandler directly.
 * - Add 'AbstractDOMParser' class to provide common functionality for XercesDOMParser
 * and DOMBuilder.
 *
 * Revision 1.2 2002/02/13 16:09:24 knoaman
 * Move SAX2 features/properties names constants to XMLUni.
 *
 * Revision 1.1.1.1 2002/02/01 22:22:06 peiyongz
 * sane_include
 *
 * Revision 1.25 2002/01/28 17:47:41 knoaman
 * Some SAX calls were not passed to the LexicalHandler.
 *
 * Revision 1.24 2002/01/28 17:08:47 knoaman
 * SAX2-ext's DeclHandler support.
 *
 * Revision 1.23 2002/01/28 16:29:21 knoaman
 * The namespace-prefixes feature in SAX2 should be off by default.
 *
 * Revision 1.22 2002/01/24 16:30:34 tng
 * [Bug 3111] Problem with LexicalHandler::startDTD() and LexicalHandler::endDTD() .
 *
 * Revision 1.21 2001/12/21 18:03:25 tng
 * [Bug 1833] LexicalHandler::startDTD not called correctly.
 *
 * Revision 1.20 2001/11/20 18:51:44 tng
 * Schema: schemaLocation and noNamespaceSchemaLocation to be specified outside the instance document. New methods setExternalSchemaLocation and setExternalNoNamespaceSchemaLocation are added (for SAX2, two new properties are added).
 *
 * Revision 1.19 2001/10/25 19:46:15 tng
 * Comment outside root element should also be reported.
 *
 * Revision 1.18 2001/09/12 13:03:43 tng
 * [Bug 3155] SAX2 does not offer progressive parse.
 *
 * Revision 1.17 2001/08/02 19:00:46 tng
 * [Bug 1329] SAX2XMLReaderImpl leaks XMLBuffers.
 *
 * Revision 1.16 2001/08/01 19:11:02 tng
 * Add full schema constraint checking flag to the samples and the parser.
 *
 * Revision 1.15 2001/06/27 17:39:50 knoaman
 * Fix for bug #2353.
 *
 * Revision 1.14 2001/06/19 16:45:08 tng
 * Add installAdvDocHandler to SAX2XMLReader as the code is there already.
 *
 * Revision 1.13 2001/06/03 19:26:19 jberry
 * Add support for querying error count following parse; enables simple parse without requiring error handler.
 *
 * Revision 1.12 2001/05/11 13:26:21 tng
 * Copyright update.
 *
 * Revision 1.11 2001/05/03 20:34:33 tng
 * Schema: SchemaValidator update
 *
 * Revision 1.10 2001/03/30 16:46:57 tng
 * Schema: Use setDoSchema instead of setSchemaValidation which makes more sense.
 *
 * Revision 1.9 2001/03/21 21:56:08 tng
 * Schema: Add Schema Grammar, Schema Validator, and split the DTDValidator into DTDValidator, DTDScanner, and DTDGrammar.
 *
 * Revision 1.8 2001/02/15 15:56:29 tng
 * Schema: Add setSchemaValidation and getSchemaValidation for DOMParser and SAXParser.
 * Add feature "http://apache.org/xml/features/validation/schema" for SAX2XMLReader.
 * New data field fSchemaValidation in XMLScanner as the flag.
 *
 * Revision 1.7 2001/01/15 21:26:33 tng
 * Performance Patches by David Bertoni.
 *
 * Details: (see xerces-c-dev mailing Jan 14)
 * XMLRecognizer.cpp: the internal encoding string XMLUni::fgXMLChEncodingString
 * was going through this function numerous times. As a result, the top hot-spot
 * for the parse was _wcsicmp(). The real problem is that the Microsofts wide string
 * functions are unbelievably slow. For things like encodings, it might be
 * better to use a special comparison function that only considers a-z and
 * A-Z as characters with case. This works since the character set for
 * encodings is limit to printable ASCII characters.
 *
 * XMLScanner2.cpp: This also has some case-sensitive vs. insensitive compares.
 * They are also much faster. The other tweak is to only make a copy of an attribute
 * string if it needs to be split. And then, the strategy is to try to use a
 * stack-based buffer, rather than a dynamically-allocated one.
 *
 * SAX2XMLReaderImpl.cpp: Again, more case-sensitive vs. insensitive comparisons.
 *
 * KVStringPair.cpp & hpp: By storing the size of the allocation, the storage can
 * likely be re-used many times, cutting down on dynamic memory allocations.
 *
 * XMLString.hpp: a more efficient implementation of stringLen().
 *
 * DTDValidator.cpp: another case of using a stack-based buffer when possible
 *
 * These patches made a big difference in parse time in some of our test
 * files, especially the ones are very attribute-heavy.
 *
 * Revision 1.6 2000/12/22 20:41:52 tng
 * XMLUni::fgEmptyString which is defined as "EMPTY" is incorrectly used as an empty string; in fact XMLUni::fgZeroLenString should be used instead
 *
 * Revision 1.5 2000/12/22 15:16:51 tng
 * SAX2-ext's LexicalHandler support added by David Bertoni.
 *
 * Revision 1.4 2000/08/09 23:39:58 jpolast
 * should be namespace-prefixes; not namespaces-prefixes
 *
 * Revision 1.3 2000/08/09 22:16:12 jpolast
 * many conformance & stability changes:
 *   - ContentHandler::resetDocument() removed
 *   - attrs param of ContentHandler::startDocument() made const
 *   - SAXExceptions thrown now have msgs
 *   - removed duplicate function signatures that had 'const'
 *       [ eg: getContentHander() ]
 *   - changed getFeature and getProperty to apply to const objs
 *   - setProperty now takes a void* instead of const void*
 *   - SAX2XMLReaderImpl does not inherit from SAXParser anymore
 *   - Reuse Validator (http://apache.org/xml/features/reuse-validator) implemented
 *   - Features & Properties now read-only during parse
 *
 * Revision 1.2 2000/08/07 22:53:44 jpolast
 * fixes for when 'namespaces' feature is turned off:
 *   * namespaces-prefixes only used when namespaces is on
 *   * URIs not looked up when namespaces is off, blank string instead
 *   * default validation scheme is validation on, auto-validation off.
 *
 * Revision 1.1 2000/08/02 18:04:41 jpolast
 * initial checkin of sax2 implemenation
 * submitted by Simon Fell (simon@fell.com)
 * and Joe Polastre (jpolast@apache.org)
 *
 *
 */

#include <xercesc/util/IOException.hpp>
#include <xercesc/util/XMLChTranscoder.hpp>
#include <xercesc/util/RefStackOf.hpp>
#include <xercesc/util/XMLUniDefs.hpp>
#include <xercesc/util/Janitor.hpp>
#include <xercesc/sax2/ContentHandler.hpp>
#include <xercesc/sax2/LexicalHandler.hpp>
#include <xercesc/sax2/DeclHandler.hpp>
#include <xercesc/sax/DTDHandler.hpp>
#include <xercesc/sax/ErrorHandler.hpp>
#include <xercesc/sax/EntityResolver.hpp>
#include <xercesc/sax/SAXParseException.hpp>
#include <xercesc/sax/SAXException.hpp>
#include <xercesc/internal/XMLScannerResolver.hpp>
#include <xercesc/parsers/SAX2XMLReaderImpl.hpp>
#include <xercesc/validators/common/GrammarResolver.hpp>
#include <xercesc/framework/XMLGrammarPool.hpp>
#include <xercesc/framework/XMLSchemaDescription.hpp>
#include <xercesc/util/OutOfMemoryException.hpp>
#include <xercesc/util/XMLEntityResolver.hpp>
#include <string.h>

XERCES_CPP_NAMESPACE_BEGIN

const XMLCh gDTDEntityStr[] =
{
    chOpenSquare, chLatin_d, chLatin_t, chLatin_d, chCloseSquare, chNull
};

SAX2XMLReaderImpl::SAX2XMLReaderImpl(MemoryManager* const  manager
                                   , XMLGrammarPool* const gramPool):

    fNamespacePrefix(false)
    , fAutoValidation(false)
    , fValidation(true)
    , fParseInProgress(false)
    , fHasExternalSubset(false)
    , fElemDepth(0)
    , fAdvDHCount(0)
    , fAdvDHListSize(32)
    , fDocHandler(0)
    , fTempAttrVec(0)
    , fPrefixes(0)
    , fPrefixCounts(0)
    , fDTDHandler(0)
    , fEntityResolver(0)
    , fXMLEntityResolver(0)
    , fErrorHandler(0)
    , fPSVIHandler(0)
    , fLexicalHandler(0)
    , fDeclHandler(0)
    , fAdvDHList(0)
    , fScanner(0)
    , fGrammarResolver(0)
    , fURIStringPool(0)
    , fValidator(0)
    , fMemoryManager(manager)
    , fGrammarPool(gramPool)
    , fStringBuffers(manager)
{
    try
    {
        initialize();
    }
    catch(const OutOfMemoryException&)
    {
        throw;
    }
    catch(...)
    {
        cleanUp();
        throw;
    }
}

SAX2XMLReaderImpl::~SAX2XMLReaderImpl()
{
    cleanUp();
}

// ---------------------------------------------------------------------------
//  SAX2XMLReaderImpl: Initialize/Cleanup methods
// ---------------------------------------------------------------------------
void SAX2XMLReaderImpl::initialize()
{
    // Create grammar resolver and string pool that we pass to the scanner
    fGrammarResolver = new (fMemoryManager) GrammarResolver(fGrammarPool, fMemoryManager);
    fURIStringPool = fGrammarResolver->getStringPool();

    //  Create a scanner and tell it what validator to use. Then set us
    //  as the document event handler so we can fill the DOM document.
    fScanner = XMLScannerResolver::getDefaultScanner(0, fGrammarResolver, fMemoryManager);
    fScanner->setURIStringPool(fURIStringPool);

    // Create the initial advanced handler list array and zero it out
    fAdvDHList = (XMLDocumentHandler**) fMemoryManager->allocate
    (
        fAdvDHListSize * sizeof(XMLDocumentHandler*)
    );//new XMLDocumentHandler*[fAdvDHListSize];
    memset(fAdvDHList, 0, sizeof(void*) * fAdvDHListSize);

    // SAX2 default is for namespaces (feature http://xml.org/sax/features/namespaces) to be on
    setDoNamespaces(true) ;

    // default: schema is on
    setDoSchema(true);

    fPrefixes    = new (fMemoryManager) RefStackOf<XMLBuffer> (10, false, fMemoryManager) ;
    fTempAttrVec = new (fMemoryManager) RefVectorOf<XMLAttr>  (10, false, fMemoryManager) ;
    fPrefixCounts = new (fMemoryManager) ValueStackOf<unsigned int>(10, fMemoryManager) ;
}

void SAX2XMLReaderImpl::cleanUp()
{
    fMemoryManager->deallocate(fAdvDHList);//delete [] fAdvDHList;
    delete fScanner;
    delete fPrefixes;
    delete fTempAttrVec;
    delete fPrefixCounts;
    delete fGrammarResolver;
    // grammar pool must do this
    //delete fURIStringPool;
}

// ---------------------------------------------------------------------------
//  SAX2XMLReaderImpl: Advanced document handler list maintenance methods
// ---------------------------------------------------------------------------
void SAX2XMLReaderImpl::installAdvDocHandler(XMLDocumentHandler* const toInstall)
{
    // See if we need to expand and do so now if needed
    if (fAdvDHCount == fAdvDHListSize)
    {
        // Calc a new size and allocate the new temp buffer
        const unsigned int newSize = (unsigned int)(fAdvDHListSize * 1.5);
        XMLDocumentHandler** newList = (XMLDocumentHandler**) fMemoryManager->allocate
        (
            newSize * sizeof(XMLDocumentHandler*)
        );//new XMLDocumentHandler*[newSize];

        // Copy over the old data to the new list and zero out the rest
        memcpy(newList, fAdvDHList, sizeof(void*) * fAdvDHListSize);
        memset
        (
            &newList[fAdvDHListSize]
            , 0
            , sizeof(void*) * (newSize - fAdvDHListSize)
        );

        // And now clean up the old array and store the new stuff
        fMemoryManager->deallocate(fAdvDHList);//delete [] fAdvDHList;
        fAdvDHList = newList;
        fAdvDHListSize = newSize;
    }

    // Add this new guy into the empty slot
    fAdvDHList[fAdvDHCount++] = toInstall;

    //
    //  Install ourself as the document handler with the scanner. We might
    //  already be, but its not worth checking, just do it.
    //
    fScanner->setDocHandler(this);
}

bool SAX2XMLReaderImpl::removeAdvDocHandler(XMLDocumentHandler* const toRemove)
{
    // If our count is zero, can't be any installed
    if (!fAdvDHCount)
        return false;