📄 xmlreader.hpp
/*
 * The Apache Software License, Version 1.1
 *
 * Copyright (c) 1999-2001 The Apache Software Foundation.  All rights
 * reserved.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions
 * are met:
 *
 * 1. Redistributions of source code must retain the above copyright
 *    notice, this list of conditions and the following disclaimer.
 *
 * 2. Redistributions in binary form must reproduce the above copyright
 *    notice, this list of conditions and the following disclaimer in
 *    the documentation and/or other materials provided with the
 *    distribution.
 *
 * 3. The end-user documentation included with the redistribution,
 *    if any, must include the following acknowledgment:
 *       "This product includes software developed by the
 *        Apache Software Foundation (http://www.apache.org/)."
 *    Alternately, this acknowledgment may appear in the software itself,
 *    if and wherever such third-party acknowledgments normally appear.
 *
 * 4. The names "Xerces" and "Apache Software Foundation" must
 *    not be used to endorse or promote products derived from this
 *    software without prior written permission. For written
 *    permission, please contact apache\@apache.org.
 *
 * 5. Products derived from this software may not be called "Apache",
 *    nor may "Apache" appear in their name, without prior written
 *    permission of the Apache Software Foundation.
 *
 * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED
 * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
 * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
 * DISCLAIMED.  IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR
 * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
 * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
 * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF
 * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
 * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
 * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT
 * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
 * SUCH DAMAGE.
 * ====================================================================
 *
 * This software consists of voluntary contributions made by many
 * individuals on behalf of the Apache Software Foundation, and was
 * originally based on software copyright (c) 1999, International
 * Business Machines, Inc., http://www.ibm.com .  For more information
 * on the Apache Software Foundation, please see
 * <http://www.apache.org/>.
 */

/*
 * $Log: XMLReader.hpp,v $
 * Revision 1.15  2004/01/29 11:46:30  cargilld
 * Code cleanup changes to get rid of various compiler diagnostic messages.
 *
 * Revision 1.14  2003/05/16 21:36:58  knoaman
 * Memory manager implementation: Modify constructors to pass in the memory manager.
 *
 * Revision 1.13  2003/05/15 18:26:29  knoaman
 * Partial implementation of the configurable memory manager.
 *
 * Revision 1.12  2003/01/27 16:50:27  knoaman
 * some cleanup.
 *
 * Revision 1.11  2002/12/20 22:09:56  tng
 * XML 1.1
 *
 * Revision 1.10  2002/12/11 22:09:08  knoaman
 * Performance: reduce instructions count.
 *
 * Revision 1.9  2002/12/03 15:31:19  knoaman
 * Enable/disable calculation of src offset.
 *
 * Revision 1.8  2002/12/02 17:20:05  knoaman
 * Remove unused data member.
 *
 * Revision 1.7  2002/11/28 19:19:12  knoaman
 * Performance: remove unnecessary if condition.
 *
 * Revision 1.6  2002/11/28 18:17:22  knoaman
 * Performance: make getNextChar/peekNextChar inline.
 *
 * Revision 1.5  2002/11/25 21:31:08  tng
 * Performance:
 * 1. use XMLRecognizer::Encodings enum to make new transcode, faster than
 *    comparing the encoding string every time.
 * 2. Pre uppercase the encodingString before calling encodingForName to
 *    avoid calling compareIString
 *
 * Revision 1.4  2002/11/04 14:58:19  tng
 * C++ Namespace Support.
 *
 * Revision 1.3  2002/09/27 12:56:23  tng
 * [Bug 12740] Extra include.  By Peter Volchek.
 *
 * Revision 1.2  2002/05/27 18:42:14  tng
 * To get ready for 64 bit large file, use XMLSSize_t to represent line and column number.
 *
 * Revision 1.1.1.1  2002/02/01 22:22:02  peiyongz
 * sane_include
 *
 * Revision 1.18  2001/12/06 17:47:04  tng
 * Performance Enhancement.  Modify the handling of the fNEL option so that
 * it results in fgCharCharsTable being modified, instead of having all of
 * the low-level routines check the option.  This seemed acceptable because
 * the code appears to only permit the option to be turned on and not turned
 * off again.  By Henry Zongaro.
 *
 * Revision 1.17  2001/07/12 18:50:13  tng
 * Some performance modification regarding standalone check and xml decl check.
 *
 * Revision 1.16  2001/05/11 13:26:17  tng
 * Copyright update.
 *
 * Revision 1.15  2001/05/03 18:42:51  knoaman
 * Added new option to the parsers so that the NEL (0x85) char can be treated
 * as a newline character.
 *
 * Revision 1.14  2001/01/25 19:16:58  tng
 * const should be used instead of static const.  Fixed by Khaled Noaman.
 *
 * Revision 1.13  2000/07/25 22:33:05  aruna1
 * Char definitions in XMLUni moved to XMLUniDefs
 *
 * Revision 1.12  2000/07/08 00:17:13  andyh
 * Cleanup of yesterday's speedup changes.  Merged new bit into the
 * scanner character properties table.
 *
 * Revision 1.11  2000/07/07 01:08:44  andyh
 * Parser speed up in scan of XML content.
 *
 * Revision 1.10  2000/07/06 21:00:52  jpolast
 * inlined getNextCharIfNot() for better performance
 *
 * Revision 1.9  2000/05/11 23:11:33  andyh
 * Add missing validity checks for stand-alone documents, character range
 * and Well-formed parsed entities.  Changes contributed by Sean MacRoibeaird
 * <sean.Macroibeaird@ireland.sun.com>
 *
 * Revision 1.8  2000/03/02 19:54:29  roddey
 * This checkin includes many changes done while waiting for the
 * 1.1.0 code to be finished. I can't list them all here, but a list is
 * available elsewhere.
 *
 * Revision 1.7  2000/02/24 20:18:07  abagchi
 * Swat for removing Log from API docs
 *
 * Revision 1.6  2000/02/06 07:47:53  rahulj
 * Year 2K copyright swat.
 *
 * Revision 1.5  2000/01/25 01:04:21  roddey
 * Fixes a bogus error about ]]> in char data.
 *
 * Revision 1.4  2000/01/22 00:01:08  roddey
 * Simple change to get rid of two hard coded 'x' type characters, which won't
 * work on EBCDIC systems.
 *
 * Revision 1.3  1999/12/18 00:20:00  roddey
 * More changes to support the new, completely orthagonal, support for
 * intrinsic encodings.
 *
 * Revision 1.2  1999/12/15 19:48:03  roddey
 * Changed to use new split of transcoder interfaces into XML transcoders and
 * LCP transcoders, and implementation of intrinsic transcoders as pluggable
 * transcoders, and addition of Latin1 intrinsic support.
 *
 * Revision 1.1.1.1  1999/11/09 01:08:22  twl
 * Initial checkin
 *
 * Revision 1.3  1999/11/08 20:44:47  rahul
 * Swat for adding in Product name and CVS comment log variable.
 *
 */

#if !defined(XMLREADER_HPP)
#define XMLREADER_HPP

#include <xercesc/util/XMLChar.hpp>
#include <xercesc/framework/XMLRecognizer.hpp>
#include <xercesc/framework/XMLBuffer.hpp>

XERCES_CPP_NAMESPACE_BEGIN

class InputSource;
class BinInputStream;
class ReaderMgr;
class XMLScanner;
class XMLTranscoder;


// ---------------------------------------------------------------------------
//  Instances of this class are used to manage the content of entities. The
//  scanner maintains a stack of these, one for each entity (this means entity
//  in the sense of any parsed file or internal entity) currently being
//  scanned. This class, given a binary input stream, will handle reading in
//  the data and decoding it from its external encoding into the internal
//  Unicode format. Once internalized, this class provides the access
//  methods to read in the data in various ways, maintains line and column
//  information, and provides high performance character attribute checking
//  methods.
//
//  This is NOT to be derived from.
//
// ---------------------------------------------------------------------------
class XMLPARSER_EXPORT XMLReader : public XMemory
{
public:
    // -----------------------------------------------------------------------
    //  Public types
    // -----------------------------------------------------------------------
    enum Types
    {
        Type_PE
        , Type_General
    };

    enum Sources
    {
        Source_Internal
        , Source_External
    };

    enum RefFrom
    {
        RefFrom_Literal
        , RefFrom_NonLiteral
    };

    enum XMLVersion
    {
        XMLV1_0
        , XMLV1_1
        , XMLV_Unknown
    };


    // -----------------------------------------------------------------------
    //  Public, query methods
    // -----------------------------------------------------------------------
    bool isAllSpaces
    (
        const   XMLCh* const    toCheck
        , const unsigned int    count
    );

    bool containsWhiteSpace
    (
        const   XMLCh* const    toCheck
        , const unsigned int    count
    );

    bool isXMLLetter(const XMLCh toCheck);
    bool isFirstNameChar(const XMLCh toCheck);
    bool isNameChar(const XMLCh toCheck);
    bool isPlainContentChar(const XMLCh toCheck);
    bool isSpecialStartTagChar(const XMLCh toCheck);
    bool isXMLChar(const XMLCh toCheck);
    bool isWhitespace(const XMLCh toCheck);
    bool isControlChar(const XMLCh toCheck);
    bool isPublicIdChar(const XMLCh toCheck);


    // -----------------------------------------------------------------------
    //  Constructors and Destructor
    // -----------------------------------------------------------------------
    XMLReader
    (
        const   XMLCh* const          pubId
        , const XMLCh* const          sysId
        ,       BinInputStream* const streamToAdopt
        , const RefFrom               from
        , const Types                 type
        , const Sources               source
        , const bool                  throwAtEnd = false
        , const bool                  calculateSrcOfs = true
        , const XMLVersion            xmlVersion = XMLV1_0
        ,       MemoryManager* const  manager = XMLPlatformUtils::fgMemoryManager
    );

    XMLReader
    (
        const   XMLCh* const          pubId
        , const XMLCh* const          sysId
        ,       BinInputStream* const streamToAdopt
        , const XMLCh* const          encodingStr
        , const RefFrom               from
        , const Types                 type
        , const Sources               source
        , const bool                  throwAtEnd = false
        , const bool                  calculateSrcOfs = true
        , const XMLVersion            xmlVersion = XMLV1_0
        ,       MemoryManager* const  manager = XMLPlatformUtils::fgMemoryManager
    );

    XMLReader
    (
        const   XMLCh* const          pubId
        , const XMLCh* const          sysId
        ,       BinInputStream* const streamToAdopt
        , XMLRecognizer::Encodings    encodingEnum
        , const RefFrom               from
        , const Types                 type
        , const Sources               source
        , const bool                  throwAtEnd = false
        , const bool                  calculateSrcOfs = true
        , const XMLVersion            xmlVersion = XMLV1_0
        ,       MemoryManager* const  manager = XMLPlatformUtils::fgMemoryManager
    );

    ~XMLReader();


    // -----------------------------------------------------------------------
    //  Character buffer management methods
    // -----------------------------------------------------------------------
    unsigned long charsLeftInBuffer() const;
    bool refreshCharBuffer();


    // -----------------------------------------------------------------------
    //  Scanning methods
    // -----------------------------------------------------------------------
    bool getName(XMLBuffer& toFill, const bool token);
    bool getNextChar(XMLCh& chGotten);