📄 qxmlstream.cpp
/****************************************************************************
**
** Copyright (C) 1992-2007 Trolltech ASA. All rights reserved.
**
** This file is part of the QtXML module of the Qt Toolkit.
**
** This file may be used under the terms of the GNU General Public
** License version 2.0 as published by the Free Software Foundation
** and appearing in the file LICENSE.GPL included in the packaging of
** this file. Please review the following information to ensure GNU
** General Public Licensing requirements will be met:
** http://trolltech.com/products/qt/licenses/licensing/opensource/
**
** If you are unsure which license is appropriate for your use, please
** review the following information:
** http://trolltech.com/products/qt/licenses/licensing/licensingoverview
** or contact the sales department at sales@trolltech.com.
**
** In addition, as a special exception, Trolltech gives you certain
** additional rights. These rights are described in the Trolltech GPL
** Exception version 1.0, which can be found at
** http://www.trolltech.com/products/qt/gplexception/ and in the file
** GPL_EXCEPTION.txt in this package.
**
** In addition, as a special exception, Trolltech, as the sole copyright
** holder for Qt Designer, grants users of the Qt/Eclipse Integration
** plug-in the right for the Qt/Eclipse Integration to link to
** functionality provided by Qt Designer and its related libraries.
**
** Trolltech reserves all rights not expressly granted herein.
**
** This file is provided AS IS with NO WARRANTY OF ANY KIND, INCLUDING THE
** WARRANTY OF DESIGN, MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
**
****************************************************************************/

#include "qxmlstream.h"
#include "qxmlutils_p.h"
#include <qdebug.h>
#include <QFile>
#include <stdio.h>
#include <qtextcodec.h>
#include <qstack.h>
#include <qbuffer.h>
#include "qxmlstream_p.h"

/*!
    \enum QXmlStreamReader::TokenType

    This enum specifies the type of token the reader just read.
    \value NoToken The reader has not yet read anything.

    \value Invalid An error has occurred, reported in error() and
    errorString().

    \value StartDocument The reader reports the start of the document.
    If the document is declared standalone, isStandaloneDocument()
    returns true; otherwise it returns false.

    \value EndDocument The reader reports the end of the document.

    \value StartElement The reader reports the start of an element
    with namespaceUri() and name(). Empty elements are also reported
    as StartElement, followed directly by EndElement. The convenience
    function readElementText() can be called to concatenate all
    content until the corresponding EndElement. Attributes are
    reported in attributes(), namespace declarations in
    namespaceDeclarations().

    \value EndElement The reader reports the end of an element with
    namespaceUri() and name().

    \value Characters The reader reports characters in text(). If the
    characters are all white-space, isWhitespace() returns true. If
    the characters stem from a CDATA section, isCDATA() returns true.

    \value Comment The reader reports a comment in text().

    \value DTD The reader reports a DTD in text(), notation
    declarations in notationDeclarations().

    \value EntityReference The reader reports an entity reference that
    could not be resolved. The name of the reference is reported in
    name(), the replacement text in text().

    \value ProcessingInstruction The reader reports a processing
    instruction in processingInstructionTarget() and
    processingInstructionData().
*/

/*!
    \enum QXmlStreamReader::Error

    This enum specifies different error cases.

    \value NoError No error has occurred.

    \value CustomError A custom error has been raised with
    raiseError().

    \value NotWellFormedError The parser internally raised an error
    due to the read XML not being well-formed.

    \value PrematureEndOfDocumentError The input stream ended before
    the document was parsed completely. This error can be recovered
    from.

    \value UnexpectedElementError The parser encountered an element
    that was different from those it expected.
*/

/*!
    \class QXmlStreamReader
    \reentrant
    \since 4.3

    \brief The QXmlStreamReader class provides a fast well-formed XML
    parser with a simple streaming API.

    \module XML
    \mainclass
    \ingroup xml-tools

    QXmlStreamReader is a faster and more convenient replacement for
    Qt's own SAX parser (see QXmlSimpleReader), and in some cases also
    for applications that would previously use a DOM tree (see
    QDomDocument). QXmlStreamReader reads data either from a QIODevice
    (see setDevice()), or from a raw QByteArray (see addData()). With
    QXmlStreamWriter, Qt provides a related class for writing XML.

    The basic concept of a stream reader is to report an XML document
    as a stream of tokens, similar to SAX. The main difference between
    QXmlStreamReader and SAX is \e how these XML tokens are reported.
    With SAX, the application must provide handlers that receive
    so-called XML \e events from the parser at the parser's
    convenience. With QXmlStreamReader, the application code itself
    drives the loop and pulls \e tokens from the reader one after
    another as it needs them. This is done by calling readNext(),
    which makes the reader read from the input stream until it has
    completed a new token, and then returns its tokenType(). A set of
    convenience functions such as isStartElement() and text() then
    lets you examine this token and obtain information about what has
    been read. The big advantage of the pulling approach is the
    possibility to build recursive descent parsers, meaning you can
    split your XML parsing code easily into different methods or
    classes. This makes it easy to keep track of the application's own
    state when parsing XML.

    A typical loop with QXmlStreamReader looks like this:

    \code
    QXmlStreamReader xml;
    ...
    while (!xml.atEnd()) {
        xml.readNext();
        ... // do processing
    }
    if (xml.hasError()) {
        ... // do error handling
    }
    \endcode

    QXmlStreamReader is a well-formed XML 1.0 parser that does \e not
    include external parsed entities. As long as no error occurs, the
    application code can thus be assured that the data provided by the
    stream reader satisfies the W3C's criteria for well-formed XML.
    For example, you can be certain that all tags are indeed nested
    and closed properly, that references to internal entities have
    been replaced with the correct replacement text, and that
    attributes have been normalized or added according to the internal
    subset of the DTD.

    If an error does occur while parsing, atEnd() returns true and
    error() returns the kind of error that occurred. hasError() can
    also be used to check whether an error has occurred. The functions
    errorString(), lineNumber(), columnNumber(), and characterOffset()
    make it possible to generate a verbose human-understandable error
    or warning message. In order to simplify application code,
    QXmlStreamReader contains a raiseError() mechanism that makes it
    possible to raise custom errors that then trigger the same error
    handling code path.

    The \l{QXmlStream Bookmarks Example} illustrates how to use the
    recursive descent technique with a subclassed stream reader to
    read an XML bookmark file (XBEL).

    \section1 Namespaces

    QXmlStreamReader understands and resolves XML namespaces. For
    example, in the case of a StartElement, namespaceUri() returns the
    namespace the element is in, and name() returns the element's
    \e local name. The combination of namespaceUri and name uniquely
    identifies an element. If a namespace prefix was not declared in
    the XML entities parsed by the reader, the namespaceUri is empty.

    If you parse XML data that does not utilize namespaces according
    to the XML specification or doesn't use namespaces at all, you can
    use the element's qualifiedName() instead. A qualified name is the
    element's \e prefix followed by a colon followed by the element's
    local name - exactly like the element appears in the raw XML data.
    Since the mapping from namespaceUri to prefix is neither unique
    nor universal, qualifiedName() should be avoided for
    namespace-compliant XML data.

    In order to parse standalone documents that do use undeclared
    namespace prefixes, you can turn off namespace processing
    completely with the \l namespaceProcessing property.

    \section1 Incremental parsing

    QXmlStreamReader is an incremental parser. If you can't parse the
    entire input in one go (for example, it is huge, or is being
    delivered over a network connection), data can be fed to the
    parser in pieces. If the reader runs out of data before the
    document has been parsed completely, it reports a
    PrematureEndOfDocumentError. Once more data has arrived, either
    through the device or because it has been added with addData(), it
    recovers from that error and continues parsing on the next call to
    readNext().

    For example, if you read data from the network using QHttp, you
    would connect its \l{QHttp::readyRead()}{readyRead()} signal to a
    custom slot. In this slot, you read all available data with
    \l{QHttp::readAll()}{readAll()} and pass it to the XML stream
    reader using addData(). Then you call your custom parsing function
    that reads the XML events from the reader.

    \section1 Performance and memory consumption

    QXmlStreamReader is memory-conservative by design, since it
    doesn't store the entire XML document tree in memory, but only the
    current token at the time it is reported. In addition,
    QXmlStreamReader avoids the many small string allocations that it
    normally takes to map an XML document to a convenient and Qt-ish
    API. It does this by reporting all string data as QStringRef
    rather than real QString objects. QStringRef is a thin wrapper
    around QString substrings that provides a subset of the QString
    API without the memory allocation and reference-counting overhead.
    Calling \l{QStringRef::toString()}{toString()} on any of those
    objects returns an equivalent real QString object.
*/

/*!
    Constructs a stream reader.

    \sa setDevice(), addData()
 */
QXmlStreamReader::QXmlStreamReader()
    : d_ptr(new QXmlStreamReaderPrivate(this))
{
}

/*!
    Creates a new stream reader that reads from \a device.

    \sa setDevice(), clear()
 */
QXmlStreamReader::QXmlStreamReader(QIODevice *device)
    : d_ptr(new QXmlStreamReaderPrivate(this))
{
    setDevice(device);
}

/*!
    Creates a new stream reader that reads from \a data.

    \sa addData(), clear(), setDevice()
 */
QXmlStreamReader::QXmlStreamReader(const QByteArray &data)
    : d_ptr(new QXmlStreamReaderPrivate(this))
{
    Q_D(QXmlStreamReader);
    d->dataBuffer = data;
}

/*!
    Creates a new stream reader that reads from \a data.

    \sa addData(), clear(), setDevice()
 */
QXmlStreamReader::QXmlStreamReader(const QString &data)
    : d_ptr(new QXmlStreamReaderPrivate(this))
{
    Q_D(QXmlStreamReader);
#ifdef QT_NO_TEXTCODEC
    d->dataBuffer = data.toLatin1();
#else
    d->dataBuffer = d->codec->fromUnicode(data);
    d->decoder = d->codec->makeDecoder();
#endif
    d->lockEncoding = true;
}

/*!
    Creates a new stream reader that reads from \a data.

    \sa addData(), clear(), setDevice()
 */
QXmlStreamReader::QXmlStreamReader(const char *data)
    : d_ptr(new QXmlStreamReaderPrivate(this))
{
    Q_D(QXmlStreamReader);
    d->dataBuffer = QByteArray(data);
}

/*!
    Destructs the reader.
 */
QXmlStreamReader::~QXmlStreamReader()
{
    Q_D(QXmlStreamReader);
    if (d->deleteDevice)
        delete d->device;
    delete d;
}

/*! \fn bool QXmlStreamReader::hasError() const
    Returns \c true if an error has occurred, otherwise \c false.

    \sa errorString(), error()
 */

/*!
    Sets the current device to \a device. Setting the device resets
    the stream to its initial state.

    \sa device(), clear()
*/
void QXmlStreamReader::setDevice(QIODevice *device)
{
    Q_D(QXmlStreamReader);
    if (d->deleteDevice) {
        delete d->device;
        d->deleteDevice = false;
    }
    d->device = device;
    d->init();
}

/*!
    Returns the current device associated with the QXmlStreamReader,
    or 0 if no device has been assigned.

    \sa setDevice()
*/
QIODevice *QXmlStreamReader::device() const
{
    Q_D(const QXmlStreamReader);
    return d->device;
}

/*!
    Adds more \a data for the reader to read. This function does
    nothing if the reader has a device().

    \sa clear()
 */
void QXmlStreamReader::addData(const QByteArray &data)
{
    Q_D(QXmlStreamReader);
    if (d->device) {
        qWarning("QXmlStreamReader: addData() with device()");
        return;
    }
    d->dataBuffer += data;
}

/*! \overload

    Adds more \a data for the reader to read. This function does
    nothing if the reader has a device().

    \sa clear()
 */
void QXmlStreamReader::addData(const QString &data)
{
    Q_D(QXmlStreamReader);
    d->lockEncoding = true;
#ifdef QT_NO_TEXTCODEC
    addData(data.toLatin1());
#else
    addData(d->codec->fromUnicode(data));
#endif
}

/*! \overload

    Adds more \a data for the reader to read. This function does
    nothing if the reader has a device().

    \sa clear()
 */
void QXmlStreamReader::addData(const char *data)
{
    addData(QByteArray(data));
}

/*!
    Removes any device() or data from the reader, and resets its state
    to the initial state.

    \sa addData()
 */
void QXmlStreamReader::clear()
{
    Q_D(QXmlStreamReader);
    d->init();
    if (d->device) {
        if (d->deleteDevice)
            delete d->device;
        d->device = 0;
    }
}

/*!
    Returns true if the reader has read until the end of the XML
    document, or an error has occurred and reading has been aborted;
    otherwise returns false.

    If reading has been aborted with a PrematureEndOfDocumentError
    because the device no longer delivered data, atEnd() will return
    false once more data has arrived.

    \sa device(), QIODevice::atEnd()
 */
bool QXmlStreamReader::atEnd() const
{
    Q_D(const QXmlStreamReader);
    if (d->atEnd
        && ((d->type == QXmlStreamReader::Invalid && d->error == PrematureEndOfDocumentError)
            || (d->type == QXmlStreamReader::EndDocument))) {
        if (d->device)
            return d->device->atEnd();
        else
            return !d->dataBuffer.size();
    }
    return (d->atEnd || d->type == QXmlStreamReader::Invalid);
}

/*!
    Reads the next token and returns its type.

    If an error() has been reported, reading is no longer possible. In
    this case, atEnd() always returns true, and this function will do
    nothing but return Invalid.

    The one exception to this rule is an error of type
    PrematureEndOfDocumentError. Subsequent calls to atEnd() and
    readNext() will resume from this error type and try to read from
    the device again. This iterative parsing approach makes sense if
    you can't or don't want to read the entire data in one go, for
    example, if it is