📄 qstring.cpp
/****************************************************************************
**
** Copyright (C) 1992-2006 Trolltech ASA. All rights reserved.
**
** This file is part of the QtCore module of the Qt Toolkit.
**
** This file may be used under the terms of the GNU General Public
** License version 2.0 as published by the Free Software Foundation
** and appearing in the file LICENSE.GPL included in the packaging of
** this file. Please review the following information to ensure GNU
** General Public Licensing requirements will be met:
** http://www.trolltech.com/products/qt/opensource.html
**
** If you are unsure which license is appropriate for your use, please
** review the following information:
** http://www.trolltech.com/products/qt/licensing.html or contact the
** sales department at sales@trolltech.com.
**
** This file is provided AS IS with NO WARRANTY OF ANY KIND, INCLUDING THE
** WARRANTY OF DESIGN, MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
**
****************************************************************************/

#include "qstringlist.h"
#include "qregexp.h"
#include "qunicodetables_p.h"
#ifndef QT_NO_TEXTCODEC
#include <qtextcodec.h>
#endif
#include <qdatastream.h>
#include <qlist.h>
#include "qlocale.h"
#include "qlocale_p.h"
#include "qstringmatcher.h"
#include "qtools_p.h"
#include "qhash.h"
#include "qdebug.h"

#include <limits.h>
#include <string.h>
#include <stdlib.h>
#include <stdio.h>
#include <stdarg.h>

#ifndef QT_NO_STL
# if defined (Q_CC_GNU) && (__GNUC__ - 0 >= 3)
#  include <string>
# endif
#endif

#ifdef truncate
#undef truncate
#endif

#ifndef LLONG_MAX
#define LLONG_MAX qint64_C(9223372036854775807)
#endif
#ifndef LLONG_MIN
#define LLONG_MIN (-LLONG_MAX - qint64_C(1))
#endif
#ifndef ULLONG_MAX
#define ULLONG_MAX quint64_C(18446744073709551615)
#endif

#ifndef QT_NO_TEXTCODEC
QTextCodec *QString::codecForCStrings;
#endif

#ifdef QT3_SUPPORT
static QHash<void *, QByteArray> *asciiCache = 0;
#endif

static int ucstrcmp(const QString &as, const QString &bs)
{
    const QChar *a = as.unicode();
    const QChar *b = bs.unicode();
    if (a == b)
        return 0;
    if (a == 0)
        return 1;
    if (b == 0)
        return -1;
    int l=qMin(as.length(),bs.length());
    while (l-- && *a == *b)
        a++,b++;
    if (l==-1)
        return (as.length()-bs.length());
    return a->unicode() - b->unicode();
}

static int ucstrncmp(const QChar *a, const QChar *b, int l)
{
    while (l-- && *a == *b)
        a++,b++;
    if (l==-1)
        return 0;
    return a->unicode() - b->unicode();
}

static int ucstrnicmp(const QChar *a, const QChar *b, int l)
{
    while (l-- && ::lower(*a) == ::lower(*b))
        a++,b++;
    if (l==-1)
        return 0;
    return ::lower(*a).unicode() - ::lower(*b).unicode();
}

inline bool qIsUpper(char ch)
{
    return ch >= 'A' && ch <= 'Z';
}

inline bool qIsDigit(char ch)
{
    return ch >= '0' && ch <= '9';
}

inline char qToLower(char ch)
{
    if (ch >= 'A' && ch <= 'Z')
        return ch - 'A' + 'a';
    else
        return ch;
}

const QString::Null QString::null = QString::Null();

/*!
    \class QCharRef
    \reentrant
    \brief The QCharRef class is a helper class for QString.

    \internal

    \ingroup text

    When you get an object of type QCharRef, if you can assign to it,
    the assignment will apply to the character in the string from
    which you got the reference. That is its whole purpose in life.
    The QCharRef becomes invalid once modifications are made to the
    string: if you want to keep the character, copy it into a QChar.

    Most of the QChar member functions also exist in QCharRef.
    However, they are not explicitly documented here.

    \sa QString::operator[]() QString::at() QChar
*/

/*!
    \class QString
    \reentrant

    \brief The QString class provides a Unicode character string.

    \ingroup tools
    \ingroup shared
    \ingroup text
    \mainclass
    \reentrant

    QString stores a string of 16-bit \l{QChar}s, where each QChar
    stores one Unicode 4.0 character. \l{Unicode} is an international
    standard that supports most of the writing systems in use today.
    It is a superset of ASCII and Latin-1 (ISO 8859-1), and all the
    ASCII/Latin-1 characters are available at the same code positions.

    Behind the scenes, QString uses \l{implicit sharing}
    (copy-on-write) to reduce memory usage and to avoid the needless
    copying of data. This also helps reduce the inherent overhead of
    storing 16-bit characters instead of 8-bit characters.

    In addition to QString, Qt also provides the QByteArray class to
    store raw bytes and traditional 8-bit '\\0'-terminated strings.
    For most purposes, QString is the class you want to use. It is
    used throughout the Qt API, and the Unicode support ensures that
    your applications will be easy to translate if you want to expand
    your application's market at some point. The two main cases where
    QByteArray is appropriate are when you need to store raw binary
    data, and when memory conservation is critical (e.g. with Qtopia
    Core).

    One way to initialize a QString is simply to pass a \c{const char
    *} to its constructor. For example, the following code creates a
    QString of size 5 containing the data "Hello":

    \code
        QString str = "Hello";
    \endcode

    QString converts the \c{const char *} data into Unicode using
    fromAscii(). By default, fromAscii() treats characters above 128
    as Latin-1 characters, but this can be changed by calling
    QTextCodec::setCodecForCStrings().

    In all of the QString methods that take \c{const char *}
    parameters, the \c{const char *} is interpreted as a classic
    C-style '\\0'-terminated string. It is legal for the \c{const
    char *} parameter to be 0.

    You can also provide string data as an array of \l{QChar}s:

    \code
        static const QChar data[4] = { 0x0055, 0x006e, 0x10e3, 0x03a3 };
        QString str(data, 4);
    \endcode

    QString makes a deep copy of the QChar data, so you can modify it
    later without experiencing side effects. (If for performance
    reasons you don't want to take a deep copy of the character data,
    use QString::fromRawData() instead.)

    Another approach is to set the size of the string using resize()
    and to initialize the data character per character. QString uses
    0-based indexes, just like C++ arrays. To access the character at
    a particular index position, you can use operator[](). On
    non-const strings, operator[]() returns a reference to a
    character that can be used on the left side of an assignment. For
    example:

    \code
        QString str;
        str.resize(4);
        str[0] = QChar('U');
        str[1] = QChar('n');
        str[2] = QChar(0x10e3);
        str[3] = QChar(0x03a3);
    \endcode

    For read-only access, an alternative syntax is to use at():

    \code
        for (int i = 0; i < str.size(); ++i) {
            if (str.at(i) >= QChar('a') && str.at(i) <= QChar('f'))
                qDebug() << "Found character in range [a-f]";
        }
    \endcode

    at() can be faster than operator[](), because it never causes a
    \l{deep copy} to occur.

    To extract several characters at a time, use left(), right(), or
    mid().

    A QString can embed '\\0' characters (QChar::null). The size()
    function always returns the size of the whole string, including
    embedded '\\0' characters.

    After a call to resize(), newly allocated characters have
    undefined values. To set all the characters in the string to a
    particular value, call fill().

    QString provides dozens of overloads designed to simplify string
    usage. For example, if you want to compare a QString with a string
    literal, you can write code like this and it will work as expected:

    \code
        if (str == "auto" || str == "extern"
                || str == "static" || str == "register") {
            ...
        }
    \endcode

    You can also pass string literals to functions that take QStrings
    and the QString(const char *) constructor will be invoked.
    Similarly, you can pass a QString to a function that takes a
    \c{const char *} using the \l qPrintable() macro which returns
    the given QString as a \c{const char *}. This is equivalent to
    calling <QString>.toAscii().constData().

    QString provides the following basic functions for modifying the
    character data: append(), prepend(), insert(), replace(), and
    remove().
    For example:

    \code
        QString str = "and";
        str.prepend("rock ");     // str == "rock and"
        str.append(" roll");      // str == "rock and roll"
        str.replace(5, 3, "&");   // str == "rock & roll"
    \endcode

    The replace() and remove() functions' first two arguments are the
    position from which to start erasing and the number of characters
    that should be erased.

    A frequent requirement is to remove whitespace characters from a
    string ('\\n', '\\t', ' ', etc.). If you want to remove whitespace
    from both ends of a QString, use trimmed(). If you want to remove
    whitespace from both ends and replace multiple consecutive
    whitespaces with a single space character within the string, use
    simplified().

    If you want to find all occurrences of a particular character or
    substring in a QString, use indexOf() or lastIndexOf(). The
    former searches forward starting from a given index position, the
    latter searches backward. Both return the index position of the
    character or substring if they find it; otherwise, they return -1.
    For example, here's a typical loop that finds all occurrences of a
    particular substring:

    \code
        QString str = "We must be <b>bold</b>, very <b>bold</b>";
        int j = 0;
        while ((j = str.indexOf("<b>", j)) != -1) {
            qDebug() << "Found <b> tag at index position" << j;
            ++j;
        }
    \endcode

    If you want to see if a QString starts or ends with a particular
    substring use startsWith() or endsWith(). If you simply want to
    check whether a QString contains a particular character or
    substring, use contains(). If you want to find out how many times
    a particular character or substring occurs in the string, use
    count().

    QString provides many functions for converting numbers into
    strings and strings into numbers. See the arg() functions, the
    setNum() functions, the number() static functions, and the
    toInt(), toDouble(), and similar functions.

    To get an upper- or lowercase version of a string use toUpper() or
    toLower().

    If you want to replace all occurrences of a particular substring
    with another, use one of the two-parameter replace() overloads.

    QStrings can be compared using overloaded operators such as
    operator<(), operator<=(), operator==(), operator>=(), and so on.
    The comparison is based exclusively on the numeric Unicode values
    of the characters and is very fast, but is not what a human would
    expect. QString::localeAwareCompare() is a better choice for
    sorting user-interface strings.

    Lists of strings are handled by the QStringList class. You can
    split a string into a list of strings using split(), and join a
    list of strings into a single string with an optional separator
    using QStringList::join(). You can obtain a list of strings from
    a string list that contain a particular substring or that match a
    particular QRegExp using QStringList::filter().

    If you are building a QString gradually and know in advance
    approximately how many characters the QString will contain, you
    can call reserve(), asking QString to preallocate a certain amount
    of memory. You can also call capacity() to find out how much
    memory QString actually allocated.

    To obtain a pointer to the actual character data, call data() or
    constData(). These functions return a pointer to the beginning of
    the QChar data. The pointer is guaranteed to remain valid until a
    non-const function is called on the QString.

    \section1 Conversions between 8-bit strings and Unicode strings

    QString provides the following four functions that return a
    \c{const char *} version of the string as QByteArray: toAscii(),
    toLatin1(), toUtf8(), and toLocal8Bit().

    \list
    \i toAscii() returns an ASCII encoded 8-bit string.
    \i toLatin1() returns a Latin-1 (ISO 8859-1) encoded 8-bit string.
    \i toUtf8() returns a UTF-8 encoded 8-bit string. UTF-8 is a
       superset of ASCII that supports the entire Unicode character
       set through multibyte sequences.
    \i toLocal8Bit() returns an 8-bit string using the system's local
       encoding.
    \endlist

    To convert from one of these encodings, QString provides
    fromAscii(), fromLatin1(), fromUtf8(), and fromLocal8Bit(). Other
    encodings are supported through QTextCodec.

    As mentioned above, QString provides a lot of functions and
    operators that make it easy to interoperate with \c{const char *}
    strings. This functionality is a double-edged sword: It makes
    QString more convenient to use if all strings are ASCII or
    Latin-1, but there is always the risk that an implicit conversion
    from or to \c{const char *} is done using the wrong 8-bit
    encoding. To minimize these risks, you can turn off these implicit
    conversions by defining these two preprocessor symbols:

    \list
    \i \c QT_NO_CAST_FROM_ASCII disables automatic conversions from
       ASCII to Unicode.
    \i \c QT_NO_CAST_TO_ASCII disables automatic conversion from QString
       to ASCII.
    \endlist

    One way to define these preprocessor symbols globally for your
    application is to add the following entry to your
    \l{qmake Project Files}{qmake project file}:

    \code
        DEFINES += QT_NO_CAST_FROM_ASCII \
                   QT_NO_CAST_TO_ASCII
    \endcode

    You then need to explicitly call fromAscii(), fromLatin1(),
    fromUtf8(), or fromLocal8Bit() to construct a QString from an
    8-bit string, or use the lightweight QLatin1String class, for
    example:

    \code
        QString url = QLatin1String("http://www.unicode.org/");
    \endcode

    Similarly, you must call toAscii(), toLatin1(), toUtf8(), or
    toLocal8Bit() explicitly to convert the QString to an 8-bit
    string. (Other encodings are supported through QTextCodec.)

    \section1 Note for C programmers

    Due to C++'s type system and the fact that QString is
    \l{implicitly shared}, QStrings may be treated like \c{int}s or
    other basic types. For example:

    \code
        QString boolToString(bool b)
        {
            QString result;
            if (b)
                result = "True";
            else
                result = "False";
            return result;
        }
    \endcode

    The variable, result, is a normal variable allocated on the
    stack. When return is called, because we're returning by value,
    the copy constructor is called and a copy of the string is
    returned. (No actual copying takes place thanks to the implicit
    sharing.)

    \section1 Distinction between null and empty strings

    For historical reasons, QString distinguishes between a null
    string and an empty string. A \e null string is a string that is
    initialized using QString's default constructor or by passing
    (const char *)0 to the constructor. An \e empty string is any
    string with size 0. A null string is always empty, but an empty
    string isn't necessarily null:

    \code
        QString().isNull();             // returns true
        QString().isEmpty();            // returns true