TITLE
    class str

DESCRIPTION
    A simple but highly useful C++ string class

FILES
    str.h       class str definition
    str.cpp     class str implementation

AUTHOR
    David Nugent

CONTACT
    FidoNet 3:632/348, davidn@csource.pronet.com
    PO Box 352, Doveton, VIC, Australia
    Voice +61-3-793-2728

STATUS
    Donated to the public domain, no restrictions on any use

SYNOPSIS
    class str is a simple yet powerful C++ string class. It provides
    many conversions from other base types to strings and a large
    variety of manipulators. This makes it useful as a stand-alone
    string class, for output formatting with iostreams, for cheap
    copying, concatenation and splitting operations, and as a general
    purpose class for data presentation and line-based parsing.

GENERAL STRUCTURE
    class str is designed to be small, which makes it a practical data
    type that can be used in large arrays even on small memory systems.
    A typical implementation is smaller again if "VIRTUAL_DESTRUCTOR"
    is not enabled: the destructor str::~str() is then not declared
    virtual, so no vtable and vtable pointer are generated for the
    class. Without a vtable pointer, an object of class str is, on most
    implementations, the same size as a data pointer, i.e. as cheap and
    small as a char*.

    The disadvantage of a non-virtual destructor is that it places some
    limitations on the use of derived classes. These are not severe,
    and the main benefits of code reuse remain available. If a derived
    class allocates resources, however, take care to avoid upcasts
    wherever possible, so that the correct destructor is called.

    These limitations are circumvented by defining VIRTUAL_DESTRUCTOR
    to any non-zero value, with the disadvantage that an object of
    class str will (usually) be twice as big.

  Reference String
    Class str itself contains a single data member: a pointer to an
    internal "reference string".
    Reference strings (embodied in class refstr) contain the actual
    string data and provide a mechanism for cheap copy and assignment.
    Instead of copying the data each time, more than one str object is
    allowed to reference the same physical data. Copying is delayed
    until one of the str objects is modified, at which time a new
    refstr object is created and copied from the old, and only that
    copy is changed. In many situations the copy is never modified, so
    physically copying the string data never becomes necessary, saving
    both execution time and memory.

    The following example shows how three str objects come to share the
    same reference string:

        str string1("This is a string");    // const char * __ctor
        str string2(string1);               // str const & __ctor
        str string3 = string1;              // str const & __ctor

    At this point, the relationship of these three objects and the
    internal refstr is:

        string1
         refstr --.              refs   = 3
        string2    \             length = 16
         refstr ------ refstr1   size   = 32
        string3    /             data   = This is a string\0.....\0
         refstr --'

    The reference container object holds a count of the number of str
    wrappers that reference it. Any attempt to modify a refstr through
    a str object while the reference count exceeds 1 causes the refstr
    to be copied first. The original is left untouched apart from its
    reference count being decremented, and the modification is made
    only on the copy, which has a reference count of 1.

    If, for example, string3 were modified by concatenating the string
    " 3" to it, the diagram would then look like this (each refstr
    shown as refs/length/size/data):

        string1
         refstr --.
        string2    -- refstr1   2/16/32/This is a string\0.....\0
         refstr --'
        string3
         refstr ----- refstr1   1/18/32/This is a string 3\0...\0

    Reference strings are variable sized objects. Size variations of
    refstr objects are handled via a placement operator new, which
    causes an additional amount of memory to be allocated for the
    string data itself.
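    As a rough illustration of the copy-on-write scheme described
    above, here is a minimal sketch. It is not the code in str.cpp:
    CowStr and Rep are hypothetical names, std::string stands in for
    the character buffer, and the variable-sized layout obtained via
    placement operator new is not reproduced.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for refstr: shared data plus a reference count.
struct Rep {
    int refs;
    std::string data;
};

// Hypothetical stand-in for str: holds nothing but a pointer to a Rep.
class CowStr {
public:
    explicit CowStr(const char* s) : rep_(new Rep{1, s}) {}

    // Copying shares the Rep and bumps its count; no character data is copied.
    CowStr(const CowStr& other) : rep_(other.rep_) { ++rep_->refs; }

    CowStr& operator=(const CowStr& other) {
        if (rep_ != other.rep_) {
            release();
            rep_ = other.rep_;
            ++rep_->refs;
        }
        return *this;
    }

    ~CowStr() { release(); }

    // Any modification first makes sure this object owns its Rep exclusively.
    CowStr& append(const char* s) {
        detach();
        rep_->data += s;
        return *this;
    }

    int refs() const { return rep_->refs; }
    std::size_t length() const { return rep_->data.size(); }
    bool shares_with(const CowStr& other) const { return rep_ == other.rep_; }

private:
    // Drop one reference; free the Rep when the last reference goes away.
    void release() {
        if (--rep_->refs == 0) delete rep_;
    }

    // If the Rep is shared, copy it and leave the original with one less reference.
    void detach() {
        if (rep_->refs > 1) {
            Rep* copy = new Rep{1, rep_->data};
            --rep_->refs;
            rep_ = copy;
        }
    }

    Rep* rep_;
};
```

    Constructing three CowStr objects from the same source leaves one
    Rep with a count of 3. Appending " 3" to the third detaches it, so
    the original keeps a count of 2 and a length of 16 while the copy
    has a count of 1 and a length of 18, mirroring the diagrams above.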
    Members of class refstr (refs, length, size, flags) sit in memory
    contiguous with the data itself, which is appended to the end.
    Reference to and calculation of offsets of the data is handled by
    wrapper functions in class refstr.

    The use of reference strings provides a data type which is almost
    always cheaper than using references for parameter passing. For
    example, given two functions:

        void hello1(str x);
        void hello2(str & x);

        str mystring("Hello world");
        hello1(mystring);
        hello2(mystring);

    The actual call to hello2() is slightly cheaper, since it involves
    passing a reference only, but the code within hello1() does not
    have to deal with the additional indirection. Also, the parameter
    passed to hello1() may be freely modified without affecting the
    original string, even though both variables (the original and the
    copy passed on the stack) initially reference the same physical
    reference string.

  Pre-allocated size
    By default, strings have a pre-allocated internal size of at least
    STDLEN, which is defined in str.cpp, the implementation file. This
    can be overridden as desired using the additional optional
    parameter accepted by most of the class constructors. As shipped,
    the default size of the memory allocated for the data in a refstr
    object, unless overridden by the constructor, is 32 bytes. This
    amount is grown automatically to accommodate insertions and
    concatenations (see the internal function _chksize() for details).
    Preallocating memory makes string manipulation more efficient by
    avoiding reallocation for trivial modifications.

  Conversions
    To accommodate conversion of a str object to a C style 'string' (a
    variable length char array with a NUL terminator), the data area of
    class refstr is always kept at least one byte larger than the
    amount required to contain the exact string length.
    Reserving this byte up front means the cost of adding a NUL
    terminator on demand is fixed, rather than depending on whether the
    refstr object must first grow to make room for it. Note that
    although this additional byte is maintained, the string is NOT
    NECESSARILY NUL TERMINATED, which is exactly why the c_str() member
    ensures that it is.

    Handling the data in this way, and maintaining a separate length
    variable in class refstr, removes the need to scan the string
    repeatedly to determine its length, as is typical of a lot of C and
    C++ code which uses char*'s. The member function that returns the
    string length is a very cheap operation.

  Conversion to char const *
    Class str provides no automatic type conversion operator to
    char const *, although such operators are a common feature of many
    string classes. This was considered far too dangerous to implement,
    as it can occasionally cause invalid memory access: modification or
    reading of string data which no longer 'exists'. Instead, this
    functionality was moved to the member c_str(), which must be called
    explicitly and still has a few caveats. See the notes under the
    explanation of c_str() below.

  Maximum string size
    For compactness, the maximum size of a string is fixed at 32K, even
    on 32-bit systems. This is a deliberate design decision that
    matches the intended use of the class. Manipulation of larger
    buffers is best done with classes designed for the purpose; the
    algorithms in class str are not at all optimised for large text
    buffer manipulation.

  Binary strings
    Because class str is not dependent on a terminating NUL, it can be
    used for manipulation of binary strings.
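    The effect of keeping an explicit length instead of relying on a
    terminating NUL can be sketched with std::string, used here purely
    as a stand-in for class str (it is not part of this package).
    make_binary() and c_length() are hypothetical helper names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Build a 5-byte binary string with a NUL in the middle.
// The explicit count keeps the constructor from stopping at the NUL.
inline std::string make_binary() {
    return std::string("ab\0cd", 5);
}

// Length as seen by C string functions, which stop at the first NUL.
inline std::size_t c_length(const std::string& s) {
    return std::strlen(s.c_str());
}
```

    The stored length of make_binary() is 5, while c_length() reports
    only 2, because scanning stops at the embedded NUL.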
    Note, however, that treating the result of c_str() as a C string
    negates this advantage if the data contains a NUL, since C string
    functions stop at the first NUL. A pointer to the string data may
    still be obtained with c_str() or, since a terminating NUL is not
    needed, with c_ptr(), which is slightly cheaper.

DESCRIPTION OF MEMBERS

  PUBLIC INTERFACE
    Class str was written primarily for practical use in the area where
    strings are most often used in other languages: data presentation.
    After all, a string is not a computational entity, but (usually)
    one which contains data that is manipulated and presented in some
    way to a human, or is at least readable by machine or human.

    Consequently, the emphasis of the members included in the class is
    on direct conversion of built-in types to strings via a large
    number of constructors. For conversion from other classes, it is
    suggested that an "operator str() const" be implemented for the
    class, allowing a string to be created directly from it.

    Formatting output, even with iostreams, is much more easily done
    with class str than with the somewhat clumsier iostream
    manipulators. Padding, filling and justification functions are all
    provided and are easy, intuitive and safe to use, even with
    temporaries. Moreover, str's are well suited to formatting before
    insertion into an ostream. Some manipulators, such as width (via
    setw()), operate only on the next insertion; formatting within a
    string variable ensures that one entire object is inserted in a
    single insertion, and therefore respects the state of the stream.
    Similarly, stream justification, fill and other characteristics do
    not need to be saved, set and restored around insertion operations.

  Constructors
    class str defines a default constructor, providing convenient
    support for allocation of arrays. All other constructors provide
    some form of conversion, including conversion of char*'s, all
    built-in integral types (short, int, long and unsigned versions
    thereof), and all forms of char. Note that char is not handled as
    an integral type.
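    The distinction between character and integral conversions can be
    sketched with two overloads. This is an illustration only:
    to_text() is a hypothetical name, and std::string stands in for
    class str.

```cpp
#include <cassert>
#include <string>

// Integral overload: produces the decimal digits of the value.
inline std::string to_text(int value) {
    return std::to_string(value);
}

// Character overload: produces a one-character string,
// not the numeric code of the character.
inline std::string to_text(char c) {
    return std::string(1, c);
}
```

    With these overloads, to_text('A') yields "A", whereas to_text(65)
    yields "65", which is the behaviour the constructors are described
    as having.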
    Conversion of integral types allows specification of a radix, so
    non-decimal numeric conversions are fully supported.

    str (void);

        Default constructor. Allocates a zero-length string with an
        internal size determined by the pre-allocated length. In almost
        all cases it is more efficient to use one of the initialising
        constructors if possible.

            str mystring;

    str (char const * s, short len =-1);
    str (unsigned char const * s, short len =-1);
    str (signed char const * s, short len =-1);

        These constructors provide conversion from C strings. The
        length parameter allows extraction of only a portion of a
        string (a sub-string). The default value of -1 assumes that the
        source string is NUL terminated.

            str mystring("Hello world!");
            str mystring = "Hello world!";
            str mystring("Hello world!", 5);  // Contains 'Hello' only

    str (int val, int radix =10);
    str (unsigned int val, int radix =10);
    str (short val, int radix =10);
    str (unsigned short val, int radix =10);
    str (long val, int radix =10);
    str (unsigned long val, int radix =10);

        These provide automatic conversion for all integral types with
        a default radix of 10. Negative values of signed types cause
        the resulting string to have a leading '-' sign. For a
        non-decimal radix, no indication of the number's base is
        inserted automatically. If you wish to use "0x" for hexadecimal
        numbers, for example, you will need to use one of the insert()
        members. When using a non-decimal radix, it is highly
        recommended that one of the unsigned converters be used to
        prevent generation of the sign prefix.

            str mystring(10999);          // result '10999'
            str mystring(255,16);         // result 'ff'
            str mystring = -14587;        // result '-14587'
            str mystring = (unsigned)-1;  // implementation dependent

    str (char c);
    str (unsigned char c);
    str (signed char c);

        These constructors are not integral conversions but convert a
        single character into a string with a length of 1.
            str mystring('g');            // 'g'
            str mystring = 'k';           // 'k'

    str (str const & s);

        This is the copy constructor. Note that this causes the new
        string to be initialised with the same refstr as contained by
        the string being copied, and so is the cheapest constructor
        available.

            str mystring1 = "Hello world!";
            str mystring2 = mystring1;    // 'Hello world!'
            str mystring3(mystring1);     // 'Hello world!'

    ~str (void);

        Class destructor. This deallocates the contained reference
        string if this is the only string which references it;
        otherwise only the reference string's reference count is
        decremented.

    str & clear(void);

        This member provides the ability to quickly clear the contents
        of a string. Note that if the contained reference string is not
        referenced by any other str object, the length is reduced to
        zero but the string size (the size of the actual refstr) is
        untouched. This makes the function suitable when using a string
        as a temporary variable: it will grow to accommodate the
        largest item placed into it and so will not require constant
        reallocation, in a loop for example, except where the items
        become progressively bigger.

            str tempstr;