<tt> int year;</tt>
<tt> //constructor and destructor</tt>
<tt> Date(); //current date</tt>
<tt> ~Date();</tt>
<tt> //a non-virtual member function</tt>
<tt> bool isLeap() const;</tt>
<tt> bool operator == (const Date& other) const;</tt>
<tt>};</tt>
</pre>
<p>The Standard guarantees that within every instance of class <tt>Date</tt>,
data members are set down in the order of their declarations (static data members
are stored outside the object and are therefore ignored). There is no requirement
that members be set down in contiguous memory regions; the compiler can insert
additional padding bytes (more on this in Chapter 11, "Memory Management") between
data members to ensure proper alignment. However, this is also the practice
in C, so you can safely assume that a <tt>Date</tt> object has a memory layout
that is identical to that of the following C struct:</p>
<pre>
<tt>/*** filename POD_Date.h***/</tt>
<tt>struct POD_Date</tt>
<tt>/* the following struct has memory layout that is identical</tt>
<tt>to a Date object */</tt>
<tt>{</tt>
<tt> int day;</tt>
<tt> int month;</tt>
<tt> int year;</tt>
<tt>};</tt>
<tt>/*** POD_Date.h***/</tt>
</pre>
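<p>One way to check this claim on a particular compiler is to compare sizes and
member offsets. The following is a minimal sketch, assuming a C++11 or later
compiler. The <tt>Date</tt> shown here is a simplified stand-in whose data members
are public only so that <tt>offsetof</tt> can name them, and <tt>verifyLayout()</tt>
is a hypothetical helper:</p>

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// The C struct from the text.
struct POD_Date {
    int day;
    int month;
    int year;
};

// Simplified stand-in for class Date (assumption: public data members,
// so that offsetof can name them from outside the class).
class Date {
public:
    int day;
    int month;
    int year;
    Date() : day(1), month(1), year(2000) {}
    ~Date() {}
    bool isLeap() const {
        return (year % 4 == 0 && year % 100 != 0) || (year % 400 == 0);
    }
};

// Member functions, constructors, and destructors do not affect these checks.
static_assert(std::is_standard_layout<Date>::value, "Date is not standard-layout");
static_assert(sizeof(Date) == sizeof(POD_Date), "sizes differ");
static_assert(offsetof(Date, day) == offsetof(POD_Date, day), "day offset differs");
static_assert(offsetof(Date, month) == offsetof(POD_Date, month), "month offset differs");
static_assert(offsetof(Date, year) == offsetof(POD_Date, year), "year offset differs");

// Runtime spot check: the distance from the start of a live object to its
// year member equals the offset of year in the C struct.
void verifyLayout() {
    Date d;
    const char* start = reinterpret_cast<const char*>(&d);
    const char* member = reinterpret_cast<const char*>(&d.year);
    assert(member - start == static_cast<std::ptrdiff_t>(offsetof(POD_Date, year)));
}
```

<p>If any of the <tt>static_assert</tt> declarations fails, the translation unit
does not compile, which makes a layout mismatch visible before any C code touches
the object.</p>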
<p>Consequently, a <tt>Date</tt> object can be passed to C code and treated as
if it were an instance of <tt>POD_Date</tt>. That the memory layout in C and
C++ is identical in this case might seem surprising; class <tt>Date</tt> defines
member functions in addition to data members, yet there is no trace of these
member functions in the object's memory layout. Where are these member functions
stored? C++ treats nonstatic member functions much like static functions: each is compiled once and stored outside the objects of its class. In other
words, member functions are ordinary functions. They are no different from global
functions, except that they take an implicit <tt>this</tt> argument, which ensures
that they are called on an object and that they can access its data members.
An invocation of a member function is transformed to a function call, whereby
the compiler inserts an additional argument that holds the address of the object.
Consider the following example:</p>
<pre>
<tt>void func()</tt>
<tt>{</tt>
<tt> Date d;</tt>
<tt> bool leap = d.isLeap(); //1</tt>
<tt>}</tt>
</pre>
<p>The invocation of the member function <tt>isLeap()</tt> in (1) is transformed
by a C++ compiler into something such as</p>
<pre>
<tt>_x_isLeap?Date@KPK_Date@(&d); //pseudo C++ code</tt>
</pre>
<p>What was that again? Parse it carefully. The parentheses contain the <tt>this</tt>
argument, which is inserted by the compiler in every nonstatic member function
call. As you already know, function names are mangled. <tt>_x_isLeap?Date@KPK_Date@</tt>
is a hypothetical mangled name of the member function <tt>bool Date::isLeap()
const;</tt>. In the hypothetical C++ compiler, every mangled name begins with
an underscore to minimize the potential for conflicts with user-given names.
Next, the <tt>x</tt> indicates a function, as opposed to a data variable. <tt>isLeap</tt>
is the user-given name of the function. The <tt>?</tt> is a delimiter that precedes
the name of the class. The <tt>@</tt> that follows the class name indicates
the parameter list, which begins with a <tt>KPK</tt> and <tt>Date</tt> to indicate
a <tt>const</tt> pointer to a <tt>const</tt> <tt>Date</tt> (the <tt>this</tt>
argument of a <tt>const</tt> member function is a <tt>const</tt> pointer to
a <tt>const</tt> object). Finally, a closing <tt>@</tt> indicates the end of
the parameter list. <tt>_x_isLeap?Date@KPK_Date@</tt> is, therefore, the underlying
name of the member function <tt>bool Date::isLeap() const;</tt>. Other compilers
are likely to use different name mangling schemes, but the details are quite
similar to the example presented here. You must be thinking: "This is very similar
to the way procedural programming manipulates data." It is. The crucial difference
is that the compiler, rather than the human programmer, takes care of these
low-level details.</p>
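<p>The transformation can be imitated by hand. The following is a minimal sketch
in which <tt>Date_isLeap</tt> is a hypothetical, hand-written counterpart of the
function a compiler generates for <tt>Date::isLeap()</tt>; a real compiler chooses
its own mangled name, and <tt>callBothWays()</tt> is only an illustration:</p>

```cpp
#include <cassert>

// Simplified Date with public data, so that it can be initialized with braces.
struct Date {
    int day;
    int month;
    int year;
    bool isLeap() const {
        return (year % 4 == 0 && year % 100 != 0) || (year % 400 == 0);
    }
};

// Hand-written analogue of the compiler-generated function: an ordinary
// function whose first parameter plays the role of 'this'. For a const
// member function, 'this' is a const pointer to a const object.
bool Date_isLeap(const Date* const self) {
    return (self->year % 4 == 0 && self->year % 100 != 0)
        || (self->year % 400 == 0);
}

// Calls the same logic both ways: with the implicit 'this' argument and
// with the address of the object passed explicitly.
void callBothWays() {
    Date d = {29, 2, 2000};
    bool viaMember = d.isLeap();          // compiler supplies &d
    bool viaFunction = Date_isLeap(&d);   // programmer supplies &d
    assert(viaMember == viaFunction);
}
```

<p>The two calls differ only in who writes <tt>&d</tt>: the compiler in the first
case, the programmer in the second.</p>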
<h3> <a name="Heading32">The C++ Object Model is Efficient</a></h3>
<p>The object model of C++ is the underlying mechanism that supports object-oriented
concepts such as constructors and destructors, encapsulation, inheritance, and
polymorphism. The underlying representation of class member functions has several
advantages. It is very efficient in terms of execution speed and memory usage
because an object does not store pointers to its member functions. In addition,
the invocation of a nonvirtual member function does not involve additional lookup
and pointer dereferencing. A third advantage is backward compatibility with
C; an object of type <tt>Date</tt> can be passed to C code safely because the
binary representation of such an object complies with the binary representation
of a corresponding C struct. Other object-oriented languages use a radically
different object model, which might not be compatible with either C or C++.
Most of them use <i>reference semantics</i>. In a reference-based object model,
an object is represented as a reference (a pointer or a handle) that refers
to a memory block in which data members and pointers to functions are stored.
There are some advantages to reference semantics; for example, reference counting
and garbage collection are easier to implement in such languages, and indeed
such languages usually provide automatic reference counting and garbage collection.
However, garbage collection also incurs additional runtime overhead, and a reference-based
model breaks backward compatibility with C. The C++ object model, on the
other hand, enables C++ compilers to be written in C, and (as you read in Chapter
6, "Exception Handling") early C++ compilers were essentially C++-to-C
translators.</p>
<h3> <a name="Heading33">Memory Layout of Derived Objects</a></h3>
<p>The Standard does not specify the memory layout of base class subobjects in
a derived class. In practice, however, all C++ compilers use the same convention:
The base class subobject appears first (in left-to-right order in the event
of multiple inheritance), and data members of the derived class follow. C code
can access derived objects, as long as the derived class abides by the same
restrictions that were specified previously. For example, consider a nonpolymorphic
class that inherits from <tt>Date</tt> and has additional data members:</p>
<pre>
<tt>class DateTime: public Date</tt>
<tt>{</tt>
<tt>public: //additional members</tt>
<tt>long time;</tt>
<tt>bool PM; //display time in AM or PM?</tt>
<tt>DateTime();</tt>
<tt>~DateTime();</tt>
<tt>long getTime() const;</tt>
<tt>};</tt>
</pre>
<p>The two additional data members of <tt>DateTime</tt> are appended after the
three members of the base class <tt>Date</tt>, so the memory layout of a <tt>DateTime</tt>
object is equivalent to the following C struct:</p>
<pre>
<tt>/*** filename POD_Date.h***/</tt>
<tt>struct POD_DateTime</tt>
<tt>{</tt>
<tt> int day;</tt>
<tt> int month;</tt>
<tt> int year;</tt>
<tt> long time;</tt>
<tt> bool PM;</tt>
<tt>};</tt>
<tt>/*** POD_Date.h***/</tt>
</pre>
<p>In a similar vein, the nonpolymorphic member functions of <tt>DateTime</tt>
have no effect on the size or memory layout of the object.</p>
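<p>Because the Standard leaves this layout unspecified, it is worth measuring
what a given compiler actually does. The following is a minimal sketch with
simplified stand-ins for <tt>Date</tt> and <tt>DateTime</tt> (public data members,
no destructors); <tt>offsetIn()</tt> and <tt>verifyDerivedLayout()</tt> are
hypothetical helpers. Offsets inside <tt>DateTime</tt> are measured at runtime
because <tt>offsetof</tt> is only guaranteed to work for standard-layout classes,
and a class with data members in both the base and the derived part is not one:</p>

```cpp
#include <cassert>
#include <cstddef>

struct POD_DateTime {
    int day;
    int month;
    int year;
    long time;
    bool PM;
};

class Date {
public:
    int day;
    int month;
    int year;
    Date() : day(1), month(1), year(2000) {}
};

class DateTime : public Date {
public:
    long time;
    bool PM;
    DateTime() : time(0), PM(false) {}
    long getTime() const { return time; }
};

// Byte distance from the start of an object to one of its members.
template <class Object, class Member>
std::ptrdiff_t offsetIn(const Object& object, const Member& member) {
    return reinterpret_cast<const char*>(&member)
         - reinterpret_cast<const char*>(&object);
}

// Asserts that this compiler lays out DateTime like POD_DateTime.
// This is an observation about one implementation, not a guarantee.
void verifyDerivedLayout() {
    DateTime dt;
    assert(sizeof(DateTime) == sizeof(POD_DateTime));
    assert(offsetIn(dt, dt.day) == static_cast<std::ptrdiff_t>(offsetof(POD_DateTime, day)));
    assert(offsetIn(dt, dt.month) == static_cast<std::ptrdiff_t>(offsetof(POD_DateTime, month)));
    assert(offsetIn(dt, dt.year) == static_cast<std::ptrdiff_t>(offsetof(POD_DateTime, year)));
    assert(offsetIn(dt, dt.time) == static_cast<std::ptrdiff_t>(offsetof(POD_DateTime, time)));
    assert(offsetIn(dt, dt.PM) == static_cast<std::ptrdiff_t>(offsetof(POD_DateTime, PM)));
}
```

<p>A check of this kind belongs in the build or test suite of any project that
hands derived objects to C code, because a compiler upgrade or a changed base
class can silently alter the result.</p>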
<p>The compatible memory layout of nonpolymorphic C++ objects and C structs has
many useful applications. For example, it enables relational databases to retrieve
and insert objects into a database table. Data Manipulation Languages, such
as SQL, that do not support object semantics, can still treat a "live" object
as a raw chunk of memory. In fact, several commercial databases rely on this
compatibility to provide an object-oriented interface with an underlying relational
data model. Another application is the capability to transmit objects as a stream
of bytes from one machine to another.</p>
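<p>Treating an object as a raw chunk of memory can be sketched as follows. The
example assumes a C++11 or later compiler, and it assumes that sender and receiver
agree on byte order, on the size of <tt>int</tt>, and on padding; the function names
<tt>toBytes()</tt>, <tt>fromBytes()</tt>, and <tt>demoRoundTrip()</tt> are hypothetical:</p>

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

struct POD_Date {
    int day;
    int month;
    int year;
};

// Byte-wise copying is only well defined for trivially copyable types.
static_assert(std::is_trivially_copyable<POD_Date>::value,
              "POD_Date must be trivially copyable");

// Copies the object's bytes into a buffer of at least sizeof(POD_Date) bytes.
void toBytes(const POD_Date& date, unsigned char* buffer) {
    std::memcpy(buffer, &date, sizeof(POD_Date));
}

// Rebuilds an object from bytes previously produced by toBytes().
POD_Date fromBytes(const unsigned char* buffer) {
    POD_Date date;
    std::memcpy(&date, buffer, sizeof(POD_Date));
    return date;
}

// Simulates sending an object over a wire and receiving it again.
void demoRoundTrip() {
    const POD_Date sent = {14, 7, 1999};
    unsigned char wire[sizeof(POD_Date)];
    toBytes(sent, wire);
    const POD_Date received = fromBytes(wire);
    assert(received.day == 14 && received.month == 7 && received.year == 1999);
}
```

<p>The <tt>static_assert</tt> rejects any type for which a byte-wise copy would
not reproduce a valid object, for example a class with virtual functions.</p>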
<h3> <a name="Heading34">Support for Virtual Member Functions</a></h3>
<p>What happens when an object becomes polymorphic? In this case, backward compatibility
with C is trickier. As was noted previously, the compiler is allowed to insert
additional data members to a class in addition to user-declared data members.
These members can be padding bytes that ensure proper alignment. In the case
of virtual functions, an additional member is inserted into the class: a pointer
to the virtual table, or <tt>_vptr</tt>. The <tt>_vptr</tt> holds the address
of a static table of function pointers (as well as the runtime type information
of a polymorphic class; see Chapter 7, "Runtime Type Identification").
The exact position of the <tt>_vptr</tt> is implementation-dependent. Traditionally,
it was placed after the class's user-declared data members. However, some compilers
have moved it to the beginning of the class for performance reasons. Theoretically,
the <tt>_vptr</tt> can be located anywhere inside the class -- even among user-declared
members.</p>
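<p>The presence of the hidden pointer shows up in the size of the object. The
following is a minimal sketch, assuming a C++11 or later compiler; <tt>PlainDate</tt>,
<tt>VirtualDate</tt>, and <tt>hiddenBytes()</tt> are hypothetical names, and the
exact number of extra bytes depends on the pointer size and alignment rules of
the platform:</p>

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Three ints and nothing else.
struct PlainDate {
    int day;
    int month;
    int year;
};

// The same data members, plus virtual functions.
struct VirtualDate {
    int day;
    int month;
    int year;
    VirtualDate() : day(1), month(1), year(2000) {}
    virtual ~VirtualDate() {}
    virtual bool isLeap() const {
        return (year % 4 == 0 && year % 100 != 0) || (year % 400 == 0);
    }
};

static_assert(!std::is_polymorphic<PlainDate>::value, "PlainDate has no virtual functions");
static_assert(std::is_polymorphic<VirtualDate>::value, "VirtualDate has virtual functions");
static_assert(!std::is_standard_layout<VirtualDate>::value,
              "a polymorphic class is never standard-layout");

// Bytes added per object by the hidden pointer and any padding it forces.
// That the polymorphic class is larger is an observation about typical
// implementations, not a rule stated by the Standard.
std::size_t hiddenBytes() {
    assert(sizeof(VirtualDate) > sizeof(PlainDate));
    return sizeof(VirtualDate) - sizeof(PlainDate);
}
```

<p>Note that the cost is paid once per object regardless of how many virtual
functions the class declares, because each object stores a single pointer to a
shared table.</p>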
<p>A virtual member function, like a nonvirtual member function, is an ordinary
function. When a derived class overrides it, however, multiple distinct versions
of the function exist. It is not always possible to determine at compile time
which of these functions needs to be invoked. For example:</p>
<pre>
<tt>#include <iostream></tt>
<tt>using namespace std;</tt>
<tt>class PolyDate</tt>
<tt>{</tt>
<tt>public:</tt>
<tt>//PolyDate has the same members as Date but it's polymorphic</tt>
<tt>virtual void name() const { cout<<"PolyDate"<<endl;}</tt>
<tt>};</tt>
<tt>class PolyDateTime: public PolyDate</tt>
<tt>{</tt>
<tt>public:</tt>
<tt>// the same members as DateTime but it's polymorphic</tt>
<tt>void name() const { cout<<"PolyDateTime"<<endl;} //override PolyDate::name()</tt>
<tt>};</tt>
</pre>
<p>When these classes are compiled, the hypothetical compiler generates two underlying
functions that correspond to <tt>PolyDate::name()</tt> and <tt>PolyDateTime::name()</tt>:</p>
<pre>
<tt> // mangled name of void PolyDate::name() const</tt>
<tt>_x_name?PolyDate@KPK_PolyDate@</tt>
<tt> // mangled name of void PolyDateTime::name() const;</tt>
<tt>_x_name?PolyDateTime@KPK_PolyDateTime@</tt>
</pre>
<p>So far, there's nothing unusual about this. You already know that a member
function is an ordinary function that takes an implicit <tt>this</tt> argument.
Because you have defined two versions of the same virtual function, you also
expect to find two corresponding functions, each of which has a distinct mangled
name. However, unlike nonvirtual functions, the compiler cannot always transform
an invocation of a virtual member function into a direct function call. For
example:</p>
<pre>
<tt>void func(const PolyDate* pd)</tt>
<tt>{</tt>
<tt> pd->name();</tt>
<tt>}</tt>
</pre>
<p><tt>func()</tt> can be located in a separate source file, which might have
been compiled before class <tt>PolyDateTime</tt> was defined. Therefore, the
invocation of the virtual function <tt>name()</tt> has to be deferred until
runtime. The compiler transforms the function call into something such as</p>
<pre>
<tt>(* pd->_vptr[2]) (pd);</tt>
</pre>
<p>Analyze it; the member <tt>_vptr</tt> points to the internally-generated virtual
table. The first member of the virtual table is usually saved for the address
of the destructor, and the second might store the address of the class's <tt>type_info</tt>.
Any other user-defined virtual member functions are located in higher positions.
In this example, the address of <tt>name()</tt> is stored at the third position
in the virtual table (in practice, the name of the <tt>_vptr</tt> is also mangled).
Thus, the expression <tt>pd->_vptr[2]</tt> returns the address of the function
<tt>name()</tt> associated with the current object. <tt>pd</tt>, in the second
occurrence, represents the <tt>this</tt> argument.</p>
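<p>The dispatch mechanism can be imitated with plain structs and function pointers.
The following is a minimal sketch of the idea, not the code any particular compiler
emits; <tt>Obj</tt>, <tt>VFunc</tt>, the table names, and <tt>demoDispatch()</tt>
are hypothetical, and the functions return their text instead of printing it so
that the result can be inspected:</p>

```cpp
#include <cassert>
#include <string>

struct Obj;                                  // stand-in for a polymorphic object
typedef std::string (*VFunc)(const Obj*);    // every slot receives the 'this' pointer

struct Obj {
    const VFunc* vptr;   // plays the role of the hidden _vptr
};

// Ordinary functions standing in for the two versions of name().
std::string PolyDate_name(const Obj*)     { return "PolyDate"; }
std::string PolyDateTime_name(const Obj*) { return "PolyDateTime"; }

// One static table per class. Slots 0 and 1 are left empty here; in the
// text they hold the destructor and the type_info address.
const VFunc PolyDate_vtbl[3]     = { nullptr, nullptr, PolyDate_name };
const VFunc PolyDateTime_vtbl[3] = { nullptr, nullptr, PolyDateTime_name };

// Counterpart of func(const PolyDate* pd) { pd->name(); } after the
// compiler's transformation: look up slot 2, then pass pd as 'this'.
std::string func(const Obj* pd) {
    return (*pd->vptr[2])(pd);
}

// The same call site reaches different functions, depending on the table
// that the object's vptr refers to.
void demoDispatch() {
    Obj date     = { PolyDate_vtbl };
    Obj dateTime = { PolyDateTime_vtbl };
    assert(func(&date) == "PolyDate");
    assert(func(&dateTime) == "PolyDateTime");
}
```

<p><tt>func()</tt> contains no knowledge of the concrete classes; it can be
compiled before <tt>PolyDateTime_vtbl</tt> exists, which is exactly the property
that deferred binding requires.</p>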
<p>Clearly, defining a corresponding C struct is more precarious in this case
and requires intimate acquaintance with the compiler's preferred position of
the <tt>_vptr</tt> as well as with its size. There is another hazard here: The
value of the <tt>_vptr</tt> is transient, which means that it might differ from
one run to the next, according to the address space of the process that executes the program.
Consequently, when an entire polymorphic object is stored in a file and retrieved
later, the retrieved data cannot be used as a valid object. For all these reasons,
accessing polymorphic objects from C code is dangerous and generally needs to
be avoided.</p>
<h3> <a name="Heading35">Virtual Inheritance</a></h3>
<p>C code should not access objects that have a virtual base class either. The reason
is that a virtual base is usually represented in the form of a pointer to a
shared instance of the virtual subobject. Here again, the position of this pointer
among user-defined data members is implementation-dependent. Likewise, the pointer
holds a transient value, which can change from one execution of the program
to another.</p>
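<p>The shared instance can be observed directly. The following is a minimal sketch
in which <tt>Base</tt>, <tt>Left</tt>, <tt>Right</tt>, <tt>Diamond</tt>, and the two
helper functions are hypothetical names chosen for illustration:</p>

```cpp
#include <cassert>

struct Base {
    int id;
    Base() : id(0) {}
};

struct Left : virtual Base {};
struct Right : virtual Base {};
struct Diamond : Left, Right {};   // contains exactly one Base subobject

// True if both inheritance paths lead to the same Base subobject.
bool sharesOneBase(Diamond& d) {
    Left& left = d;
    Right& right = d;
    Base* viaLeft = &left;     // conversion goes through Left's hidden bookkeeping
    Base* viaRight = &right;   // conversion goes through Right's hidden bookkeeping
    return viaLeft == viaRight;
}

void demoSharedBase() {
    Diamond d;
    static_cast<Left&>(d).id = 42;             // write through one path
    assert(static_cast<Right&>(d).id == 42);   // read through the other
    assert(sharesOneBase(d));
}
```

<p>On typical implementations <tt>sizeof(Left)</tt> is larger than
<tt>sizeof(Base)</tt>, which reflects the hidden data needed to locate the shared
subobject; a C struct that lists only <tt>int id;</tt> therefore cannot describe
a <tt>Left</tt> object.</p>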
<h3> <a name="Heading36">Different Access Specifiers</a></h3>
<p>The fourth restriction on the legality of accessing C++ objects from C code
states that all the data members of the class are declared without an intervening
access specifier. This means, theoretically, that the memory layout of a class
that looks similar to the following</p>
<pre>
<tt>class AnotherDate</tt>
<tt>{</tt>
<tt>private:</tt>
<tt> int day;</tt>
<tt>private:</tt>
<tt> int month;</tt>
<tt>private:</tt>
<tt> int year;</tt>
<tt>public:</tt>
<tt> //constructor and destructor</tt>
<tt> AnotherDate(); //current date</tt>
<tt> ~AnotherDate();</tt>
<tt> //a non-virtual member function</tt>
<tt> bool isLeap() const;</tt>
<tt> bool operator == (const AnotherDate& other) const;</tt>
<tt>};</tt>
</pre>
<p>might differ from a class that has the same data members declared in the same
order, albeit without any intervening access specifiers. In other words, for
class <tt>AnotherDate</tt>, an implementation is allowed to place the member
<tt>month</tt> before the member <tt>day</tt>, <tt>year</tt> before <tt>month</tt>,
or use any other order. Of course, this nullifies any compatibility with C code. However,
in practice, all current C++ compilers ignore the access specifiers and store
the data members in the order of declaration. So C code that accesses a class
object that has multiple access specifiers might work -- but there is no guarantee
that the compatibility will remain in the future.</p>
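<p>Compilers that support C++11 or later offer a way to ask whether a class
qualifies for C-compatible layout: the <tt>std::is_standard_layout</tt> trait.
One of its requirements is that all nonstatic data members have the same access
control. Note that this is slightly more permissive than the rule described above,
because repeated labels of the same access level do not disqualify a class. The
following is a minimal sketch; <tt>RepeatedLabels</tt>, <tt>MixedAccess</tt>, and
<tt>checkTraits()</tt> are hypothetical names:</p>

```cpp
#include <cassert>
#include <type_traits>

// Mirrors AnotherDate: repeated labels, but every data member is private.
class RepeatedLabels {
private:
    int day;
private:
    int month;
private:
    int year;
public:
    RepeatedLabels() : day(1), month(1), year(2000) {}
    int sum() const { return day + month + year; }
};

// Data members with different access control: one public, two private.
class MixedAccess {
public:
    int day;
    MixedAccess() : day(1), month(1), year(2000) {}
    int sum() const { return day + month + year; }
private:
    int month;
    int year;
};

void checkTraits() {
    // Same access control for all data members: qualifies.
    assert(std::is_standard_layout<RepeatedLabels>::value);
    // Public and private data members mixed: does not qualify.
    assert(!std::is_standard_layout<MixedAccess>::value);
}
```

<p>A <tt>static_assert</tt> on this trait, placed next to the class definition,
turns an accidental change of access control into a compile-time error rather
than a silent incompatibility with C code.</p>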
<h2> <a name="Heading37">Conclusions</a></h2>
<p>The creators of C++ have attempted to preserve, as closely as possible, backward
compatibility with C. Indeed, almost without exception, every C program is also
a valid C++ program. Still, there are some subtle differences between the seemingly
common denominator of the two languages. Most of them, as you might have noted,
derive from the improved type-safety of C++ -- for example, the obligatory
declaration of a function prior to its usage, the need to cast
<tt>void</tt> pointers explicitly to the target pointer type, the deprecation of implicit
<tt>int</tt> declarations, and the enforcement of a null terminator in a string
literal. Other discrepancies between the two languages derive from the different
rules of type definition.</p>
<p>C code can be called directly from C++ code. Calling C++ code from C is also
possible under certain conditions, but it requires additional adjustments regarding
the linkage type and it is confined to global functions exclusively. C++ objects
can be accessed from C code, as you have seen, but here again, there are stringent
constraints to which you must adhere. </p>
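<p>The usual way to satisfy these conditions is a global wrapper function with
C linkage that receives a plain struct. The following is a minimal sketch;
<tt>date_is_leap</tt> is a hypothetical function name, and the <tt>Date</tt> class
is a simplified stand-in for the one discussed in this chapter:</p>

```cpp
#include <cassert>

// Shared with the C side; in a real project this would live in a header
// that both the C and the C++ compiler can read.
struct POD_Date {
    int day;
    int month;
    int year;
};

// C++-only class whose member function is to be made reachable from C.
class Date {
public:
    Date(int d, int m, int y) : day(d), month(m), year(y) {}
    bool isLeap() const {
        return (year % 4 == 0 && year % 100 != 0) || (year % 400 == 0);
    }
    int day;
    int month;
    int year;
};

// extern "C" gives the function C linkage, so its name is not mangled and
// a C caller can link against it. The signature uses only types that C
// understands; the result is an int because classic C has no bool.
extern "C" int date_is_leap(const POD_Date* d) {
    const Date date(d->day, d->month, d->year);
    return date.isLeap() ? 1 : 0;
}
```

<p>From the C side, the function is declared as
<tt>int date_is_leap(const struct POD_Date*);</tt> and called like any other C
function; the member function call stays hidden inside the C++ translation unit.</p>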
<CENTER>
<P>
<HR>
<A HREF="/publishers/que/series/professional/0789720221/index.htm"><img src="/publishers/que/series/professional/0789720221/button/contents.gif" WIDTH="128"
HEIGHT="28" ALIGN="BOTTOM" ALT="Contents" BORDER="0"></A> <BR>
<BR>
<BR>
<p></P>
<P>© <A HREF="/publishers/que/series/professional/0789720221/copy.htm">Copyright 1999</A>, Macmillan Computer Publishing. All
rights reserved.</p>
</CENTER>
</BODY>
</HTML>