📄 lib0028.html
字号:
<html>
<META http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<title>Heap Allocation</title>
<link rel="STYLESHEET" type="text/css" href="images/xpolecat.css">
<link rel="STYLESHEET" type="text/css" href="images/ie.content.books24x7.css">
</head>
<body >
<table width="100%" border="0" cellspacing="0" cellpadding="0">
<td><div STYLE="MARGIN-LEFT: 0.15in;">
<a href="toc.html"><img src="images/teamlib.gif" width="62" height="15" border="0" align="absmiddle" alt="Team LiB"></a></div></td>
<td valign="top" class="v2" align="right"><div STYLE="MARGIN-RIGHT: 0.15in"><a href="LiB0027.html"><img src="images/previous.gif" width="62" height="15" border="0" align="absmiddle" alt="Previous Section"></a>
<a href="LiB0029.html"><img src="images/next.gif" width="41" height="15" border="0" align="absmiddle" alt="Next Section"></a>
</div></td></tr>
</table>
<div class="chapter">
<a name="ch03"></a>
<div class="section">
<h2 class="first-section-title"><a name="335"></a><a name="ch03lev1sec3"></a>Heap Allocation</h2><p class="first-para">Heap memory allocation, also known as <i class="emphasis">dynamic memory allocation</i> (DMA), consists of requesting memory while an application is running from a repository known as the <i class="emphasis">heap.</i> A heap is just a collection of available bytes (i.e., a bunch of bytes piled into a heap). Unlike the data segment or the stack, the size and life span of memory allocated from the heap is completely unpredictable. This requires the agent that manages the heap to be flexible, and this, in turn, translates into a lot of extra memory management code.</p>
<p class="para">The data segment requires no special management code, and stack management is limited primarily to the prologue and epilogue code of functions. The heap, however, normally has its own dedicated set of elaborate routines to service memory requests.</p>
<a name="336"></a><a name="ch03table03"></a>
<table class="table" border="1">
<caption class="table-title">
<span class="table-title"><span class="table-titlelabel">Table 3.3</span></span>
</caption>
<thead>
<tr valign="top">
<th class="th" scope="col" align="left">
<p class="table-para">
<b class="bold">Storage</b>
</p>
</th><th class="th" scope="col" align="left">
<p class="table-para">
<b class="bold">Size</b>
</p>
</th><th class="th" scope="col" align="left">
<p class="table-para">
<b class="bold">Life Span</b>
</p>
</th><th class="th" scope="col" align="left">
<p class="table-para">
<b class="bold">Bookkeeping</b>
</p>
</th>
</tr>
</thead>
<tbody>
<tr valign="top">
<td class="td" align="left">
<p class="table-para">data section</p>
</td><td class="td" align="left">
<p class="table-para">fixed</p>
</td><td class="td" align="left">
<p class="table-para">program life span</p>
</td><td class="td" align="left">
<p class="table-para">none</p>
</td>
</tr>
<tr valign="top">
<td class="td" align="left">
<p class="table-para">stack</p>
</td><td class="td" align="left">
<p class="table-para">fixed size stack frames</p>
</td><td class="td" align="left">
<p class="table-para">function-based</p>
</td><td class="td" align="left">
<p class="table-para">all at compile time</p>
</td>
</tr>
<tr valign="top">
<td class="td" align="left">
<p class="table-para">heap</p>
</td><td class="td" align="left">
<p class="table-para">varies</p>
</td><td class="td" align="left">
<p class="table-para">varies</p>
</td><td class="td" align="left">
<p class="table-para">significant at run time</p>
</td>
</tr>
</tbody>
</table>
<p class="para">The heap relies heavily on user mode libraries. These libraries (like <span class="fixed">malloc ()</span> and <span class="fixed">free ()</span> declared in <span class="fixed">stdlib.h</span>) may be invoked directly by programs or called indirectly by a virtual machine. Either way, these libraries normally end up utilizing facilities provided by the underlying operating system. Thus, before we dive straight into managing memory through user libraries, it would help to understand how they communicate with the operating system.</p>
<div class="section">
<h3 class="sect3-title">
<a name="337"></a><a name="ch03lev2sec4"></a>System Call Interface</h3>
<p class="first-para">Most user applications are blissfully unaware of what really goes on to support their execution. They never see the GDT or the page table entries. User applications are strictly memory <i class="emphasis">consumers.</i> Like a pizza-hungry college freshman, applications ask for memory takeout, and the operating system gives it to them; they don't care <a name="338"></a><a name="IDX-152"></a>about the details. User programs have a perspective that is roughly 10,000 feet above the operating system level. At this altitude, the system call interface is all that user-space code sees.</p>
<p class="para">In his book on MINIX, <I>Operating Systems: Design and Implementation</I>, Tanenbaum asserts that the system call interface is what defines an operating system. I would tend to agree with him. The system call interface of an operating system is a set of function prototypes that completely specify every service that the operating system can provide to the outside world. An operating system is nothing more than the implementation of its system calls. If you were a human resources professional, you would view the system call interface as the kernel's formal job description. It dictates the actions that the operating system must be able to perform.</p>
<p class="para">Let us look at a simple example. Take, for instance, the NACHOS operating system. NACHOS was developed by Tom Anderson at Berkeley for instructional use in computer science courses. Its system call interface consists of just 11 routines.</p>
<p class="para">Process Management</p>
<div class="informalexample">
<pre class="literallayout">
void Halt()
void Exit(int status)
SpaceId Exec(char *name)
int Join(SpaceId id)
</pre>
</div>
<p class="para">File Input/Output</p>
<div class="informalexample">
<pre class="literallayout">
void Create(cha *name)
OpenFileId Open(char *name)
void Write(char *buffer, int size, OpenFileId id)
int Read(char *buffer, int size, OpenFileId id)
void Close(OpenFileId id)
</pre>
</div>
<p class="para">Threads</p>
<div class="informalexample">
<pre class="literallayout">
void Fork(void (*func)())
void Yield()
</pre>
</div>
<p class="para">That is it. Everything that NACHOS is capable of doing is described by the previous 11 functions. Naturally, production grade operating systems have a system call interface that is much larger. Linux, for example, has more than 200 routines defined in its system call interface. You can read descriptions of these 200+ system calls in the Linux man pages (i.e., <span class="fixed">man2</span>).</p>
<table border="0" cellspacing="0" cellpadding="0" class="note">
<tr>
<td valign="top" class="admon-check"></td><td valign="top" class="admon-title">Note </td><td valign="top" class="admon-body">
<p class="first-para">In case you are wondering, NACHOS stands for Not Another Completely Heuristic Operating System. I think Richard Burgess is correct; we are running out of good acronyms.</p>
</td>
</tr>
</table>
<a name="339"></a><a name="IDX-153"></a>
<p class="para">System calls are not always spelled out with tidy C prototypes. Some operating systems, like DOS, have a system call interface that is specified strictly in terms of interrupts. Consider the following DOS system call:</p>
<div class="informalexample">
<pre class="literallayout">
Interrupt: 0x21 (i.e., INT 0x21)
Function: 0x09
Description: prints a string terminated by a $
Inputs: AH = 9
DS = segment address of string
DX = offset address of string
</pre>
</div>
<p class="para">A wise system engineer will attempt to ward off complexity by banishing the system-related assembly code to the basement of the OS. There, in the darkness, only a trained plumber with a flashlight can muck around with the pipes. Even then, an experienced developer will attempt to wrap the assembly code in C/C++ to make it more palatable. Tanenbaum, for instance, did an excellent job of wrapping assembly routines when he implemented MINIX.</p>
<table border="0" cellspacing="0" cellpadding="0" class="note">
<tr>
<td valign="top" class="admon-check"></td><td valign="top" class="admon-title">Note </td><td valign="top" class="admon-body">
<p class="first-para">I had the opportunity to speak with an engineer who helped manage the construction of the original OS/2 platform. He told me that around 20% of the kernel code was assembler. This is a lot of assembler, especially when you consider that UNIX operating systems, like FreeBSD, have less than 2% of the kernel coded in assembly language. I am sure that the proliferation of assembly code in OS/2 had an impact on the development team's ability to port the code and institute design changes.</p>
</td>
</tr>
</table>
<table border="0" cellspacing="0" cellpadding="0" class="note">
<tr>
<td valign="top" class="admon-check"></td><td valign="top" class="admon-title">Note </td><td valign="top" class="admon-body">
<p class="first-para">Cloning is not limited to the bio-tech sector. An operating system <i class="emphasis">clone</i> is typically constructed by taking the system call interface of the original OS and performing a clean-room implementation of those calls. The clone differs from the original because those system calls are implemented using different algorithms and data structures. For example, FreeDOS is a clone of Microsoft's DOS. Tanenbaum's MINIX is actually a UNIX clone. It is a well-documented fact that Microsoft's 1982 release of its DOS operating system was a clone of IBM's PC-DOS.</p>
</td>
</tr>
</table>
<p class="para">System calls are the atomic building blocks that all other APIs rely on. The user libraries that help to manage memory are built upon the relevant system calls. The layering effect that is generated by building one set of functions on top of another is illustrated in <a class="internaljump" href="#ch03fig12">Figure 3.12</a>.</p>
<div class="figure">
<a name="340"></a><a name="ch03fig12"></a><span class="figuremediaobject"><a href="images/fig182%5F01%5F0%2Ejpg" NAME="IMG_61" target="_parent"><img src="images/fig182_01.jpg" height="146" width="214" alt="Click To expand" border="0"></a></span>
<br style="line-height: 1">
<span class="figure-title"><span class="figure-titlelabel">Figure 3.12</span></span>
</div>
<a name="341"></a><a name="IDX-154"></a>
<p class="para">User libraries cannot directly access system calls. They must all travel through a choke point called the <i class="emphasis">system call gate.</i> If an operating system were a fortress, the system call gate would be its drawbridge. Everything outside the gate runs in user mode, and everything inside the fortress runs in kernel mode.</p>
<p class="para">The system call gate is the only way in and out of the kernel. This is because memory management at the operating system level, in conjunction with the processor, prevents user code from making a <span class="fixed">FAR JMP</span> directly to kernel functions. For the most part, this keeps the Viking pillagers at a safe distance from the inhabitants of the kernel. Occasionally, however, there are curious explorers like Sven Schreiber who find a hole in the castle wall. Sven found a way around the Windows 2000 system call gate. He describes this discovery in his book, <I>Undocumented Windows 2000 Secrets</I>.</p>
<table border="0" cellspacing="0" cellpadding="0" class="note">
<tr>
<td valign="top" class="admon-check"></td><td valign="top" class="admon-title">Note </td><td valign="top" class="admon-body">
<p class="first-para">In an operating system like DOS, which has no memory protection, it is possible to execute an interrupt service routine by using a <span class="fixed">FAR JMP</span> instruction with some assorted assembly language acrobatics. There's nothing in place to prevent a program from jumping to the location of the system call's instructions and executing them.</p>
</td>
</tr>
</table>
<p class="para">Typically, a system call gate is implemented as an interrupt handler. The ISR that mans the system call drawbridge checks to see if the user request is valid. If the service request is valid, the call gate ISR then reroutes the request and its arguments to the appropriate system call in kernel space. When the requested system call is done, it hands off execution back to the system call gate, which then returns execution control to the user program.</p>
<div class="informaltable">
<table border="0">
<tbody>
<tr valign="top">
<td class="td" align="left">
<p class="table-para">user library</p>
</td><td class="td" align="left">
<p class="table-para">
<span class="unicode">↔</span>
</p>
</td><td class="td" align="left">
<p class="table-para">system call gate</p>
</td><td class="td" align="left">
<p class="table-para">
<span class="unicode">↔</span>
</p>
</td><td class="td" align="left">
<p class="table-para">system call</p>
</td>
⌨️ 快捷键说明
复制代码
Ctrl + C
搜索代码
Ctrl + F
全屏模式
F11
切换主题
Ctrl + Shift + D
显示快捷键
?
增大字号
Ctrl + =
减小字号
Ctrl + -