📄 gcdescr.html
字号:
<HTML><HEAD> <TITLE> Conservative GC Algorithmic Overview </TITLE> <AUTHOR> Hans-J. Boehm, HP Labs (Much of this was written at SGI)</author></HEAD><BODY><H1> <I>This is under construction, and may always be.</i> </h1><H1> Conservative GC Algorithmic Overview </h1><P>This is a description of the algorithms and data structures used in ourconservative garbage collector. I expect the level of detail to increasewith time. For a survey of GC algorithms, see for example<A HREF="ftp://ftp.cs.utexas.edu/pub/garbage/gcsurvey.ps"> Paul Wilson'sexcellent paper</a>. For an overview of the collector interface,see <A HREF="gcinterface.html">here</a>.<P>This description is targeted primarily at someone trying to understand thesource code. It specifically refers to variable and function names.It may also be useful for understanding the algorithms at a higher level.<P>The description here assumes that the collector is used in default mode.In particular, we assume that it used as a garbage collector, and not justa leak detector. We initially assume that it is used in stop-the-world,non-incremental mode, though the presence of the incremental collectorwill be apparent in the design.We assume the default finalization model, but the code affected by thatis very localized.<H2> Introduction </h2>The garbage collector uses a modified mark-sweep algorithm. Conceptuallyit operates roughly in four phases, which are performed occasionallyas part of a memory allocation:<OL><LI><I>Preparation</i> Each object has an associated mark bit.Clear all mark bits, indicating that all objectsare potentially unreachable.<LI><I>Mark phase</i> Marks all objects that can be reachable via chains ofpointers from variables. Often the collector has no real informationabout the location of pointer variables in the heap, so itviews all static data areas, stacks and registers as potentially containingpointers. Any bit patterns that represent addresses insideheap objects managed by the collector are viewed as pointers.Unless the client program has made heap object layout informationavailable to the collector, any heap objects found to be reachable fromvariables are again scanned similarly.<LI><I>Sweep phase</i> Scans the heap for inaccessible, and hence unmarked,objects, and returns them to an appropriate free list for reuse. This isnot really a separate phase; even in non incremental mode this is operationis usually performed on demand during an allocation that discovers an emptyfree list. Thus the sweep phase is very unlikely to touch a page thatwould not have been touched shortly thereafter anyway.<LI><I>Finalization phase</i> Unreachable objects which had been registeredfor finalization are enqueued for finalization outside the collector.</ol><P>The remaining sections describe the memory allocation data structures,and then the last 3 collection phases in more detail. We conclude byoutlining some of the additional features implemented in the collector.<H2>Allocation</h2>The collector includes its own memory allocator. The allocator obtainsmemory from the system in a platform-dependent way. Under UNIX, ituses either <TT>malloc</tt>, <TT>sbrk</tt>, or <TT>mmap</tt>.<P>Most static data used by the allocator, as well as that needed by therest of the garbage collector is stored inside the<TT>_GC_arrays</tt> structure.This allows the garbage collector to easily ignore the collectors owndata structures when it searches for root pointers. Other allocatorand collector internal data structures are allocated dynamicallywith <TT>GC_scratch_alloc</tt>. <TT>GC_scratch_alloc</tt> does notallow for deallocation, and is therefore used only for permanent datastructures.<P>The allocator allocates objects of different <I>kinds</i>.Different kinds are handled somewhat differently by certain partsof the garbage collector. Certain kinds are scanned for pointers,others are not. Some may have per-object type descriptors thatdetermine pointer locations. Or a specific kind may correspondto one specific object layout. Two built-in kinds are uncollectable.One (<TT>STUBBORN</tt>) is immutable without special precautions.In spite of that, it is very likely that most C clients of thecollector currentlyuse at most two kinds: <TT>NORMAL</tt> and <TT>PTRFREE</tt> objects.The <A HREF="http://gcc.gnu.org/java">gcj</a> runtime also makesheavy use of a kind (allocated with GC_gcj_malloc) that storestype information at a known offset in method tables.<P>The collector uses a two level allocator. A large block is defined tobe one larger than half of <TT>HBLKSIZE</tt>, which is a power of 2,typically on the order of the page size.<P>Large block sizes are rounded up tothe next multiple of <TT>HBLKSIZE</tt> and then allocated by<TT>GC_allochblk</tt>. Recent versions of the collectoruse an approximate best fit algorithm by keeping free lists forseveral large block sizes.The actualimplementation of <TT>GC_allochblk</tt>is significantly complicated by black-listing issues(see below).<P>Small blocks are allocated in chunks of size <TT>HBLKSIZE</tt>.Each chunk isdedicated to only one object size and kind. The allocator maintainsseparate free lists for each size and kind of object.<P>Once a large block is split for use in smaller objects, it can onlybe used for objects of that size, unless the collector discovers a completelyempty chunk. Completely empty chunks are restored to the appropriatelarge block free list.<P>In order to avoid allocating blocks for too many distinct object sizes,the collector normally does not directly allocate objects of every possiblerequest size. Instead request are rounded up to one of a smaller numberof allocated sizes, for which free lists are maintained. The exactallocated sizes are computed on demand, but subject to the constraintthat they increase roughly in geometric progression. Thus objectsrequested early in the execution are likely to be allocated with exactlythe requested size, subject to alignment constraints.See <TT>GC_init_size_map</tt> for details.<P>The actual size rounding operation during small object allocation isimplemented as a table lookup in <TT>GC_size_map</tt>.<P>Both collector initialization and computation of allocated sizes arehandled carefully so that they do not slow down the small object fastallocation path. An attempt to allocate before the collector is initialized,or before the appropriate <TT>GC_size_map</tt> entry is computed,will take the same path as an allocation attempt with an empty free list.This results in a call to the slow path code (<TT>GC_generic_malloc_inner</tt>)which performs the appropriate initialization checks.<P>In non-incremental mode, we make a decision about whether to garbage collectwhenever an allocation would otherwise have failed with the current heap size.If the total amount of allocation since the last collection is less thanthe heap size divided by <TT>GC_free_space_divisor</tt>, we try toexpand the heap. Otherwise, we initiate a garbage collection. This ensuresthat the amount of garbage collection work per allocated byte remainsconstant.<P>The above is in fact an oversimplification of the real heap expansionand GC triggering heuristic, which adjusts slightly for root sizeand certain kinds offragmentation. In particular:<UL><LI> Programs with a large root set size andlittle live heap memory will expand the heap to amortize the cost ofscanning the roots. <LI> Versions 5.x of the collector actually collect more frequently innonincremental mode. The large block allocator usually refuses to splitlarge heap blocks once the garbage collection threshold isreached. This often has the effect of collecting well before theheap fills up, thus reducing fragmentation and working set size at theexpense of GC time. Versions 6.x choose an intermediate strategy dependingon how much large object allocation has taken place in the past.(If the collector is configured to unmap unused pages, versions 6.xuse the 5.x strategy.)<LI> In calculating the amount of allocation since the last collection wegive partial credit for objects we expect to be explicitly deallocated.Even if all objects are explicitly managed, it is often desirable to collecton rare occasion, since that is our only mechanism for coalescing completelyempty chunks.</ul><P>It has been suggested that this should be adjusted so that we favorexpansion if the resulting heap still fits into physical memory.In many cases, that would no doubt help. But it is tricky to do thisin a way that remains robust if multiple application are contendingfor a single pool of physical memory.<H2>Mark phase</h2>At each collection, the collector marks all objects that arepossibly reachable from pointer variables. Since it cannot generallytell where pointer variables are located, it scans the following<I>root segments</i> for pointers:<UL><LI>The registers. Depending on the architecture, this may be done usingassembly code, or by calling a <TT>setjmp</tt>-like function which savesregister contents on the stack.<LI>The stack(s). In the case of a single-threaded application,on most platforms thisis done by scanning the memory between (an approximation of) the currentstack pointer and <TT>GC_stackbottom</tt>. (For Itanium, the register stackscanned separately.) The <TT>GC_stackbottom</tt> variable is set ina highly platform-specific way depending on the appropriate configurationinformation in <TT>gcconfig.h</tt>. Note that the currently activestack needs to be scanned carefully, since callee-save registers ofclient code may appear inside collector stack frames, which maychange during the mark process. This is addressed by scanningsome sections of the stack "eagerly", effectively capturing a snapshotat one point in time.<LI>Static data region(s). In the simplest case, this is the regionbetween <TT>DATASTART</tt> and <TT>DATAEND</tt>, as defined in<TT>gcconfig.h</tt>. However, in most cases, this will also involvestatic data regions associated with dynamic libraries. These areidentified by the mostly platform-specific code in <TT>dyn_load.c</tt>.</ul> The marker maintains an explicit stack of memory regions that are knownto be accessible, but that have not yet been searched for contained pointers.Each stack entry contains the starting address of the block to be scanned,as well as a descriptor of the block. If no layout information isavailable for the block, then the descriptor is simply a length.(For other possibilities, see <TT>gc_mark.h</tt>.)<P>At the beginning of the mark phase, all root segments(as described above) are pushed on thestack by <TT>GC_push_roots</tt>. (Registers and eagerly processedstack sections are processed by pushing the referenced objects insteadof the stack section itself.) If <TT>ALL_INTERIOR_PTRS</tt> is notdefined, then stack roots require special treatment. In this case, thenormal marking code ignores interior pointers, but <TT>GC_push_all_stack</tt>explicitly checks for interior pointers and pushes descriptors for targetobjects.<P>The marker is structured to allow incremental marking.Each call to <TT>GC_mark_some</tt> performs a small amount ofwork towards marking the heap.It maintainsexplicit state in the form of <TT>GC_mark_state</tt>, whichidentifies a particular sub-phase. Some other pieces of state, mostnotably the mark stack, identify how much work remains to be donein each sub-phase. The normal progression of mark states fora stop-the-world collection is:<OL><LI> <TT>MS_INVALID</tt> indicating that there may be accessible unmarkedobjects. In this case <TT>GC_objects_are_marked</tt> will simultaneouslybe false, so the mark state is advanced to<LI> <TT>MS_PUSH_UNCOLLECTABLE</tt> indicating that it suffices to pushuncollectable objects, roots, and then mark everything reachable from them.<TT>Scan_ptr</tt> is advanced through the heap until all uncollectableobjects are pushed, and objects reachable from them are marked.At that point, the next call to <TT>GC_mark_some</tt> calls<TT>GC_push_roots</tt> to push the roots. It the advances themark state to<LI> <TT>MS_ROOTS_PUSHED</tt> asserting that once the mark stack isempty, all reachable objects are marked. Once in this state, we workonly on emptying the mark stack. Once this is completed, the statechanges to<LI> <TT>MS_NONE</tt> indicating that reachable objects are marked.</ol>The core mark routine <TT>GC_mark_from</tt>, is calledrepeatedly by several of the sub-phases when the mark stack starts to fillup. It is also called repeatedly in <TT>MS_ROOTS_PUSHED</tt> stateto empty the mark stack.The routine is designed to only perform a limited amount of marking ateach call, so that it can also be used by the incremental collector.It is fairly carefully tuned, since it usually consumes a large majorityof the garbage collection time.<P>The fact that it perform a only a small amount of work per call alsoallows it to be used as the core routine of the parallel marker. In thatcase it is normally invoked on thread-private mark stacks instead of theglobal mark stack. More details can be found in<A HREF="scale.html">scale.html</a><P>The marker correctly handles mark stack overflows. Whenever the mark stackoverflows, the mark state is reset to <TT>MS_INVALID</tt>.Since there are already marked objects in the heap,this eventually forces a completescan of the heap, searching for pointers, during which any unmarked objectsreferenced by marked objects are again pushed on the mark stack. Thisprocess is repeated until the mark phase completes without a stack overflow.Each time the stack overflows, an attempt is made to grow the mark stack.All pieces of the collector that push regions onto the mark stack have to becareful to ensure forward progress, even in case of repeated mark stackoverflows. Every mark attempt results in additional marked objects.<P>Each mark stack entry is processed by examining all candidate pointers
⌨️ 快捷键说明
复制代码
Ctrl + C
搜索代码
Ctrl + F
全屏模式
F11
切换主题
Ctrl + Shift + D
显示快捷键
?
增大字号
Ctrl + =
减小字号
Ctrl + -