.\" kernmalloc.t
except that the kernel can explicitly control the set of pages
allocated to its address space.
The result is that the ``working set'' of pages in use by the
kernel exactly corresponds to the set of pages that it is really using.
.FI "One day memory usage on a Berkeley time-sharing machine"
.so usage.tbl
.Fe
.PP
A final special condition that applies to the kernel is that
all of the different uses of dynamic memory are known in advance.
Each of these uses of dynamic memory can be assigned a type.
For each type of dynamic memory that is allocated,
the kernel can provide allocation limits.
One reason given for having separate allocators is that
no single allocator could starve the rest of the kernel of all
its available memory and thus a single runaway
client could not paralyze the system.
By putting limits on each type of memory,
the single general purpose memory allocator can provide the same
protection against memory starvation.\(dg
.FS
\(dg One might seriously ask what good it is if ``only''
one subsystem within the kernel hangs if it is something like the
network on a diskless workstation.
.FE
.PP
\*(Lb shows the memory usage of the kernel over a one day period
on a general timesharing machine at Berkeley.
The ``In Use'', ``Free'', and ``Mem Use'' fields are instantaneous values;
the ``Requests'' field is the number of allocations since system startup;
the ``High Use'' field is the maximum value of the ``Mem Use'' field
since system startup.
The figure demonstrates that most
allocations are for small objects.
Large allocations occur infrequently,
and are typically for long-lived objects
such as buffers to hold the superblock for
a mounted file system.
Thus, a memory allocator only needs to be
fast for small pieces of memory.
.H 1 "Implementation of the Kernel Memory Allocator"
.PP
In reviewing the available memory allocators,
none of their strategies could be used without some modification.
The kernel memory allocator that we ended up with is a hybrid
of the fast memory allocator found in
the 4.2BSD C library
and a slower but more memory-efficient first-fit allocator.
.PP
Small allocations are done using the 4.2BSD power-of-two list strategy;
the typical allocation requires only a computation of
the list to use and the removal of an element if it is available,
so it is quite fast.
Macros are provided to avoid the cost of a subroutine call.
Only if the request cannot be fulfilled from a list is a call
made to the allocator itself.
To ensure that the allocator is always called for large requests,
the lists corresponding to large allocations are always empty.
Appendix A shows the data structures and implementation of the macros.
.PP
Similarly, freeing a block of memory can be done with a macro.
The macro computes the list on which to place the block
and puts it there.
The free routine is called only if the block of memory is
considered to be a large allocation.
Including the cost of blocking out interrupts,
the allocation and freeing macros generate respectively
only nine and sixteen (simple) VAX instructions.
.PP
Because of the inefficiency of power-of-two allocation strategies
for large allocations,
a different strategy is used for allocations larger than two kilobytes.
The selection of two kilobytes is derived from our statistics on
the utilization of memory within the kernel,
which showed that 95 to 98% of allocations are of size
one kilobyte or less.
A frequent caller of the memory allocator
(the name translation function)
always requests a one kilobyte block.
Additionally, the allocation method for large blocks is based on allocating
pieces of memory in multiples of pages.
Consequently, the actual allocation sizes for requests of size
$2~times~pagesize$ or less are identical.\(dg
.FS
\(dg To understand why this number is $size 8 {2~times~pagesize}$ one
observes that the power-of-two algorithm yields sizes of 1, 2, 4, 8, \&...
pages while the large block algorithm that allocates in multiples
of pages yields sizes of 1, 2, 3, 4, \&...
pages.
Thus for allocations of sizes between one and two pages
both algorithms use two pages;
it is not until allocations of sizes between two and three pages
that a difference emerges,
where the power-of-two algorithm will use
four pages while the large block algorithm will use three pages.
.FE
In 4.3BSD on the VAX, the (software) page size is one kilobyte,
so two kilobytes is the smallest logical cutoff.
.PP
Large allocations are first rounded up to be a multiple of the page size.
The allocator then uses a first-fit algorithm to find space in the
kernel address arena set aside for dynamic allocations.
Thus a request for a five kilobyte piece of memory will use exactly
five pages of memory,
rather than eight kilobytes as with
the power-of-two allocation strategy.
When a large piece of memory is freed,
the memory pages are returned to the free memory pool,
and the address space is returned to the kernel address arena
where it is coalesced with adjacent free pieces.
.PP
Another technique to improve both the efficiency of memory utilization
and the speed of allocation
is to cluster same-sized small allocations on a page.
When a list for a power-of-two allocation is empty,
a new page is allocated and divided into pieces of the needed size.
This strategy speeds future allocations because several pieces of memory
become available as a result of the call into the allocator.
.PP
.FI "Calculation of allocation size"
.so alloc.fig
.Fe
Because the size is not specified when a block of memory is freed,
the allocator must keep track of the sizes of the pieces it has handed out.
The 4.2BSD user-level allocator stores the size of each block
in a header just before the allocation.
However, this strategy doubles the memory requirement for allocations that
require a power-of-two-sized block.
Therefore,
instead of storing the size of each piece of memory with the piece itself,
the size information is associated with the memory page.
\*(Lb shows how the kernel determines
the size of a piece of memory that is being freed,
by
calculating the page in which it resides
and looking up the size associated with that page.
Eliminating the cost of the overhead per piece improved utilization
far more than expected.
The reason is that many allocations in the kernel are for blocks of
memory whose size is exactly a power of two.
These requests would be nearly doubled in size if the user-level
strategy were used.
Now they can be accommodated with no wasted memory.
.PP
The allocator can be called both from the top half of the kernel,
which is willing to wait for memory to become available,
and from the interrupt routines in the bottom half of the kernel,
which cannot wait for memory to become available.
Clients indicate their willingness (and ability) to wait with a flag
to the allocation routine.
For clients that are willing to wait,
the allocator guarantees that their request will succeed.
Thus, these clients need not check the return value from the allocator.
If memory is unavailable and the client cannot wait,
the allocator returns a null pointer.
These clients must be prepared to cope with this
(hopefully infrequent) condition
(usually by giving up and hoping to do better later).
.H 1 "Results of the Implementation"
.PP
The new memory allocator was written about a year ago.
Conversion from the old memory allocators to the new allocator
has been going on ever since.
Many of the special purpose allocators have been eliminated,
including
.RN calloc ,
.RN wmemall ,
and
.RN zmemall .
Many of the special purpose memory allocators built on
top of other allocators have also been eliminated.
For example, the allocator that was built on top of the buffer pool allocator
.RN geteblk
to allocate pathname buffers in
.RN namei
has been eliminated.
Because the typical allocation is so fast,
we have found that none of the special purpose pools are needed.
Indeed, the cost of allocation is about the same as the previous cost of
allocating buffers from the network pool (\fImbuf\fP\^s).
Consequently, applications that used to allocate network
buffers for their
own uses have been switched over to using
the general purpose allocator without increasing their running time.
.PP
Quantifying the performance of the allocator is difficult because
it is hard to measure the amount of time spent allocating
and freeing memory in the kernel.
The usual approach is to compile a kernel for profiling
and then compare the running time of the routines that
implemented the old abstraction versus those that implement the new one.
The old routines are difficult to quantify because
individual routines were used for more than one purpose.
For example, the
.RN geteblk
routine was used both to allocate one kilobyte memory blocks
and for its intended purpose of providing buffers to the filesystem.
Differentiating these uses is often difficult.
To get a measure of the cost of memory allocation before
putting in our new allocator,
we summed up the running time of all the routines whose
exclusive task was memory allocation.
To this total we added the fraction
of the running time of the multi-purpose routines that could
clearly be identified as memory allocation usage.
This number showed that approximately three percent of
the time spent in the kernel could be attributed to memory allocation.
.PP
The new allocator is difficult to measure
because the usual case of the memory allocator is implemented as a macro.
Thus, its running time is a small fraction of the running time of the
numerous routines in the kernel that use it.
To get a bound on the cost,
we changed the macro to always call the memory allocation routine.
Running in this mode, the memory allocator accounted for six percent
of the time spent in the kernel.
Factoring out the cost of the statistics collection and the
subroutine call overhead for the cases that could
normally be handled by the macro,
we estimate that the allocator would account for
at most four percent of time in the kernel.
These measurements show that the new allocator does not introduce
significant new run-time costs.
.PP
The other major success has been in keeping
the size information
on a per-page basis.
This technique allows the most frequently requested sizes to be
allocated without waste.
It also reduces the amount of bookkeeping information associated
with the allocator to four kilobytes of information
per megabyte of memory under management (with a one kilobyte page size).
.H 1 "Future Work"
.PP
Our next project is to convert many of the static
kernel tables to be dynamically allocated.
Static tables include the process table, the file table,
and the mount table.
Making these tables dynamic will have two benefits.
First, it will reduce the amount of memory
that must be statically allocated at boot time.
Second, it will eliminate the arbitrary upper limit imposed
by the current static sizing
(although a limit will be retained to constrain runaway clients).
Other researchers have already shown the memory savings
achieved by this conversion [Rodriguez88].
.PP
Under the current implementation,
memory is never moved from one size list to another.
With the 4.2BSD memory allocator this causes problems,
particularly for large allocations where a process may use
a quarter megabyte piece of memory once,
which is then never available for any other size request.
In our hybrid scheme,
memory can be shuffled between large requests so that large blocks
of memory are never stranded as they are with the 4.2BSD allocator.
However, pages allocated to small requests are allocated once
to a particular size and never changed thereafter.
If a burst of requests came in for a particular size,
that size would acquire a large amount of memory
that would then not be available for other future requests.
.PP
In practice, we do not find that the free lists become too large.
However, we have been investigating ways to handle such problems
should they occur in the future.
Our current investigations involve a routine
that could run as part of the idle loop and would sort the elements
on each of the free lists into order of increasing address.
Since any given page has only one size of elements
allocated from it,
the effect of the sorting would be to sort the list into distinct pages.
When all the pieces of a page became free,
the page itself could be released back to the free pool
so that it could be allocated to another purpose.
Although there is no guarantee that all the pieces of a page would ever
be freed,
most allocations are short-lived,
lasting only for the duration of
an open file descriptor, an open network connection, or a system call.
As new allocations would be made from the page sorted to
the front of the list,
return of elements from pages at the back would eventually
allow pages later in the list to be freed.
.PP
Two of the traditional UNIX
memory allocators remain in the current system.
The terminal subsystem uses \fIclist\fP\^s (character lists).
That part of the system is expected to undergo major revision within
the next year or so,
and it will probably be changed to use
\fImbuf\fP\^s as it is merged into the network system.
The other major allocator that remains is
.RN getblk ,
the routine that manages the filesystem buffer pool memory
and associated control information.
Only the filesystem uses
.RN getblk
in the current system;
it manages the constant-sized buffer pool.
We plan to merge the filesystem buffer cache into the virtual memory system's
page cache in the future.
This change will allow the size of the buffer pool to be changed
according to memory load,
but will require a policy for balancing memory needs
with filesystem cache performance.
.H 1 "Acknowledgments"
.PP
In the spirit of community support,
we have made various versions of our allocator available to our test sites.
They have been busily burning it in and giving
us feedback on their experiences.
We acknowledge their invaluable input.
The feedback from the Usenix program committee on the initial draft of
our paper suggested numerous important improvements.
.H 1 "References"
.LP
.IP Korn85 \w'Rodriguez88\0\0'u
David Korn, Kiem-Phong Vo,
``In Search of a Better Malloc,''
\fIProceedings of the Portland Usenix
Conference\fP,
pp 489-506, June 1985.
.IP McKusick85
M. McKusick, M. Karels, S. Leffler,
``Performance Improvements and Functional Enhancements in 4.3BSD,''
\fIProceedings of the Portland Usenix Conference\fP,
pp 519-531, June 1985.
.IP Rodriguez88
Robert Rodriguez, Matt Koehler, Larry Palmer, Ricky Palmer,
``A Dynamic UNIX Operating System,''
\fIProceedings of the San Francisco Usenix Conference\fP,
June 1988.
.IP Thompson78
Ken Thompson,
``UNIX Implementation,''
\fIBell System Technical Journal\fP, volume 57, number 6,
pp 1931-1946, 1978.
