( bus balance has gone largely)5 1517(utilization and)1 643 2 3110 4902 t
(unrecognized.)3110 5022 w
11 B f
( Address Distribution)2 1069( of Buffer)2 501(4.1. Impact)1 590 3 3110 5334 t
(on Cache Utilization)2 976 1 3110 5454 t
10 R f
( can)1 205(The address distribution of mid-size buffers)5 1955 2 3110 5616 t
( par-)1 196( In)1 152(affect the system's overall cache utilization.)5 1812 3 3110 5736 t
( where all buffers)3 736(ticular, power-of-two allocators \320)3 1424 2 3110 5856 t
(are 2)1 227 1 3110 5976 t
8 I f
(n)3337 5926 w
10 R f
(bytes and are 2)3 691 1 3433 5976 t
8 I f
(n)4124 5926 w
10 R f
(-byte aligned \320 are pes-)4 1106 1 4164 5976 t
( that every inode)3 794( for example,)2 611(simal.* Suppose,)1 755 3 3110 6096 t
(\()3110 6216 w
10 S f
()3143 6216 w
cleartomark restore
%%BeginGlobal
%ident	"@(#)lp:filter/postscript/font/devpost/charlib/~=	1.2"
/build_~= {
    pop
    (\176) stringwidth pop neg size -.15 mul (\176\055) ashow
} def
%%EndGlobal
save mark
10 S f
3143 6216 m
55 build_~=
3198 6216 m
10 R f
(300 bytes\) is assigned a 512-byte buffer, 512-byte)7 2072 1 3198 6216 t
(aligned, and that only the \256rst dozen \256elds of an)9 2160 1 3110 6336 t
( Then)1 296(inode \(48 bytes\) are frequently referenced.)5 1864 2 3110 6456 t
( traf\256c will be)3 605(the majority of inode-related memory)4 1555 2 3110 6576 t
8 S1 f
(____________________________________)3110 6672 w
8 R f
( because they are easy to)5 897(* Such allocators are common)4 1047 2 3326 6792 t
( SVr4 both employ)3 742( example, 4.4BSD and)3 856(implement. For)1 562 3 3110 6888 t
(power-of-two methods [McKusick88, Lee89].)3 1488 1 3110 6984 t
cleartomark
showpage
restore
%%EndPage: 6 6
%%Page: 7 7
save
mark
7 pagesetup
10 R f
( Thus)1 279( 0 and 47 modulo 512.)5 1005(at addresses between)2 876 3 590 696 t
( 512-byte boundaries will be)4 1275(the cache lines near)3 885 2 590 816 t
( effect)1 275( In)1 165( the rest lie fallow.)4 842(heavily loaded while)2 878 4 590 936 t
( \(48/512\) of the cache will be usable by)8 1796(only 9%)1 364 2 590 1056 t
( would not suffer)3 781( caches)1 321(inodes. Fully-associative)1 1058 3 590 1176 t
( trends are toward)3 741(this problem, but current hardware)4 1419 2 590 1296 t
(simpler rather than more complex caches.)5 1708 1 590 1416 t
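% Editorial note: worked check of the 9% figure above (illustrative arithmetic,
% not part of the original text). It assumes, as in the example, 512-byte
% buffers that are 512-byte aligned, with only the first 48 bytes of each
% inode frequently referenced:
%   hot addresses   = 0 .. 47 modulo 512
%   usable fraction = 48 / 512 = 0.09375, i.e. roughly 9% of the cache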
( there's nothing special about)4 1432(Of course,)1 478 2 840 1578 t
( many other mid-size)3 936( kernel contains)2 694(inodes. The)1 530 3 590 1698 t
( bytes\) with the same)4 949(data structures \(e.g. 100-500)3 1211 2 590 1818 t
(essential qualities: there are many of them, they)7 2160 1 590 1938 t
( heavily used \256elds, and those)5 1339(contain only a few)3 821 2 590 2058 t
(\256elds are grouped together at or near the beginning)8 2160 1 590 2178 t
( artifact of the way data struc-)6 1262( This)1 247(of the structure.)2 651 3 590 2298 t
( has not previously been recognized as)6 1655(tures evolve)1 505 2 590 2418 t
(an important factor in allocator design.)5 1593 1 590 2538 t
11 B f
( Address Distribution)2 1069( of Buffer)2 501(4.2. Impact)1 590 3 590 2850 t
(on Bus Balance)2 741 1 590 2970 t
10 R f
( across multi-)2 566(On a machine that interleaves memory)5 1594 2 590 3132 t
( effects described above also)4 1286(ple main buses, the)3 874 2 590 3252 t
( The)1 243( bus utilization.)2 680(have a signi\256cant impact on)4 1237 3 590 3372 t
( for example, employs 256-byte)4 1333(SPARCcenter 2000,)1 827 2 590 3492 t
( main buses [Cekleov92].)3 1150(interleaving across two)2 1010 2 590 3612 t
( example above, we see that any)6 1522(Continuing the)1 638 2 590 3732 t
(power-of-two allocator maps the \256rst half of every)7 2160 1 590 3852 t
( and the second half to)5 966(inode \(the hot part\) to bus 0)6 1194 2 590 3972 t
( almost all inode-related cache misses)5 1621( Thus)1 279(bus 1.)1 260 3 590 4092 t
( is exacerbated)2 620( situation)1 385( The)1 228(are serviced by bus 0.)4 927 4 590 4212 t
( since all of the inodes are)6 1144(by an in\257ated miss rate,)4 1016 2 590 4332 t
(\256ghting over a small fraction of the cache.)7 1747 1 590 4452 t
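% Editorial note: sketch of the bus mapping described above. The selection
% rule bus = (addr div 256) mod 2 is an assumption about how 256-byte
% interleaving across two buses picks a bus; the text states only the
% resulting mapping.
%   offsets   0 .. 255 of a 512-byte-aligned buffer -> bus 0 (holds hot bytes 0 .. 47)
%   offsets 256 .. 511 of the same buffer           -> bus 1
% Under that assumption every hot inode field lands on bus 0.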
( a)1 145( On)1 256(These effects can be dramatic.)4 1509 3 840 4614 t
( running LADDIS under a)4 1308(SPARCcenter 2000)1 852 2 590 4734 t
( kernel, replacing the old)4 1123(SunOS 5.4 development)2 1037 2 590 4854 t
( power-of-two buddy-system [Lee89]\))3 1664(allocator \(a)1 496 2 590 4974 t
( allocator reduced bus imbalance from)5 1617(with the slab)2 543 2 590 5094 t
( the primary cache)3 817( addition,)1 405( In)1 168(43% to just 17%.)3 770 4 590 5214 t
(miss rate dropped by 13%.)4 1100 1 590 5334 t
11 B f
( Coloring)1 452(4.3. Slab)1 448 2 590 5634 t
10 R f
( incorporates a simple coloring)4 1377(The slab allocator)2 783 2 590 5796 t
( buffers evenly throughout)3 1170(scheme that distributes)2 990 2 590 5916 t
( utilization)1 468(the cache, resulting in excellent cache)5 1692 2 590 6036 t
( concept is simple: each time)5 1237( The)1 229(and bus balance.)2 694 3 590 6156 t
( buffer addresses start at a)5 1103(a new slab is created, the)5 1057 2 590 6276 t
(slightly different offset \(color\) from the slab base)7 2160 1 590 6396 t
( example, for a)3 640( For)1 211( page-aligned\).)1 612(\(which is always)2 697 4 590 6516 t
( objects with 8-byte alignment, the)5 1428(cache of 200-byte)2 732 2 590 6636 t
( 200,)1 231(\256rst slab's buffers would be at addresses 0,)7 1929 2 590 6756 t
( next slab's)2 505( The)1 238( base.)1 252(400, ... relative to the slab)5 1165 4 590 6876 t
( ... and so)3 449(buffers would be at offsets 8, 208, 408,)7 1711 2 590 6996 t
( maximum slab color is determined by the)7 1804(on. The)1 356 2 3110 696 t
( this exam-)2 478( In)1 158( unused space in the slab.)5 1100(amount of)1 424 4 3110 816 t
( can \256t 20 200-byte)4 909(ple, assuming 4K pages, we)4 1251 2 3110 936 t
( buffers consume)2 735( The)1 237( slab.)1 235(buffers in a 4096-byte)3 953 4 3110 1056 t
(4000 bytes, the)2 684 1 3110 1176 t
10 CW f
(kmem_slab)3884 1176 w
10 R f
(data consumes 32)2 784 1 4486 1176 t
( for)1 160(bytes, and the remaining 64 bytes are available)7 2000 2 3110 1296 t
( slab color is 64, and)5 912( the maximum)2 614(coloring. Thus)1 634 3 3110 1416 t
( 32, 40, 48,)3 504(the slab color sequence is 0, 8, 16, 24,)8 1656 2 3110 1536 t
(56, 64, 0, 8, ...)4 607 1 3110 1656 t
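% Editorial note: worked arithmetic for the coloring example above
% (illustrative; every figure is taken from the text):
%   usable space = 4096 - 32 (kmem_slab data) = 4064 bytes
%   buffers      = floor(4064 / 200) = 20, consuming 20 * 200 = 4000 bytes
%   unused space = 4064 - 4000 = 64 bytes = maximum slab color
%   colors       = 0, 8, 16, ..., 64 in steps of the 8-byte alignment
%                  (9 distinct offsets before the sequence wraps to 0)
%   at color 64 the last buffer ends at 64 + 4000 = 4064, still within
%   the 4064 usable bytes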
( of this coloring)3 672(One particularly nice property)3 1238 2 3360 1818 t
(scheme is that mid-size power-of-two buffers)5 2160 1 3110 1938 t
( of coloring, since)3 847(receive the maximum amount)3 1313 2 3110 2058 t
( example, while 128)3 870( For)1 217(they are the worst-\256tting.)3 1073 3 3110 2178 t
(bytes goes perfectly into 4096, it goes near-)7 2160 1 3110 2298 t
( is what's actually)3 769(pessimally into 4096 - 32, which)5 1391 2 3110 2418 t
(available \(because of the embedded slab data\).)6 1906 1 3110 2538 t
11 B f
( Management)1 651(4.4. Arena)1 530 2 3110 2838 t
10 R f
( arena management strategy deter-)4 1549(An allocator's)1 611 2 3110 3000 t
( strategies)1 415( These)1 310(mines its dynamic cache footprint.)4 1435 3 3110 3120 t
( sequential-\256t)1 647( broad categories:)2 843(fall into three)2 670 3 3110 3240 t
( segregated-storage)1 819(methods, buddy methods, and)3 1341 2 3110 3360 t
(methods [Standish80].)1 908 1 3110 3480 t
( must typically search)3 901(A sequential-\256t allocator)2 1009 2 3360 3642 t
( Such)1 289( to \256nd a good-\256tting buffer.)5 1299(several nodes)1 572 3 3110 3762 t
( nature, condemned to a large cache)6 1503(methods are, by)2 657 2 3110 3882 t
( to examine a signi\256cant number)5 1352(footprint: they have)2 808 2 3110 4002 t
( nowhere near each other.)4 1066(of nodes that are generally)4 1094 2 3110 4122 t
( only cache misses, but TLB misses)6 1514(This causes not)2 646 2 3110 4242 t
( coalescing stages of buddy-system)4 1565( The)1 252(as well.)1 343 3 3110 4362 t
(allocators [Knuth68, Lee89] have similar properties.)5 2133 1 3110 4482 t
( allocator, such as the)4 1026(A segregated-storage)1 884 2 3360 4644 t
( for dif-)2 370(slab allocator, maintains separate freelists)4 1790 2 3110 4764 t
( generally have)2 643( allocators)1 429( These)1 312(ferent buffer sizes.)2 776 4 3110 4884 t
( because allocating a buffer is so)6 1362(good cache locality)2 798 2 3110 5004 t
( do is determine the)4 840( the allocator has to)4 830(simple. All)1 490 3 3110 5124 t
( or)1 133(right freelist \(by computation, by table lookup,)6 2027 2 3110 5244 t
( take a)2 312(by having it supplied as an argument\) and)7 1848 2 3110 5364 t
( similarly)1 440( a buffer is)3 609( Freeing)1 423(buffer from it.)2 688 4 3110 5484 t
( handful of)2 564( are only a)3 604(straightforward. There)1 992 3 3110 5604 t
(pointers to load, so the cache footprint is small.)8 1957 1 3110 5724 t
( advan-)1 325(The slab allocator has the additional)5 1585 2 3360 5886 t
( mid-size buffers, most of the)5 1261(tage that for small to)4 899 2 3110 6006 t
( bufctls, and)2 543(relevant information \320 the slab data,)5 1617 2 3110 6126 t
( page.)1 278(buffers themselves \320 resides on a single)6 1882 2 3110 6246 t
(Thus a single TLB entry covers most of the action.)9 2103 1 3110 6366 t
cleartomark
showpage
restore
%%EndPage: 7 7
%%Page: 8 8
save
mark
8 pagesetup
11 B f
(5. Performance)1 761 1 590 696 t
10 R f
( performance of the slab)4 1057(This section compares the)3 1103 2 590 858 t
(allocator to three other well-known kernel memory)6 2160 1 590 978 t
(allocators:)590 1098 w
10 B f
(SunOS 4.1.3)1 584 1 790 1260 t
10 R f
(, based on [Stephenson83], a)4 1376 1 1374 1260 t
(sequential-\256t method;)1 883 1 790 1380 t
10 B f
(4.4BSD)790 1542 w
10 R f
( on [McKusick88], a power-of-)4 1341(, based)1 299 2 1110 1542 t
(two segregated-storage method;)2 1291 1 790 1662 t
10 B f
(SVr4)790 1824 w
10 R f
( on [Lee89], a power-of-two)4 1396(, based)1 342 2 1012 1824 t
( was)1 244( allocator)1 439( This)1 301(buddy-system method.)1 976 4 790 1944 t
(employed in all previous)3 1015 1 790 2064 t
10 B f
(SunOS 5.x)1 460 1 1838 2064 t
10 R f
(releases.)2331 2064 w
( each of these allocators)4 1058(To get a fair comparison,)4 1102 2 590 2226 t
( the same SunOS 5.4 base system.)6 1498(was ported into)2 662 2 590 2346 t
( comparing just allocators,)3 1120(This ensures that we are)4 1040 2 590 2466 t
(not entire operating systems.)3 1173 1 590 2586 t
11 B f
( Comparison)1 617(5.1. Speed)1 520 2 590 2886 t
10 R f
( the time required to allocate)5 1240(On a SPARCstation-2)2 920 2 590 3048 t
( is as)2 236(and free a buffer under the various allocators)7 1924 2 590 3168 t
(follows:)590 3288 w
10 S f
(_ ___________________________________________)1 2157 1 590 3350 t
10 R f
(Memory Allocation + Free Costs)4 1354 1 991 3470 t
10 S f
(_ ___________________________________________)1 2157 1 590 3480 t
(_ ___________________________________________)1 2157 1 590 3500 t
10 R f
( \()1 66(allocator time)1 848 2 641 3610 t
10 S f
(m)1555 3610 w
10 R f
(sec\) interface)1 661 1 1613 3610 t
10 S f
(_ ___________________________________________)1 2157 1 590 3630 t
10 R f
( kmem)1 547(slab 3.8)1 988 2 641 3750 t
10 S f
(_)2176 3750 w
10 R f
(cache)2226 3750 w
10 S f
(_)2452 3750 w
10 R f
(alloc)2502 3750 w
( kmem)1 547(4.4BSD 4.1)1 988 2 641 3870 t
10 S f
(_)2176 3870 w
10 R f
(alloc)2226 3870 w
( kmem)1 547(slab 4.7)1 988 2 641 3990 t
10 S f
(_)2176 3990 w
10 R f
(alloc)2226 3990 w
( kmem)1 547(SVr4 9.4)1 988 2 641 4110 t
10 S f
(_)2176 4110 w
10 R f
(alloc)2226 4110 w
( kmem)1 547( 25.0)1 471(SunOS 4.1.3)1 517 3 641 4230 t
10 S f
(_)2176 4230 w
10 R f
(alloc)2226 4230 w
10 S f
( \347)1 -2157(_ ___________________________________________)1 2157 2 590 4250 t
(\347)590 4150 w
(\347)590 4050 w
(\347)590 3950 w
(\347)590 3850 w
(\347)590 3750 w
(\347)590 3650 w
(\347)590 3550 w
(\347)590 3450 w
(\347)1234 4250 w
(\347)1234 4200 w
(\347)1234 4100 w
(\347)1234 4000 w
(\347)1234 3900 w
(\347)1234 3800 w
(\347)1234 3700 w
(\347)1234 3600 w
(\347)1849 4250 w
(\347)1849 4200 w
(\347)1849 4100 w
(\347)1849 4000 w
(\347)1849 3900 w
(\347)1849 3800 w
(\347)1849 3700 w
(\347)1849 3600 w
(\347)2747 4250 w
(\347)2747 4150 w
(\347)2747 4050 w
(\347)2747 3950 w
(\347)2747 3850 w
(\347)2747 3750 w
(\347)2747 3650 w
(\347)2747 3550 w
(\347)2747 3450 w
10 R f
( 4.4BSD allocator offers both functional)5 1734(Note: The)1 426 2 590 4454 t
( measure-)1 407( These)1 313(and preprocessor macro interfaces.)3 1440 3 590 4574 t
( Non-binary)1 551( functional version.)2 840(ments are for the)3 769 3 590 4694 t
( since)1 274(interfaces in general were not considered,)5 1886 2 590 4814 t
( to drivers without expos-)4 1111(these cannot be exported)3 1049 2 590 4934 t
( was)1 195( 4.4BSD allocator)2 751( The)1 229(ing the implementation.)2 985 4 590 5054 t
(compiled)590 5174 w
10 I f
(without)1006 5174 w
10 CW f
(KMEMSTATS)1378 5174 w
10 R f
( by)1 145(de\256ned \(it's on)2 643 2 1962 5174 t
(default\) to get the fastest possible code.)6 1626 1 590 5294 t
(A)840 5456 w
10 CW f
(mutex_enter\(\)/mutex_exit\(\))987 5456 w
10 R f
(pair)2595 5456 w
(costs 1.0)1 375 1 590 5576 t
10 S f
(m)1015 5576 w
10 R f
(sec, so the locking required to allocate)6 1677 1 1073 5576 t
( a lower bound of 2.0)5 1019(and free a buffer imposes)4 1141 2 590 5696 t
10 S f
(m)590 5816 w
10 R f
( slab and 4.4BSD allocators are both very)7 1727(sec. The)1 375 2 648 5816 t
( little work)2 477(close to this limit because they do very)7 1683 2 590 5936 t
( implementation)1 667( 4.4BSD)1 360( The)1 228(in the common cases.)3 905 4 590 6056 t
(of)590 6176 w
10 CW f
(kmem_alloc\(\))746 6176 w
10 R f
(is slightly faster, since it has)5 1238 1 1512 6176 t
( reclaims memory\).)2 829(less accounting to do \(it never)5 1331 2 590 6296 t
(The slab allocator's)2 947 1 590 6416 t
10 CW f
(kmem_cache_alloc\(\))1670 6416 w
10 R f
( it doesn't)2 428(interface is even faster, however, because)5 1732 2 590 6536 t
( to use \320)3 452(have to determine which freelist \(cache\))5 1708 2 590 6656 t
(the cache descriptor is passed as an argument to)8 2160 1 590 6776 t
10 CW f
(kmem_cache_alloc\(\))590 6896 w
10 R f
( event, the differ-)3 725( any)1 179(. In)1 176 3 1670 6896 t
( between the slab and 4.4BSD)5 1474(ences in speed)2 686 2 590 7016 t
( is to be expected, since)5 1058( This)1 258( small.)1 289(allocators are)1 555 4 3110 696 t
( are operationally)2 810(all segregated-storage methods)2 1350 2 3110 816 t
( good segregated-storage implementa-)3 1602(similar. Any)1 558 2 3110 936 t
(tion should achieve excellent performance.)4 1747 1 3110 1056 t
(The SVr4 allo
