y read operation will need to visit the disk in the steady state. The programmer declares the size of the cache region at startup.

Finally, many operating systems provide memory-mapped file services that are much faster than their general-purpose file system interfaces. Berkeley DB can memory-map its database files for read-only database use. The application operates on records stored directly on the pages, with no cache management overhead. Because the application gets pointers directly into the Berkeley DB pages, writes cannot be permitted. Otherwise, changes could bypass the locking and logging systems, and software errors could corrupt the database. Read-only applications can use Berkeley DB's memory-mapped file service to improve performance on most architectures.

3.6. Configurable page size

Programmers declare the size of the pages used by their access methods when they create a database. Although Berkeley DB provides reasonable defaults, developers may override them to control system performance. Small pages reduce the number of records that fit on a single page. Fewer records on a page means that fewer records are locked when the page is locked, improving concurrency. The per-page overhead is proportionally higher with smaller pages, of course, but developers can trade off space for time as an application requires.

3.7. Small footprint

Berkeley DB is a compact system. The full package, including all access methods, recoverability, and transaction support is roughly 175K of text space on common architectures.

3.8. Cursors

In database terminology, a cursor is a pointer into an access method that can be called iteratively to return records in sequence. Berkeley DB includes cursor interfaces for all access methods. This permits, for example, users to traverse a B+tree and view records in order. Pointers to records in cursors are persistent, so that once fetched, a record may be updated in place. Finally, cursors support access to chains of duplicate data items in the various access methods.

3.9. Joins

In database terminology, a join is an operation that spans multiple separate tables (or in the case of Berkeley DB, multiple separate DB files). For example, a company may store information about its customers in one table and information about sales in another. An application will likely want to look up sales information by customer name; this requires matching records in the two tables that share a common customer ID field. This combining of records from multiple tables is called a join.

Berkeley DB includes interfaces for joining two or more tables.

3.10. Transactions

Transactions have four properties [Gray93]:

• They are atomic. That is, all of the changes made in a single transaction must be applied at the same instant or not at all. This permits, for example, the transfer of money between two accounts to be accomplished, by making the reduction of the balance in one account and the increase in the other into a single, atomic action.

• They must be consistent. That is, changes to the database by any transaction cannot leave the database in an illegal or corrupt state.

• They must be isolatable. Regardless of the number of users working in the database at the same time, every user must have the illusion that no other activity is going on.

• They must be durable. Even if the disk that stores the database is lost, it must be possible to recover the database to its last transaction-consistent state.

This combination of properties (atomicity, consistency, isolation, and durability) is referred to as ACIDity in the literature. Berkeley DB, like most database systems, provides ACIDity using a collection of core services.

Programmers can choose to use Berkeley DB's transaction services for applications that need them.

3.10.1. Write-ahead logging

Programmers can enable the logging system when they start up Berkeley DB. During a transaction, the application makes a series of changes to the database. Each change is captured in a log entry, which holds the state of the database record both before and after the change. The log record is guaranteed to be flushed to stable storage before any of the changed data pages are written. This behavior, writing the log before the data pages, is called write-ahead logging.

At any time during the transaction, the application can commit, making the changes permanent, or roll back, cancelling all changes and restoring the database to its pre-transaction state. If the application rolls back the transaction, then the log holds the state of all changed pages prior to the transaction, and Berkeley DB simply restores that state. If the application commits the transaction, Berkeley DB writes the log records to disk. In-memory copies of the data pages already reflect the changes, and will be flushed as necessary during normal processing. Since log writes are sequential, but data page writes are random, this improves performance.

3.10.2. Crashes and recovery

Berkeley DB's write-ahead log is used by the transaction system to commit or roll back transactions. It also gives the recovery system the information that it needs to protect against data loss or corruption from crashes. Berkeley DB is able to survive application crashes, system crashes, and even catastrophic failures like the loss of a hard disk, without losing any data.

Surviving crashes requires data stored in several different places. During normal processing, Berkeley DB has copies of active log records and recently-used data pages in memory. Log records are flushed to the log disk when transactions commit. Data pages trickle out to the data disk as pages move through the buffer cache. Periodically, the system administrator backs up the data disk, creating a safe copy of the database at a particular instant. When the database is backed up, the log can be truncated. For maximum robustness, the log disk and data disk should be separate devices.

Different system failures can destroy memory, the log disk, or the data disk. Berkeley DB is able to survive the loss of any one of these repositories without losing any committed transactions.

If the computer's memory is lost, through an application or operating system crash, then the log holds all committed transactions. On restart, the recovery system rolls the log forward against the database, reapplying any changes to on-disk pages that were in memory at the time of the crash. Since the log contains pre- and post-change state for transactions, the recovery system also uses the log to restore any pages to their original state if they were modified by transactions that never committed.

If the data disk is lost, the system administrator can restore the most recent copy from backup. The recovery system will roll the entire log forward against the original database, reapplying all committed changes. When it finishes, the database will contain every change made by every transaction that ever committed.

If the log disk is lost, then the recovery system can use the in-memory copies of log entries to roll back any uncommitted transactions, flush all in-memory database pages to the data disk, and shut down gracefully. At that point, the system administrator can back up the database disk, install a new log disk, and restart the system.
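
The write-ahead rule and the memory-loss recovery case described in 3.10.1 and 3.10.2 can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Berkeley DB's implementation or API: the names ToyStore and LogRecord are hypothetical, records stand in for pages, plain dictionaries and lists stand in for the data disk and log disk, and locking is assumed to keep an uncommitted transaction from interleaving updates to a record with a committed one.

```python
from dataclasses import dataclass


@dataclass
class LogRecord:
    """Hypothetical log entry holding before and after state of one record."""
    txn: int
    kind: str              # "update" or "commit"
    key: str = ""
    before: object = None  # state before the change (None means absent)
    after: object = None   # state after the change


class ToyStore:
    """Toy model: disk and log survive a crash; cache and log_buffer do not."""

    def __init__(self, initial):
        self.disk = dict(initial)  # data pages on the data disk
        self.log = []              # log records on the log disk
        self.cache = {}            # in-memory copies of changed records
        self.log_buffer = []       # in-memory log records not yet flushed

    def update(self, txn, key, value):
        before = self.cache.get(key, self.disk.get(key))
        self.log_buffer.append(LogRecord(txn, "update", key, before, value))
        self.cache[key] = value

    def flush_log(self):
        self.log.extend(self.log_buffer)
        self.log_buffer.clear()

    def commit(self, txn):
        # Commit forces only the (sequential) log; data pages stay cached.
        self.log_buffer.append(LogRecord(txn, "commit"))
        self.flush_log()

    def write_page(self, key):
        # Write-ahead rule: the log is flushed before the data page is written.
        self.flush_log()
        self.disk[key] = self.cache[key]

    def crash(self):
        # Loss of memory: everything not on the data disk or log disk is gone.
        self.cache.clear()
        self.log_buffer.clear()

    def _apply(self, key, value):
        if value is None:
            self.disk.pop(key, None)
        else:
            self.disk[key] = value

    def recover(self):
        committed = {r.txn for r in self.log if r.kind == "commit"}
        # Roll forward: reapply the after state of committed changes.
        for r in self.log:
            if r.kind == "update" and r.txn in committed:
                self._apply(r.key, r.after)
        # Roll back: restore the before state of uncommitted changes,
        # newest first.
        for r in reversed(self.log):
            if r.kind == "update" and r.txn not in committed:
                self._apply(r.key, r.before)


store = ToyStore({"alice": 100, "bob": 50})

# Transaction 1 commits a transfer; its data pages never reach the data disk.
store.update(1, "alice", 70)
store.update(1, "bob", 80)
store.commit(1)

# Transaction 2 never commits, but one of its dirty pages is written out.
store.update(2, "alice", 0)
store.write_page("alice")

store.crash()
store.recover()
print(store.disk)  # {'alice': 70, 'bob': 80}
```

After recovery the committed transfer is present even though its pages were only in memory, and the uncommitted change that had reached the data disk is undone. The sketch covers only the loss of memory; the data-disk and log-disk cases described above would additionally need a backup copy and a graceful shutdown path, which are not modeled here.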