to disk is the log. Since the log is append-only, modified pages always appear at the end and may be written to disk efficiently in any file system that favors sequential ordering (e.g., FFS, log-structured file system, or an extent-based system).

3.1.2. Concurrency Control

The concurrency control protocol is responsible for maintaining consistency in the presence of multiple accesses. There are several alternative solutions such as locking, optimistic concurrency control [KUNG81], and timestamp ordering [BERN80]. Since optimistic methods and timestamp ordering are generally more complex and restrict concurrency without eliminating starvation or deadlocks, we chose two-phase locking (2PL). Strict 2PL is suboptimal for certain data structures such as B-trees because it can limit concurrency, so we use a special locking protocol based on one described in [LEHM81].

The B-tree locking protocol we implemented releases locks at internal nodes in the tree as it descends. A lock on an internal page is always released before a lock on its child is obtained (that is, locks are not coupled [BAY77] during descent). When a leaf (or internal) page is split, a write lock is acquired on the parent before the lock on the just-split page is released (locks are coupled during ascent). Write locks on internal pages are released immediately after the page is updated, but locks on leaf pages are held until the end of the transaction.

Since locks are released during descent, the structure of the tree may change above a node being used by some process. If that process must later ascend the tree because of a page split, any such change must not cause confusion. We use the technique described in [LEHM81] which exploits the ordering of data on a B-tree page to guarantee that no process ever gets lost as a result of internal page updates made by other processes.

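The lock ordering described above can be illustrated with a small trace. The following is only a sketch under stated assumptions: `LockTable`, `descend`, `propagate_split`, and `end_transaction` are hypothetical names invented for this illustration, not LIBTP functions; the table is single-process and does no blocking; and the [LEHM81] technique for recovering from concurrent structure changes is not modeled. It shows only which locks are held at which point.

```python
class LockTable:
    """Toy single-process lock table that records the order of lock events."""

    def __init__(self):
        self.held = []    # pages currently locked, in acquisition order
        self.events = []  # ("acquire" | "release", page) tuples, in order

    def acquire(self, page):
        self.held.append(page)
        self.events.append(("acquire", page))

    def release(self, page):
        self.held.remove(page)
        self.events.append(("release", page))


def descend(locks, path):
    """Walk root-to-leaf along `path` without lock coupling."""
    *internal, leaf = path
    for page in internal:
        locks.acquire(page)  # lock the page while reading the child pointer
        locks.release(page)  # released before the child is locked
    locks.acquire(leaf)      # leaf lock stays held until the transaction ends


def propagate_split(locks, path, levels):
    """Push a split `levels` pages up from the leaf, coupling locks on ascent."""
    leaf = path[-1]
    child = leaf
    for parent in reversed(path[-1 - levels:-1]):
        locks.acquire(parent)     # parent is locked before the child is let go
        if child != leaf:
            locks.release(child)  # internal page: released once it is updated
        child = parent
    if child != leaf:
        locks.release(child)      # topmost internal page touched by the split


def end_transaction(locks):
    """Release everything still held (in this sketch, only leaf locks)."""
    for page in list(locks.held):
        locks.release(page)
```

Calling `descend` on a three-level path never holds more than one lock at a time, while `propagate_split` briefly holds a parent and its child together. The leaf lock survives both calls and is dropped only by `end_transaction`.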
If a transaction that updates a B-tree aborts, the user-visible changes to the tree must be rolled back. However, changes to the internal nodes of the tree need not be rolled back, since these pages contain no user-visible data. When rolling back a transaction, we roll back all leaf page updates, but no internal insertions or page splits. In the worst case, this will leave a leaf page less than half full. This may cause poor space utilization, but does not lose user data.

Holding locks on leaf pages until transaction commit guarantees that no other process can insert or delete data that has been touched by this process. Rolling back insertions and deletions on leaf pages guarantees that no aborted updates are ever visible to other transactions. Leaving page splits intact permits us to release internal write locks early. Thus transaction semantics are preserved, and locks are held for shorter periods.

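The rollback rule (undo leaf-level inserts and deletes, leave splits in place) can be sketched with a toy leaf level. Everything here is an assumption made for illustration: the `Leaves` class, the tuple log format, and `rollback` are hypothetical and do not reflect LIBTP's actual log records; internal pages are reduced to a list of low keys; and inserted keys are assumed to be new, so undoing an insert is a plain removal.

```python
import bisect


class Leaves:
    """Toy B-tree leaf level: ordered leaf pages plus the low key of each."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lows = [""]   # low key of each leaf; "" sorts before every key
        self.pages = [{}]  # one dict of key -> value per leaf page

    def index(self, key):
        """Position of the leaf whose key range covers `key`."""
        return bisect.bisect_right(self.lows, key) - 1

    def insert(self, key, value, log=None):
        i = self.index(key)
        self.pages[i][key] = value  # assumes `key` was not already present
        if log is not None:
            log.append(("insert", key, None))
        if len(self.pages[i]) > self.capacity:
            self._split(i, log)

    def delete(self, key, log=None):
        old = self.pages[self.index(key)].pop(key)
        if log is not None:
            log.append(("delete", key, old))

    def _split(self, i, log):
        keys = sorted(self.pages[i])
        mid = len(keys) // 2
        right = {k: self.pages[i].pop(k) for k in keys[mid:]}
        self.lows.insert(i + 1, keys[mid])
        self.pages.insert(i + 1, right)
        if log is not None:
            log.append(("split", keys[mid], None))


def rollback(leaves, log):
    """Undo leaf updates newest-first; structural records are skipped."""
    for kind, key, old in reversed(log):
        if kind == "insert":
            leaves.pages[leaves.index(key)].pop(key)
        elif kind == "delete":
            leaves.pages[leaves.index(key)][key] = old
        # "split" records are deliberately left in place
```

In this sketch each undo step looks the key up again through `index`, because a split that happened after the update may have moved the key to a different page. After an abort the user-visible keys match the pre-transaction state, while the extra page created by the split remains and may be sparsely filled.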
The extra complexity introduced by this locking protocol appears substantial, but it is important for multi-user execution. The benefits of non-two-phase locking on B-trees are well established in the database literature [BAY77], [LEHM81]. If a process held locks until it committed, then a long-running update could lock out all other transactions by preventing any other process from locking the root page of the tree. The B-tree locking protocol described above guarantees that locks on internal pages are held for extremely short periods, thereby increasing concurrency.

3.1.3. Management of Shared Data

Database systems permit many users to examine and update the same data concurrently. In order to provide this concurrent access and enforce the write-ahead logging protocol described in section 3.1.1, we use a shared memory buffer manager. Not only does this provide the guarantees we require, but a user-level buffer manager is frequently faster than using the file system buffer cache. Reads or writes involving the file system buffer cache often require copying data between user and kernel space while a user-level buffer manager can return pointers to data pages directly. Additionally, if more than one process uses the same page, then fewer copies may be required.

3.2. Module Architecture

The preceding sections described modules for managing the transaction log, locks, and a cache of shared buffers. In addition, we need to provide functionality for transaction begin, commit, and abort processing, necessitating a transaction manager. In order to arbitrate concurrent access to locks and buffers, we include a process management module which manages a collection of semaphores used to block and release processes. Finally, in order to provide a simple, standard interface we have modified the database access routines (db(3)). For the purposes of this paper we call the modified package the Record Manager. Figure one shows the main interfaces and architecture of LIBTP.

[Figure one: architecture diagram. Box titles recoverable from the original include Txn Manager and Record Manager; other labels: lock, log, log_commit, log_unroll, unlock_all, buf_get, buf_pin, buf_unpin, sleep_on, wake.]
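The buffer manager's key property from section 3.1.3, handing callers a reference to the cached page instead of a copy, can be sketched as follows. This is a minimal illustration under assumptions: `BufferPool`, `get`, and `unpin` are hypothetical names chosen here (the text names buf_get, buf_pin, and buf_unpin but does not give their signatures), the pool lives in one process rather than in shared memory, and dirty-page write-back and write-ahead log enforcement are omitted.

```python
class BufferPool:
    """Toy page cache that returns the cached buffer itself and tracks pins."""

    def __init__(self, capacity, read_page):
        self.capacity = capacity
        self.read_page = read_page  # callable: page_no -> bytes (stands in for disk I/O)
        self.frames = {}            # page_no -> bytearray holding the page
        self.pins = {}              # page_no -> number of outstanding pins

    def get(self, page_no):
        """Pin the page and return its buffer; every caller gets the same object."""
        if page_no not in self.frames:
            if len(self.frames) >= self.capacity:
                self._evict()
            self.frames[page_no] = bytearray(self.read_page(page_no))
            self.pins[page_no] = 0
        self.pins[page_no] += 1
        return self.frames[page_no]

    def unpin(self, page_no):
        """Drop one pin; a page with zero pins becomes eligible for eviction."""
        if self.pins.get(page_no, 0) <= 0:
            raise ValueError(f"page {page_no} is not pinned")
        self.pins[page_no] -= 1

    def _evict(self):
        """Discard any unpinned page (write-back of dirty pages is omitted)."""
        victim = next((p for p, n in self.pins.items() if n == 0), None)
        if victim is None:
            raise RuntimeError("all buffers are pinned")
        del self.frames[victim]
        del self.pins[victim]
```

Two callers that request the same page share one buffer, so a change made through one reference is visible through the other and the backing store is read only once. A pinned page is never chosen for eviction, which is why every `get` in this sketch needs a matching `unpin`.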