📄 chap19.htm

📁 美国麻省理工的算法导论,英文版,看后程序员宏观的能力培养有好处
💻 HTM
📖 第 1 页 / 共 4 页
字号:
Show all legal B-trees of minimum degree 2 that represent {1, 2, 3, 4, 5}.<P>
<a name="0857_162a">19.1-4<a name="0857_162a"><P>
Derive a tight upper bound on the number of keys that can be stored in a B-tree of height <I>h</I> as a function of the minimum degree <I>t</I>.<P>
<a name="0857_162b">19.1-5<a name="0857_162b"><P>
<a name="0857_1624"><a name="0857_1625">Describe the data structure that would result if each black node in a red-black tree were to absorb its red children, incorporating their children with its own.<P>
<P>


<P>







<h1><a name="0858_0001">19.2 Basic operations on B-trees<a name="0858_0001"></h1><P>
In this section, we present the details of the operations B-<FONT FACE="Courier New" SIZE=2>TREE</FONT>-<FONT FACE="Courier New" SIZE=2>SEARCH</FONT>, B-<FONT FACE="Courier New" SIZE=2>TREE</FONT>-<FONT FACE="Courier New" SIZE=2>CREATE</FONT>, and B-<FONT FACE="Courier New" SIZE=2>TREE</FONT>-<FONT FACE="Courier New" SIZE=2>INSERT</FONT>. In these procedures, we adopt two conventions:<P>
<FONT FACE="Courier New" SIZE=2><img src="../images/dot12.gif"></FONT>     The root of the B-tree is always in main memory, so that a <FONT FACE="Courier New" SIZE=2>DISK</FONT>-<FONT FACE="Courier New" SIZE=2>READ</FONT> on the root is never required; a <FONT FACE="Courier New" SIZE=2>DISK</FONT>-<FONT FACE="Courier New" SIZE=2>WRITE</FONT> of the root is required, however, whenever the root node is changed.<P>
<FONT FACE="Courier New" SIZE=2><img src="../images/dot12.gif"></FONT>     Any nodes that are passed as parameters must already have had a <FONT FACE="Courier New" SIZE=2>DISK</FONT>-<FONT FACE="Courier New" SIZE=2>READ</FONT> operation performed on them.<P>
The procedures we present are all &quot;one-pass&quot; algorithms that proceed downward from the root of the tree, without having to back up.<P>





<h2>Searching a B-tree</h2><P>
<a name="0859_1626"><a name="0859_1627">Searching a B-tree is much like searching a binary search tree, except that instead of making a binary, or &quot;two-way,&quot; branching decision at each node, we make a multiway branching decision according to the number of the node's children. More precisely, at each internal node <I>x</I>, we make an (<I>n</I>[<I>x</I>] + 1)-way branching decision.<P>
B-<FONT FACE="Courier New" SIZE=2>TREE</FONT>-<FONT FACE="Courier New" SIZE=2>SEARCH</FONT> is a straightforward generalization of the <FONT FACE="Courier New" SIZE=2>TREE</FONT>-<FONT FACE="Courier New" SIZE=2>SEARCH</FONT> procedure defined for binary search trees. B-<FONT FACE="Courier New" SIZE=2>TREE</FONT>-<FONT FACE="Courier New" SIZE=2>SEARCH</FONT> takes as input a pointer to the root node <I>x</I> of a subtree and a key <I>k</I> to be searched for in that subtree. The top-level call is thus of the form B-<FONT FACE="Courier New" SIZE=2>TREE</FONT>-<FONT FACE="Courier New" SIZE=2>SEARCH</FONT>(<I>root</I>[<I>T</I>], <I>k</I>). If <I>k</I> is in the B-tree, B-<FONT FACE="Courier New" SIZE=2>TREE</FONT>-<FONT FACE="Courier New" SIZE=2>SEARCH</FONT> returns the ordered pair (<I>y,i</I>) consisting of a node <I>y</I> and an index <I>i</I> such that <I>key<SUB>i</I></SUB>[<I>y</I>]<I> = k</I>. Otherwise, the value <FONT FACE="Courier New" SIZE=2>NIL</FONT> is returned.<P>
<pre><a name="0859_1628">B-TREE-SEARCH(<I>x, k</I>)</sub></sup></pre><P>
<pre>1  <I>i</I> <img src="../images/arrlt12.gif"> 1</sub></sup></pre><P>
<pre>2  <B>while</B> <I>i</I> <img src="../images/lteq12.gif"> <I>n</I>[<I>x</I>] and <I>k</I> <img src="../images/gteq.gif"> <I>key<SUB>i</I></SUB>[<I>x</I>]</sub></sup></pre><P>
<pre>3       <B>do</B> <I>i</I> <img src="../images/arrlt12.gif"> <I>i</I> + 1</sub></sup></pre><P>
<pre>4  <B>if</B> <I>i</I> <img src="../images/lteq12.gif"> <I>n</I>[<I>x</I>] and <I>k = key<SUB>i</I></SUB>[<I>x</I>]</sub></sup></pre><P>
<pre>5      <B>then return</B> (<I>x, i</I>)</sub></sup></pre><P>
<pre>6  <B>if</B> <I>leaf </I>[<I>x</I>]</sub></sup></pre><P>
<pre>7      <B>then return</B> NIL</sub></sup></pre><P>
<pre>8      <B>else</B> DISK-READ(<I>c<SUB>i</I></SUB>[<I>x</I>])</sub></sup></pre><P>
<pre>9           <B>return</B> B-TREE-SEARCH(<I>c<SUB>i</I></SUB>[<I>x</I>]<I>, k</I>)</sub></sup></pre><P>
Using a linear-search procedure, lines 1-3 find the smallest <I>i</I> such that <I>k</I> <img src="../images/lteq12.gif"> <I>key<SUB>i</I></SUB>[<I>x</I>], or else they set <I>i</I> to <I>n</I>[<I>x</I>] + 1. Lines 4-5 check to see if we have now discovered the key, returning if we have. Lines 6-9 either terminate the search unsuccessfully (if <I>x</I> is a leaf) or recurse to search the appropriate subtree of <I>x</I>, after performing the necessary <FONT FACE="Courier New" SIZE=2>DISK</FONT>-<FONT FACE="Courier New" SIZE=2>READ</FONT> on that child.<P>
Figure 19.1 illustrates the operation of B-<FONT FACE="Courier New" SIZE=2>TREE</FONT>-<FONT FACE="Courier New" SIZE=2>SEARCH</FONT>; the lightly shaded nodes are examined during a search for the key <I>R</I>.<P>
<a name="0859_1629">As in the <FONT FACE="Courier New" SIZE=2>TREE</FONT>-<FONT FACE="Courier New" SIZE=2>SEARCH</FONT> procedure for binary search trees, the nodes encountered during the recursion form a path downward from the root of the tree. The number of disk pages accessed by B-<FONT FACE="Courier New" SIZE=2>TREE</FONT>-<FONT FACE="Courier New" SIZE=2>SEARCH</FONT> is therefore <img src="../images/bound.gif">(<I>h</I>) = <img src="../images/bound.gif">(log<I><SUB>t</I></SUB> <I>n</I>), where <I>h</I> is the height of the B-tree and <I>n</I> is the number of keys in the B-tree. Since <I>n</I>[<I>x</I>] &lt; 2<I>t</I>, the time taken by the <B>while</B> loop of lines 2-3 within each node is <I>O</I>(<I>t</I>), and the total CPU time<I> </I>is<I> O</I>(<I>th</I>) = <I>O</I>(<I>t</I> log<I><SUB>t</SUB> n</I>).<P>
<P>







<h2>Creating an empty B-tree</h2><P>
<a name="085a_162a"><a name="085a_162b"><a name="085a_162c">To build a B-tree <I>T</I>, we first use B-<FONT FACE="Courier New" SIZE=2>TREE</FONT>-<FONT FACE="Courier New" SIZE=2>CREATE</FONT> to create an empty root node and then call B-<FONT FACE="Courier New" SIZE=2>TREE</FONT>-<FONT FACE="Courier New" SIZE=2>INSERT</FONT> to add new keys. Both of these procedures use an auxiliary procedure <FONT FACE="Courier New" SIZE=2>ALLOCATE</FONT>-<FONT FACE="Courier New" SIZE=2>NODE</FONT>, which allocates one disk page to be used as a new node in <I>O</I>(1) time. We can assume that a node created by <FONT FACE="Courier New" SIZE=2>ALLOCATE</FONT>-<FONT FACE="Courier New" SIZE=2>NODE</FONT> requires no <FONT FACE="Courier New" SIZE=2>DISK</FONT>-<FONT FACE="Courier New" SIZE=2>READ</FONT>, since there is as yet no useful information stored on the disk for that node.<P>
<pre>B-TREE-CREATE(<I>T</I>)</sub></sup></pre><P>
<pre>1  <I>x</I> <img src="../images/arrlt12.gif"> ALLOCATE-NODE()</sub></sup></pre><P>
<pre>2  <I>leaf</I>[<I>x</I>] <img src="../images/arrlt12.gif"> TRUE</sub></sup></pre><P>
<pre>3  <I>n</I>[<I>x</I>] <img src="../images/arrlt12.gif"> 0</sub></sup></pre><P>
<pre>4  DISK-WRITE(<I>x</I>)</sub></sup></pre><P>
<pre>5  <I>root</I>[<I>T</I>] <img src="../images/arrlt12.gif"> <I>x</I></sub></sup></pre><P>
B-<FONT FACE="Courier New" SIZE=2>TREE</FONT>-<FONT FACE="Courier New" SIZE=2>CREATE</FONT> requires <I>O</I>(1) disk operations and <I>O</I>(1) CPU time.<P>
<img src="389_a.gif"><P>
<h4><a name="085a_162d">Figure 19.5 Splitting a node with t = 4. Node y is split into two nodes, y and z, and the median key S of y is moved up into y's parent.<a name="085a_162d"></sub></sup></h4><P>
<P>







<h2>Splitting a node in a B-tree</h2><P>
<a name="085b_162d"><a name="085b_162e"><a name="085b_162f"><a name="085b_1630">Inserting a key into a B-tree is significantly more complicated than inserting a key into a binary search tree. A fundamental operation used during insertion is the <I><B>splitting</I></B> of a full node <I>y</I> (having 2<I>t</I> - 1 keys) around its <I><B>median key</I></B> <I>key<SUP>t</I></SUP>[<I>y</I>] into two nodes having <I>t</I> - 1 keys each. The median key moves up into<I> y</I>'s parent--which must be nonfull prior to the splitting of <I>y</I>--to identify the dividing point between the two new trees; if <I>y</I> has no parent, then the tree grows in height by one. Splitting, then, is the means by which the tree grows.<P>
The procedure B-<FONT FACE="Courier New" SIZE=2>TREE</FONT>-<FONT FACE="Courier New" SIZE=2>SPLIT</FONT>-<FONT FACE="Courier New" SIZE=2>CHILD</FONT> takes as input a <I>nonfull</I> internal node <I>x</I> (assumed to be in main memory), an index<I> i</I>, and a node <I>y</I> such that <I>y</I> = <I>c<SUB>i</I></SUB>[<I>x</I>] is a <I>full</I> child of <I>x</I>. The procedure then splits this child in two and adjusts <I>x</I> so that it now has an additional child.<P>
Figure 19.5 illustrates this process. The full node <I>y</I> is split about its median key <I>S</I>, which is moved up into <I>y'</I>s parent node <I>x</I>. Those keys in<I> y</I> that are greater than the median key are placed in a new node <I>z</I>, which is made a new child of <I>x</I>.<P>
<pre><a name="085b_1631">B-TREE-SPLIT-CHILD(<I>x,i,y</I>)</sub></sup></pre><P>
<pre>1  z <img src="../images/arrlt12.gif"> ALLOCATE-NODE()</sub></sup></pre><P>
<pre>2  l<I>eaf</I>[<I>z</I>] <img src="../images/arrlt12.gif"> leaf<I>[</I>y<I>]</I></sub></sup></pre><P>
<pre>3  <I>n</I>[<I>z</I>] <img src="../images/arrlt12.gif"> t - <I>1</I></sub></sup></pre><P>
<pre>4  <B>for</B> <I>j </I><img src="../images/arrlt12.gif"> 1 <B>to</B> <I>t</I> <I>- </I>1</sub></sup></pre><P>
<pre>5        <B>do</B><I> key<SUP>j</I></SUP>[<I>z</I>]<I> </I><img src="../images/arrlt12.gif"> <I>key<SUP>j</I></SUP><SUB>+</SUB><I><SUP>t</I></SUP>[<I>y</I>]</sub></sup></pre><P>
<pre>6  <B>if</B> not<I> leaf </I>[<I>y</I>]</sub></sup></pre><P>
<pre>7      <B>then for</B> <I>j </I><img src="../images/arrlt12.gif"> 1 <B>to</B><I> t</I></sub></sup></pre><P>
<pre>8<B>                do</B> c<SUB>j</SUB>[z] <img src="../images/arrlt12.gif"> c<SUB>j+t</SUB>[y]</sub></sup></pre><P>
<pre>9  <I>n</I>[<I>y</I>] <img src="../images/arrlt12.gif"> t - <I>1</I></sub></sup></pre><P>
<pre>10  <B>for</B> <I>j </I><img src="../images/arrlt12.gif"> n<I>[</I>x<I>] + 1 <B>downto</B> </I>i<I> + 1</I></sub></sup></pre><P>
<pre>11       <B>do</B> c<SUB>j+1</SUB>[x] <img src="../images/arrlt12.gif"> c<SUB>j</SUB>[<I>x</I>]</sub></sup></pre><P>
<pre>12  <I>c<SUB>i</I>+1</SUB>[<I>x</I>] <img src="../images/arrlt12.gif"> z</sub></sup></pre><P>
⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -