📄 file_organization
字号:
File Organization
The main class of this cluster is BTREE_FILE_UOS [KEY_TYPE,
ITEM_TYPE]. This class is a P_KEYED_BASIC_DICTIONARY_UOS and so
has the standard features of the basic dictionary class; namely,
is_empty, insert, delete, has and obtain. In addition, its main
features include the following:
- make: a creation procedure with arguments to supply the key
attributes for the file. The arguments for this call are
- d : the order of the B-tree associated with the file
- data_file_name: the name/path of the data file
- node_file_name: the name/path of the node file
- block_size: the maximum number of bytes that will ever
be needed for the storage of a block in the data file.
A value of about 250 is needed when both the key and
item have type INTEGER. For types that require more
bytes, larger values will be needed.
- node_size: the maximum number of bytes that will ever
be needed for the storage of an interior node in the
data file. A value of about 1250 is needed when the
tree order is 1, and both the key and item have type
INTEGER. For a larger order and/or types that require
more bytes, a larger value will be needed.
- open: a creation procedure with one argument to supply the name
of an existing data file. The other key attributes for the
file are stored as the first record of the data file. This
record is read in order to obtain the attributes for the
class.
- close_file: a procedure to close the file. It actually closes
both the data file and the node file.
- sequential_out: print out the values in the file in sequential
order, i.e., in increasing order by key.
- serial_out: print out the values in the file in the order that
they are stored on disk.
- order: the order of the B-tree associated with the file.
Associated with this class are two actual files, called the data
file and the node file.
Now consider the contents of the data file. As has already been
stated, the first record of the data file is a header record
which contains the key attributes of the file. This header record
is an instance of the class BTREE_FILE_HEADER_UOS. After this
header record, the data file consists of a sequence of blocks,
i.e., instances of BLOCK_UOS. Each block is an array of
instances of the class RECORD_UOS. Each record consists of a
(key, item) pair, where the key has type KEY_TYPE, and the item
has type ITEM_TYPE. The size of the array is blocking_factor,
where for this particular implementation, blocking_factor is the
constant 3.
Associated with a B-tree file, there must be a B-tree. The nodes
for this tree are stored in the node file. Thus, the contents of
this file is a sequence of objects of type BTREE_INTERIOR_UOS.
Each interior node has two arrays, a key array and an addresses
array. The key array has size 2*order, and stores the keys of the
interior node. The address array has size 2*order+1. If the
children of the node are not leaves, i.e., they are interior
nodes, then each address of the address array is the address of
the associated child node in the node file. When the child node
is needed for further processing, the data for the child node is
read from the node file to form the interior node. When the
children of the node are leaves, then each address is the address
of a block in the data file. Whenever the leaf node is needed,
the information is read and the leaf node object is built.
Possible exercises:
1. Modify the classes so that the blocking factor can be varied.
In particular, the blocking factor should be another argument
when a file is create via the make instruction.
2. The BTREE_FILE_UOS class is presently a basic dictionary.
Extend the class so that it is a full dictionary, i.e., it inherits
from P_KEYED_DICTIONARY_UOS. The key part of this assignment is
the addition of the linear iterator capabilities to traverse the
items in the data file in key order.
3. An extension of B-trees is B+trees. In a B+tree, each block
has a reference to the next block in key order. Thus, a
sequential scan of the blocks in the data file can be done
without using the B+tree or the node file. Develop a descendant
system that does this. Use the B+ links in order to implement
sequential_out. Optional, add an inheritance of the linear
iterator and use the B+ links to iterate through the items in
key order without using the node file. Is it possible to update
the item associated with a key, insert a new key-item pair, or delete
an item during such a traversal?
⌨️ 快捷键说明
复制代码
Ctrl + C
搜索代码
Ctrl + F
全屏模式
F11
切换主题
Ctrl + Shift + D
显示快捷键
?
增大字号
Ctrl + =
减小字号
Ctrl + -