📄 file_organization

📁 国外的数据结构与算法分析用书
💻
字号:
                    File Organization

The main class of this cluster is BTREE_FILE_UOS [KEY_TYPE,
ITEM_TYPE]. This class is a P_KEYED_BASIC_DICTIONARY_UOS and so
has the standard features of the basic dictionary class; namely,
is_empty, insert, delete, has and obtain. In addition, its main
features include the following:
 - make: a creation procedure with arguments to supply the key
    attributes for the file.  The arguments for this call are
        - d : the order of the B-tree associated with the file
        - data_file_name: the name/path of the data file
        - node_file_name: the name/path of the node file
        - block_size: the maximum number of bytes that will ever
          be needed for the storage of a block in the data file.
          A value of about 250 is needed when both the key and
          item have type INTEGER.  For types that require more
          bytes, larger values will be needed.
        - node_size: the maximum number of bytes that will ever
          be needed for the storage of an interior node in the
          data file.  A value of about 1250 is needed when the
          tree order is 1, and both the key and item have type
          INTEGER.  For a larger order and/or types that require
          more bytes, a larger value will be needed.

 - open: a creation procedure with one argument to supply the name
    of an existing data file.  The other key attributes for the
    file are stored as the first record of the data file.  This
    record is read in order to obtain the attributes for the
    class.

 - close_file: a procedure to close the file.  It actually closes
    both the data file and the node file.

 - sequential_out: print out the values in the file in sequential 
    order, i.e., in increasing order by key.

 - serial_out: print out the values in the file in the order that
    they are stored on disk.

 - order: the order of the B-tree associated with the file.


 Associated with this class are two actual files, called the data
 file and the node file.

 Now consider the contents of the data file. As has already been
 stated, the first record of the data file is a header record
 which contains the key attributes of the file. This header record
 is an instance of the class BTREE_FILE_HEADER_UOS.  After this
 header record, the data file consists of a sequence of blocks,
 i.e., instances of BLOCK_UOS.  Each block is an array of
 instances of the class RECORD_UOS.  Each record consists of a
 (key, item) pair, where the key has type KEY_TYPE, and the item
 has type ITEM_TYPE.  The size of the array is blocking_factor,
 where for this particular implementation, blocking_factor is the
 constant 3.

Associated with a B-tree file, there must be a B-tree.  The nodes
for this tree are stored in the node file.  Thus, the contents of
this file is a sequence of objects of type BTREE_INTERIOR_UOS.
Each interior node has two arrays, a key array and an addresses
array. The key array has size 2*order, and stores the keys of the
interior node.  The address array has size 2*order+1.  If the
children of the node are not leaves, i.e., they are interior
nodes, then each address of the address array is the address of
the associated child node in the node file.  When the child node
is needed for further processing, the data for the child node is
read from the node file to form the interior node.  When the
children of the node are leaves, then each address is the address
of a block in the data file.  Whenever the leaf node is needed,
the information is read and the leaf node object is built.


 Possible exercises:

 1. Modify the classes so that the blocking factor can be varied.
 In particular, the blocking factor should be another argument
 when a file is create via the make instruction.

 2. The BTREE_FILE_UOS class is presently a basic dictionary.
 Extend the class so that it is a full dictionary, i.e., it inherits
 from P_KEYED_DICTIONARY_UOS.  The key part of this assignment is
 the addition of the linear iterator capabilities to traverse the
 items in the data file in key order.

 3. An extension of B-trees is B+trees.  In a B+tree, each block
 has a reference to the next block in key order.  Thus, a
 sequential scan of the blocks in the data file can be done
 without using the B+tree or the node file.  Develop a descendant
 system that does this.  Use the B+ links in order to implement
 sequential_out.  Optional, add an inheritance of the linear
 iterator and use the B+ links to iterate through the items in
 key order without using the node file.  Is it possible to update
 the item associated with a key, insert a new key-item pair, or delete
 an item during such a traversal?
⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -