📄 readme.htm
字号:
specify this flag if <code>use_transaction_log</code> flag is used, because
consistency will be provided by transaction mechanism in this case.
If <code>use_transaction_log</code> flag is not specified and
flag <code>fault_tolerant</code> flag is set, POST++ will not change
original file preserving its consistency. This is achieved either by
reading file in memory (if flag <code>no_file_mapping</code> is set)
or using copy-on-write pages protection. In last case attempt of
modification of mapped on file page will cause creation copy of the page
in system swap file. Method <code>flush()</code> will save in-memory
image of database to temporary file and then rename it to original file
using atomic operation. If <code>fault_tolerant</code> flag is not specified,
POST++ do in-place modification of database pages, providing maximal
application performance (because there is no overhead caused by
copying modified pages and saving database image in temporary file)
As far as modified pages are
not flushed to the disk immediately, some changes can be lost as a result
of system fault (the worst thing is that some modified pages can be saved
to the disk while other not - so database consistency can be violated).
<DT><B>do_garbage_collection</B><DD>
When this attribute is set POST++ will perform garbage collection in
storage during opening. The operation of collecting garbage
is combined with reference adjustment. Using garbage collection is always
more safer than manual memory deallocation (due to the problem of
hanging references), but explicit memory deallocation has less overhead.
Garbage collection in POST++ has one more advantage in comparison with
explicit deallocation: garbage collector performs utilization of
pages used for small objects. If there are no more allocated small
objects at the page then garbage collector will include this page
in the list of free pages. This is not done for explicit deallocation because
free cells for small objects are linked in chain and it is not so easier
to remove them from this chain (in case of garbage collector all chains
are reconstructed). Even if you are using explicit memory deallocation,
I suggest you to do time by time garbage collection to check for
reference consistency and absence of memory leaks
(<CODE>garbage_collection</CODE> method returns number of deallocated
objects and if you are sure that you have explicitly deallocate all
unreachable objects, then this number should be zero).
As far as garbage collector modifies all objects in the storage (set mark bit),
relink free objects in chains), running GC in transaction mode can be
time and disk space consuming operation (all pages from the file will be copied
to the transaction log file).
</DL><P>
You can specify maximal size for storage files by
<CODE>file::max_file_size</CODE> variable. If size of data file is less
than <CODE>file::max_file_size</CODE> and mode is not <CODE>read_only</CODE>,
then extra <CODE>size_of_file - file::max_file_size</CODE> bytes of
virtual space will be reserved after the file mapping region.
When storage size is extended (because of new objects allocation),
this pages will be committed (in Windows NT) and used. If size of file is
greater than <CODE>file::max_file_size</CODE> or <CODE>read_only</CODE> mode
is used, then size of mapped region is exactly the same as the file size.
Storage extension is not possible in the last case. In Windows I use
<CODE>GlobalMemoryStatus()</CODE> function to obtain information about
actually available virtual memory in the system and reduce
<CODE>file::max_file_size</CODE> to this value. Unfortunately I found no
portable call in Unix which can be used for the same purpose
(<CODE>getrlimit</CODE> doesn't return actual information about available
virtual memory for users process).<P>
Interface to object storage is specified in file
<A HREF = "storage.h">storage.h</A> and implementation can be found in
<A HREF = "storage.cxx">storage.cxx</A>. Operating system dependent part
of mapped on memory file is encapsulated within <CODE>file</CODE> class,
which definition is in <A HREF = "file.h">file.h</A> and implementation
in <A HREF = "file.cxx">file.cxx</A>.<P>
<H2><A NAME = "installation">Installation of POST++</A></H2>
Installation of POST++ is very simple. It is now checked for the following
platforms: Digital Unix, Linux, Solaris, Windows NT 4.0, Windows 95.
I expect no problems with most of all other new Unix dialect (AIX,
HP-UX 10, SCO...). Unfortunately I have no access to this systems.
At Windows I compiled POST++ by Microsoft Visual C++ 5.0 and Borland 5.02
compilers. Makefile for Visual C++ is <A HREF = "makefile.mvc">makefile</A>,
and makefile for Broland C++ is <A HREF = "makefile.mvc">makefile</A>.<P>
The only thing that you are needed to use POST++ is library
(<CODE>libstorage.a</CODE> at Unix and <CODE>storage.lib</CODE> at Windows).
This library can be produced by just issuing <CODE>make</CODE>
command. There is special <CODE>MAKE.BAT</CODE> for Microsoft Visual C++
which invokes <CODE>NMAKE</CODE> with <CODE>makefile.mvc</CODE> as input
(if your are using Borland either edit this file or invoke Borland
make by <CODE>make.exe -f makefile.bcc</CODE> command).<P>
Installation in Unix can be completed by copying POST++ library and header
files to some standard system catalogs. You should set proper values for
<code>INSTALL_LIB_PATH</code> and <code>INSTALL_INC_PATH</code>
variables in makefile and execute <code>make install</code> command.
Default value for <code>INSTALL_LIB_PATH</code> is <code>/usr/local/lib</code>
and for <code>INSTALL_INC_PATH</code> is <code>/usr/local/include</code>.
You can avoid copying of POST++ files in system catalogs by specifying
path to POST++ catalog to compiler and linker explicitly.<P>
<H2><A NAME = "classes">POST++ class library</A></H2>
POST++ contains definition of some persistent classes, which
can be used in your application and also are good examples of
developing classes for POST++. You can see that there are almost no POST
specific code in the implementation of these classes.
These classes include array, matrix, string, L2-list, hash table, AVL-tree,
R-tree, text object. R-tree provides fast access to spatial object
(object with spatial coordinates). Text object contains modification of
Boyer and Moore algorithm extended to search multiple patterns combined by
OR/AND relation. Definition of these classes can be found in following files:
<P>
<TABLE BORDER>
<TR BGCOLOR="#A0A0A0"><TH>Description</TH> <TH>Interface</TH> <TH>Implementation</TH></TR>
<TR><TD>Arrays of scalars and references, matrixes and strings</TD>
<TD><A HREF = "array.h">array.h</A></TD>
<TD><A HREF = "array.cxx">array.cxx</A></TD></TR>
<TR><TD>L2-list and AVL-tree</TD>
<TD><A HREF = "avltree.h">avltree.h</A></TD>
<TD><A HREF = "avltree.cxx">avltree.cxx</A></TD></TR>
<TR><TD>Hash table with collision chains</TD>
<TD><A HREF = "hashtab.h">hashtab.h</A></TD>
<TD><A HREF = "hashtab.cxx">hashab.cxx</A></TD></TR>
<TR><TD>R-tree with quadratic method of nodes splitting</TD>
<TD><A HREF = "rtree.h">rtree.h</A></TD>
<TD><A HREF = "rtree.cxx">rtree.cxx</A></TD></TR>
<TR><TD>T-tree (combination of AVL tree and array)</TD>
<TD><A HREF = "ttree.h">ttree.h</A></TD>
<TD><A HREF = "ttree.cxx">ttree.cxx</A></TD></TR>
<TR><TD>Text object with modified Boyer and Moore search algorithm</TD>
<TD><A HREF = "textobj.h">textobj.h</A></TD>
<TD><A HREF = "textobj.cxx">textobj.cxx</A></TD></TR>
</TABLE><P>
In the article "A study of index structures for main memory database management
systems" T.J. Lehman and M.J Carey proposed T-trees as a storage efficient data
structure for main memory databases. T-trees are based on AVL trees proposed
by Adelson-Velsky and Landis. In this subsection, we provide an overview of
T-trees as implemented in POST++.<P>
Like AVL trees, the
height of left and right subtrees of a T-tree may differ by at most one.
Unlike AVL trees, each node in a T-tree stores multiple key values in a sorted
order, rather than
a single key value. The left-most and the right-most key value in a node
define the range of key values contained in the node. Thus, the left subtree
of a node contains only key values less than the left-most key value, while the
right subtree contains key values greater than the right-most key value in the
node. A key value which is falls between the smallest and largest key values in
a node is said to be <I>bounded</I> by that node.
Note that keys equal to the smallest
or largest key in the node may or may not be considered to be bounded based on
whether the index is unique and based on the search condition (e.g.
"greater-than" versus "greater-than or equal-to").<P>
A node with both a left and
a right child is referred to as an <I>internal node</I>,
a node with only one child is referred to as a <I>semi-leaf</I>,
and a node with no children is referred to as a <I>leaf</I>.
In order to keep occupancy high, every internal node has a minimum number
of key values that it must contain (typically <I>k-2</I>, if <I>k</I>
is the maximum number
of keys that can be stored in a node). However, there is no occupancy condition
on the leaves or semi-leaves.<P>
Searching for a key value in a T-tree is
relatively straightforward. For every node, a check is made to see if the key
value is bounded by the left-most and the right-most key value in the node; if
this is the case, then the key value is returned if it is contained in the node
(else, the key value is not contained in the tree). Otherwise, if the key value
is less than the left-most key value, then the left child node is searched;
else the right child node is searched. The process is repeated until either the
key is found or the node to be searched is null.<P>
Insertions and deletions into
the T-tree are a bit more complicated. For insertions, first a variant of the
search described above is used to find the node that bounds the key value to
be inserted. If such a node exists, then if there is room in the node, the key
value is inserted into the node. If there is no room in the node, then the key
value is inserted into the node and the left-most key value in the node is
inserted into the left subtree of the node (if the left subtree is empty, then
a new node is allocated and the left-most key value is inserted into it). If no
bounding node is found then let <I>N</I> be the last node encountered by the
failed search and proceed as follows: If <I>N</I>
has room, the key value is inserted into <I>N</I>;
else, it is inserted into a new node that is either the right or left child of
<I>N</I>, depending on the key value and the left-most and right-most key
values in
<I>N</I>.<P>
Deletion of a key value begins by determining the node containing the key
value, and the key value is deleted from the node. If deleting the key value
results in an empty leaf node, then the node is deleted. If the deletion
results in an internal node or semi-leaf containing fewer than the minimum
number of key values, then the deficit is made up by moving the largest key in
the left subtree into the node, or by merging the node with its right child.
<P>
In both insert and delete, allocation/deallocation of a node may cause the
tree
to become unbalanced and rotations (RR, RL, LL, LR) may need
to be performed. (The heights of subtrees in the following description include
the effects of the insert or delete.) In the case of an insert, nodes along
the path from the newly allocated node to the root are examined until either
<OL>
<LI>a node for which the two subtrees have equal heights is found
(in this case no rotation needs to be performed), or
<LI>a node for which the difference in
heights between the left and the right subtrees is more than one is found and a
single rotation involving the node is performed.
</OL><P>
In the case of delete, nodes
along the path from the de-allocated node's parent to the root are examined
until a node is found whose subtrees' heights now differ by one. Furthermore,
every time a node whose subtrees' heights differ by more than one is
encountered, a rotation is performed. Note that de-allocation of a node may
result in multiple rotations.<P>
There are several test programs for testing classes from POST++ persistent
class library, which are included in default make target:<P>
<TABLE BORDER WIDTH=80%>
<TR BGCOLOR="#A0A0A0"><TH>Program</TH> <TH>Tested classes</TH></TR>
<TR><TD><A HREF = "testtree.cxx">testtree.cxx</A></TD>
<TD>AVL-tree, l2-node, hash table</TD></TR>
<TR><TD><A HREF = "testtext.cxx">testtext.cxx</A></TD>
<TD>text, string</TD></TR>
⌨️ 快捷键说明
复制代码
Ctrl + C
搜索代码
Ctrl + F
全屏模式
F11
切换主题
Ctrl + Shift + D
显示快捷键
?
增大字号
Ctrl + =
减小字号
Ctrl + -