📄 libtour.html

📁 FastDb是高效的内存数据库系统
💻 HTML
📖 第 1 页 / 共 2 页
字号:
12 下一页
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 3.2//EN"><HTML><HEAD>   <TITLE></TITLE>   <META NAME="Author" CONTENT="Joe Hellerstein">   <META NAME="GENERATOR" CONTENT="Mozilla/3.0b5aGold (WinNT; I) [Netscape]"></HEAD><BODY TEXT="#000000" BGCOLOR="#FFFFFF" LINK="#FF0000" VLINK="#800080" ALINK="#0000FF"><H1>A Guided Tour of Sample Code<IMG SRC="gist.gif" HEIGHT=194 WIDTH=254 ALIGN=RIGHT></H1><CENTER><P><HR WIDTH="100%"></P></CENTER><P>The libGiST package is a C++ library that provides the basic GiST functionality.By itself it does nothing, since some of the important methods are <I>purevirtual</I>. However, with a little coding on your part, you can make itindex data in a variety of ways, and support just about any query.</P><P>We have provided some examples to get you started. &nbsp;The RTree directorycontains code which uses libGiST as the basis for a simple R-tree-likestorage/retrieval system that indexes 2-dimensional box data, and supportsnatural geographic queries. &nbsp;The RSTree directory provides the samefunctionality as the RTree directory, but uses the more efficient (andcomplex)&nbsp;R*-tree techniques invented by Beckman, et al. [BKSS90].The BTree directory contains code which uses libGiST to implement a simpleB+-tree storage/retrieval system that indexes textual data, to supportalphabetic equality and range queries.</P><P>As an introduction to libGiST, we will step through the header filesfor each of these small projects, which will introduce you to increasinglypowerful features of libGiST. We will go through the simplest project,RTree, in some detail, and then just describe the features of RSTree andBTree that are different from RTree. &nbsp;Probably the easiest way touse libGiST is to cannibalize some of this sample code, so it's worth lookingat carefully.</P><CENTER><P><HR WIDTH="100%"></P></CENTER><H2 ALIGN=CENTER>RTree</H2><P>R-trees were first proposed by Guttman [Gut84] to index geographic data(points, lines, and polygons), in order to efficiently find data that satisfygeographic predicates (i.e., <TT>key {contains,contained-in,overlaps,equals}constant</TT>). The keys in a (2D) R-tree are &quot;bounding boxes&quot;,i.e. rectangles which are aligned with the x and y axes. A key signifiesthat all objects stored below it fall within that geographic area. If youwant more detail on R-trees, Guttman's paper makes for fairly accessiblereading.</P><P>In order to extend libGiST to support R-tree behavior, we need fourof the basic GiST methods mentioned in the original GiST paper:</P><UL><LI>Consistent: compares a query rectangle and a key rectangle to see ifthey correspond in a way appropriate to the query predicate. One trickybit in R-trees is that entries on internal nodes are treated differentlythan entries at the leaves. (<TT>RTpredicate::Consistent(const RTpredicate&amp;)i</TT>n RTpredicate.h).</LI><LI>Union: given a set of geographic objects, finds the minimal boundingbox containing them all. &nbsp;(<TT>GiSTentry *RTnode::Union() </TT>inRTnode.h)</LI><LI>Penalty: returns a <TT>GiSTPenalty</TT> (just a <TT>double</TT>, really)representing the &quot;badness&quot; of inserting a new entry below thisentry -- for R-trees, this is measured by the increase in area of thisentry's bounding box required to accomodate the new entry. (<TT>GiSTpenalty*RTentry::Penalty(const GiSTentry)</TT> in RTentry.h)</LI><LI>PickSplit: Guttman proposes 3 different algorithms in his paper; weuse his polynomial-time algorithm in our implementation. ( <TT>GiSTnode*RTnode::PickSplit()</TT> in RTnode.h)</LI></UL><P>We will also need various utility methods like constructors and destructorsfor our RTree classes, which inherit most (but not all!) of their behaviorfrom libGiST. We won't talk about the constructors and destructors, whichare pretty straightforward, but the rest of the simple utility methodsare as follows (organized by file):</P><UL><LI>RT.h: contains the <TT>RT</TT> class for R-trees, which inherits fromthe <TT>GiST</TT> class of libGiST</LI><UL><LI><TT>GiSTobjid RT::IsA():&nbsp;</TT>Not strictly necessary, but usefulfor debugging. &nbsp;All classes which inherit from the GiSTobject classcan define an <TT>IsA</TT> method to identify themselves. The choice ofobject tags is in <TT>libGiST/GiSTdefs.h</TT> if you wish to extend it.</LI><LI><TT>GiSTnode *RT::CreateNode()</TT>:&nbsp;This constructor method is<I>critical</I>, as it ensures that our trees are actually instances ofthe RT class, and hence will use the appropriate RT methods. <B>N.B.: </B>Yourprograms will compile but <I>probably not work</I> if you don't defineconstructors and other basic allocation utilties for all your classes!!This is because libGiST makes heavy use of the virtual function mechanismof C++, which requires that objects be allocated within their appropriatesubclasses.</LI><LI><TT>GiSTstore *RT::CreateStore(): </TT>this method identifies and passescontrol to the storage subsystem of your choice. &nbsp;libGiST is distributedwith one storage subsystem, GiSTfile, which uses the file system to storeindices. &nbsp;Smarter storage systems (e.g. DBMSs of various sorts, orbuffer management packages) may be plugged in via inheritance to improvethe speed and/or functionality of your code.</LI></UL><LI>RTnode.h: defines the <TT>RTnode</TT> class for nodes in an R-tree,which inherits from <TT>GiSTnode</TT></LI><UL><LI><TT>GiSTobjid RTnode::IsA()</TT>: as above</LI><LI><TT>GiSTobject *RTnode::Copy()</TT>: as above, <I>crucial</I> to thecorrectness of your code.</LI><LI><TT>GiSTnode *RTnode::PickSplit()</TT> : One of the basic GiST methods.Note that the splitting is effected by actually deleting entries from <TT>*this</TT>,and putting them on the page which is returned by the method.</LI><LI><TT>GiSTentry *RTnode::Union()</TT>: another of the basic GiST methodsdescribed above</LI><LI><TT>GiSTentry *RTnode::CreateEntry():</TT> used to generate an entryof the appropriate subclass of <TT>GiSTentry</TT>, which in this case is<TT>RTentry</TT>. <I>&nbsp;Like the </I><TT>Copy</TT><I> methods, thisis crucial!</I></LI><LI>i<TT>nt RTnode::FixedLength()</TT>: tells libGiST two things -- thatthe keys in our nodes have a fixed length, and what that length is. &nbsp;Ifour entries were of variable length (as will be the case in BTree below),this method would return 0.</LI></UL><LI>RTentry.h: defines the <TT>RTentry </TT>clas for entries in an R-tree,which inherits from <TT>GiSTnode</TT>. Also defines the <TT>RTkey</TT>class, which represents bounding boxes. <B>N.B.:</B> The <TT>GiSTentry</TT>class contains a member <TT>GiSTobject *key</TT>, which is used to holdkeys. So our <TT>RTkey</TT> class must inherit from <TT>GiSTobject.</TT></LI><UL><LI><TT>RTkey::Copy()</TT>: again, <I>critical.</I></LI><LI>A variety of bounding-box methods, which are relatively self-explanatory.</LI><LI><TT>RTkey &amp;RTentry::Key()</TT>: casts the key member, which isof class <TT>GiSTobject </TT>to class <TT>RTkey.</TT> &nbsp;Used to preventcompiler warnings.</LI><LI><TT>GiSTobject *RTentry::Copy()</TT>: as above, you can't do withoutit.</LI><LI><TT>void RTentry::InitKey()</TT>: this method allocates a new key forthe entry -- like the <TT>Copy</TT> methods, it's crucial that it returnan instance of the appropriate key subclass (in this case, <TT>RTkey</TT>).</LI><LI><TT>GiSTpenalty * RTentry::Penalty(const GiSTentry &amp;newEntry)</TT>:as discussed above</LI><LI><TT>int RTentry::CompressedLength()</TT>: since libGiST supports compressedkeys (discussed in BTree below), this method is required to calculate thelength of this <I>key</I> (not the entry) when compressed. In this caseit's just <TT>sizeof(RTkey)</TT>.</LI><LI>A variety of other methods for accessing/setting properties of thekey.</LI></UL><LI>RTpredicate.h: Home of the <TT>RTpredicate</TT> class, which containsthe <TT>Consistent</TT> method. In addition, <TT>RTpredicate</TT> has twoimportant members, <TT>oper</TT> and <TT>value</TT>. &nbsp;The former distinguishesbetween the different box-comparison operators&nbsp;(<TT>Contains, Contained,Overlap, Equal)</TT> and the latter holds the bounding box to which we'recomparing.</LI></UL><P>A quick look at the .cpp files will show that there's little more inthem than what's described above -- the only method of any complexity isthe PickSplit algorithm, which is taken directly from Guttman's paper.</P><CENTER><P><HR WIDTH="100%"></P></CENTER>
12 下一页
⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -