📄 ezdsl.doc
字号:
======================================================================
EZDSL
version 3.00
Easy classical data structures for Delphi 1, 2 and 3
Copyright (c) 1993,1996 Julian M. Bucknall
======================================================================
Introduction
----------------------------------------------------------------------
The EZDSL units provide an OOP interface for classical data structures
for Delphi: stacks, queues, lists, binary trees and so forth. For
programmers migrating from BP7 a TCollection replacement is also
provided.
My objective in writing these units was to provide myself with a set
of well-encapsulated classes that would manage the tedium of using
these types of data structures, leaving me to concentrate on coding
my latest program. Having used Borland's TCollection from Turbo
Vision and OWL a great deal, I also wanted to mimic as far as I could
the functionality and naming conventions contained within that
object, without necessarily restricting myself by doing so.
This library was rewritten from my EZSTRUCS library that I first
released in August 1994. It was not a simple port; I was determined to
use some of the whizz new syntax of Delphi: exceptions, virtual
constructors, properties and so on. The first version of EZDSL
appeared in 1995, this later version was ported to 32 bits for the
Delphi 2.0 compiler; of course making sure that it could still be
compiled with Delphi 1.0 in 16 bits.
Within these notes I shall not spend too much time describing how the
data structures work and what they're for; any data structures book
will be able to provide that kind of information easily. The three I
have made a great deal of use of (and that I recommend heartily) are
Algorithms, by Robert Sedgewick; Addison-Wesley
Data Structures, Algorithms and Performance, by Derek Wood;
Addison-Wesley
Practical Data Structures in C++, by Bryan Flamig; Wiley
Also, the three volumes of The Art of Computer Programming by Knuth
belong on every serious programmer's shelf, although they are
starting to look their age as far as the programming examples go.
Also these units require that you have a fairly good grounding in
using pointers, objects and classes in Delphi, including typecasting.
If you need to, please review your Delphi documentation on how to use
them.
Container classes and Data objects
----------------------------------------------------------------------
The EZDSLxxx units define a set of data object containers. These
are kinds of containers into which you throw a bunch of data 'things'
and retrieve them at a later stage. The data 'things' can be
anything: integers, strings, records, object instances, or whatever;
we'll use the generic term data object. The different kinds of
containers each have different characteristics, some will keep the
data objects sorted in some order, some will give you fast serial
access times, some will just give you them back in the same order you
put them in. Some will allow you to navigate through the structure
(and will provide a classic iterator method), some do not.
The containers can be separated into two types: those that own their
data objects, and those that do not. Being a data owner means that a
container will destroy its data objects when it is emptied or itself
destroyed. This distinction means that you could create two skip
lists for example, one a data owner and the other not, each with a
different sort sequence. You could then insert the same set of data
objects into both and know that when you call the Empty method for
both lists, the data objects will get destroyed only once.
As you look at the code for the containers and at this documentation,
you'll see that the hierarchy tree is fairly flat (or narrow if you
look at it vertically):
TAbstractContainer (EZDSLBSE.PAS)
+--TStack (EZDSLSTK.PAS)
+--TQueue (EZDSLQUE.PAS)
| +--TDeque (EZDSLQUE.PAS)
+--TPriorityQueue (EZDSLPQU.PAS)
+--TLinkList (EZDSLLST.PAS)
+--TDList (EZDSLDBL.PAS)
+--TSkipList (EZDSLSKL.PAS)
+--TBinTree (EZDSLBTR.PAS)
| +--TBinSearchTree (EZDSLBTR.PAS)
| +--TrbSearchTree (EZDSLBTR.PAS)
+--TEZCollection (EZDSLCOL.PAS)
+--TEZSortedCollection (EZDSLCOL.PAS)
+--TEZStringCollection (EZDSLCOL.PAS)
+--TEZStrZCollection (EZDSLCOL.PAS)
There is a single ancestor class that provides the important common
methods and fields (allocating/freeing a node, whether a container is
empty) and the base of the virtual methods hierarchy (Compare,
DisposeData, et al). Most of the other classes are descended directly
from this, with only a few further descendants. This is contrary to
lots of other container hierarchies where the author has endeavoured
to arrange things so that everything descends from one another all the
way back to a doubly linked list or something. I felt that this type
of scheme was too restricting: if I found a better implementation of a
list (and I think it can be done better with TEZCollections for
example) I don't have to worry too much about breaking up my hierarchy
(so long as I interface the same methods). The other thing is that
deriving a stack from a linked list for example means that not only do
you get the Push and Pop methods but you also get a whole slew of
linked list methods appearing as well (Prev, Next, Insert, Delete or
whatever).
However I have tried to be consistent in my naming convention so that
for example, there is a method called Next for a TLinkList, TDList and
TSkipList but it works in different ways for each (and has a different
calling interface) - the Next method is not virtual.
The EZDSL library has been written assuming that the data objects are
pointers to something. Like TCollections from BP7, this makes it easy
to have the containers have lots of different kinds of objects and to
be able to deal with them properly. However, there is nothing to stop
you using pointers to strings or even plain longints instead; just
typecast to a pointer when required (and typecast back again when the
object is retreived from the container!). For example when using a
stack you could use integers as follows:
var
MyStack : TStack;
begin
MyStack.Init;
with MyStack do
begin
Push(pointer(19));
Push(pointer(57));
writeln(longint(Pop)); {outputs: 57}
writeln(longint(Pop)); {outputs: 19}
end;
MyStack.Done;
end;
Or you could write a container that would wrap up this slight
awkwardness and hide it from you. See the example programs EXINTQUE
and EXSTRSTK.
Data Objects and Polymorphism
----------------------------------------------------------------------
The first version of the EZSTRUCS unit (on which this library is
based) had virtual methods called Compare, DisposeData and DupData.
However, I found that often the ONLY methods I was overriding were...
Compare, DisposeData and DupData! The data container worked fine 'out
of the box', it was such a pain to override yet another class just to
alter one of these methods.
I had two solutions to my perceived problem: make the data objects
derive from an ancestor object class or make the methods function
pointers. The first was out (I find it awkward), but the second was
just what I wanted. In fact Delphi itself is rife with this kind of
delegation model.
So, the data object related methods in the containers are in fact
procedure and function pointers held as properties in the container.
You set them when you initialise a new container, and they will be
used whenever two data objects need to be compared, or a data object
needs to be disposed of or you need to duplicate a data object.
To set them you first create your new container object by calling the
Create constructor. This sets the internal procedure and function
pointers of the object to 'do nothing' routines. Then you set the
actual routines you want to use by modifying the required properties:
Compare, DisposeData and DupData. EZDSL provides the ones used most
often in EZDSLSUP.PAS: comparison between two longints, between two
strings (short strings in Delphi 2/3), disposing and duplicating the
same. Please note that it doesn't make any sense to alter these
routines when the container is not empty and so setting these
properties in that case do nothing. In particular you cannot suddenly
change the order of data objects in a sorted container by changing the
Compare property that the container uses.
The Compare, DisposeData and DupData routines that you write *must*
be global, declared 'far' in Delphi 1.0 and declared using the normal
Pascal fastcall calling convention (viz 'register') in Delphi 2/3. In
other words
(a) they cannot be nested routines;
(b) in Delphi 1.0 they cannot be declared implicitly or explicitly
as near routines; and
(c) in Delphi 2.0/3.0 cannot be declared 'cdecl' or 'stdcall'.
They must also follow the routine prototypes TCompareFunc,
TDisposeDataProc and TDupDataFunc respectively.
To facilitate cloning containers with the most flexibility, you can
clone a container and use a different Compare function than the
container you are cloning from (useful for maintaining trees or lists
with their data objects in two different orders).
Sorted Containers, Compare and Duplicate Data Objects
----------------------------------------------------------------------
A number of the containers in EZDSL maintain their data objects in
some kind of ordered sequence. The sequence that they are maintained
in is defined by the container's Compare property.
Some containers are automatically and always 'sorted': skip lists
and binary search trees are two examples. Some are never sorted:
stacks and queues for example. Some containers can be used sorted or
unsorted: examples are linked lists or double linked lists.
The Compare property is a function that you write (or is one of the
supplied functions). The routine must compare two data objects and
return a negative number (eg -1) if the first data object is 'less'
than the second, zero if they are equal, and a positive number (eg 1)
if the first is greater than the second. As described above you set
a container's Compare property when the container is empty. The
Compare function is used only by sorted containers or by unsorted
containers that implement a Search method. Unsorted containers that
do not have a Search method do not use the Compare property, and so in
these cases it does not have to be set.
Once you accept that certain containers maintain data objects in an
ordered sequence, you have to consider the problem of duplicate data
objects. Two data objects are defined as duplicate if the container's
Compare function returns zero when called with them as parameters.
Well, EZDSL enforces a strict rule: sorted containers *cannot*
contain duplicate data objects. All sorted containers will raise an
exception (with string code edsInsertDup) if you attempt to insert a
data object that compares equal to an already inserted data object.
You might have decided that this is a somewhat harsh restriction.
Well, yes and no. Some containers just cannot be used with duplicate
data objects (the skip list is the main example; if you try to, some
wierd effects will occur). Some containers will internally reorganise
themselves, and will destroy any arrival sequence that duplicate data
objects could have been inserted with (the red-black binary search
tree is the main example here). Rather than document that some
containers will accept duplicate items and some won't and if they did
what kind of wierd effects could happen on deletion and insertion, I
decided to restrict all sorted containers in the same manner: they
will not accept duplicate data objects.
What to do? Suppose you had the kind of use for a skip list that
absolutely had to accept duplicate data objects? Firstly I would ask
you to redesign your data object so that duplicates could not occur
and secondly recode your Compare function so that maybe another field
of the data object could be used in the comparison to remove the
ambiguity.
If all fails, one solution could be to declare a sequence long integer
in your container descendant class. Set this to zero in your
constructor. Make sure that your data objects have a sequence field,
and that your Compare function checks this after your main comparison.
Override the Insert method of the container to increment the
container's sequence field and set the data object's sequence field to
it, and then call the inherited insert method. This algorithm is
shown in the example program EXINSDUP.
As always with rules, there is one exception to the 'no duplicates'
rule: the priority queue. This container will accept duplicate data
objects. However duplicate data objects will *not* be popped off in
arrival sequence or, reverse arrival sequence, or any other
deterministic sequence; as far as the priority queue is concerned
duplicate data objects are exactly that, it cannot distinguish
between them in any way. Again if this matters to you, you'll HAVE to
redefine your data object and Compare function to remove the anomaly.
Nodes and Node Stores
----------------------------------------------------------------------
We all know that these classical data structures are generally
implemented with nodes. For a doubly linked list a node is usually
represented by a structure of the form:
PMyNode = ^TMyNode;
TMyNode = record
Next : PMyNode; {Link to next node in the chain}
Prev : PMyNode; {Link to previous node in the chain}
Data : SomeRecord;
end;
When writing the unit I wanted to get away from the dependance of
knowing about this typical node structure: I wanted to insert, push,
examine, delete or pop data objects without having to be continually
reminded of the underlying algorithm. Also, I didn't want to have to
descend my data objects from a TNode object - I would be forced to
always use TNode descendants and couldn't use PStrings or something
along those lines. Also, by divorcing myself from knowing about
(worrying about) the node structure, I was able to make numerous
economies in data storage and speed.
However, some of these data structures cry out for being able to
navigate through the structure. The navigation is performed by the
container class' methods. Sometimes where you are in the
structure is stored internally in the container object (an internal
cursor), sometimes you have explicit 'cursors' to help you move around
the container (external cursors). All containers have methods to move
these internal and external cursors around the object. Of course
stacks and queues and similar structures do not, you can only
reference the topmost object (they are inherently non-navigable).
Also because, the node structure is hidden I was able to implement a
node suballocator scheme (called a TNodeStore; see the source) to
help me and the containers manage blocks of nodes, rather than just
one at a time. Because the nodes are all the same size for a given
container type (with two exceptions: TSkipList and TEZCollection), we
can take advantage of this and speed up allocations and deallocations
of nodes, compared with using the heap. So if you look at the
⌨️ 快捷键说明
复制代码
Ctrl + C
搜索代码
Ctrl + F
全屏模式
F11
切换主题
Ctrl + Shift + D
显示快捷键
?
增大字号
Ctrl + =
减小字号
Ctrl + -