kdtree.w,v
Lin-Kernighan heuristic for the TSP and minimum weight perfect matching
	case EUC_2D:
	case CEIL_2D:
	case ATT:
		return 1;
	default:
		return 0;
	}
}

@ We need the interface to the \module{READ} module.  In turn,
it needs the \module{LENGTH} module.

@<Early module headers@>=
#include "length.h"
#include "read.h"

@* $k$-d trees.
A $k$-d search tree is a tree over some set of points in $k$-dimensional
real-space, together with the partitioning properties given in the following
paragraphs.

Each internal node of the tree is associated with one of the $k$ dimensions,
called |cutdimen|, and also has a real-valued |cutvalue|.  All points in the
subtree rooted at the |lo_child| have their |cutdimen|-dimension value no
greater than |cutvalue|, and all points in the subtree rooted at the
|hi_child| have their value no less than |cutvalue|.

Each external node of the tree, known as a \term{bucket}@^bucket@>, contains
all the points that simultaneously satisfy all the constraints in the
internal nodes on the path from the bucket to the root.  These are stored as
indices |lo| and |hi| into |perm|, which itself is an array holding a
permutation of the point names.  The points in that bucket are |perm[lo]|
through |perm[hi-1]|.

There are other embellishments to the data structure, but we are ready to
start defining it now.  The short fields are placed at the end so all the
other fields are word-aligned.  This may speed things up on some
architectures.

@<Module type definitions@>=
typedef struct E2_node_s {
	@<Other common fields@>@;
	union {
		struct {
			int cutdimen;
			double cutvalue;
			struct E2_node_s *lo_child, *hi_child;
			@<Other internal node fields@>@;
		} i;
		struct {
			int lo, hi;
			@<Other external node fields@>@;
		} e;
	} f;
	char is_bucket;
	@<Other common short fields@>@;
} E2_node_t;

@ The permutation array |perm| is visible to everyone because we might make
use of it elsewhere.  For example, I have an idea that we might get better
performance if we reorganize the data according to a space-filling curve.
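The bucket representation just defined (indices |lo| and |hi| into |perm|) can be sketched in plain \CEE/.  This is an illustrative sketch, not the module's actual code: the |bucket| struct and helper names here are hypothetical simplifications of the external-node fields above.

```c
#include <assert.h>

/* Hypothetical, simplified bucket: indices [lo, hi) into the perm array. */
struct bucket { int lo, hi; };

/* Number of points a bucket holds: perm[lo] through perm[hi-1]. */
static int bucket_size(const struct bucket *b) {
    return b->hi - b->lo;
}

/* Copy the point names of bucket b into out[], returning the count. */
static int bucket_points(const struct bucket *b, const int *perm, int *out) {
    int i, n = 0;
    for (i = b->lo; i < b->hi; i++)
        out[n++] = perm[i];
    return n;
}
```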
This would move together in memory those points which are close together
geographically.  I have some notions on a particular curve to follow, and
have a hunch that it provides an optimal layout, given the kind of access
pattern that Lin-Kernighan exhibits.  This would be a very nice theorem to
prove.  It's ``future work''.  I briefly mentioned this reorganization idea
during my Qualifying Oral exam.  But I didn't speculate on
optimality$\ldots$

@<Global variables@>=
int *perm;

@ The maximum number of points allowed in a bucket, recorded in the
variable |kd_bucket_cutoff|, is chosen to be a small positive integer.
Bentley found that a value of 5 works well.  @^Bentley, Jon Louis@>
Now, a cutoff of 5 works well on Bentley's implementation, which
constitutes a machine (a VAX-8550), a compiler, and his program, \etc.
Should we use a different value on a modern RISC machine with a different
compiler and memory architecture?  We may want to experiment with this
value, which is why we put it in a variable and not in a compile-time
constant.  After a little bit of experimenting, 10 seems faster, reducing
``nq 10'' times on my machine for pla7397 from 7.3 seconds down to 5.2
seconds.

@<Global variables@>=
int kd_bucket_cutoff=10;

@ @<Exported variables@>=
extern int kd_bucket_cutoff;

@ Now, Bentley showed that bottom-up accesses are much faster in practice
@^Bentley, Jon Louis@>
than top-down searches.  In particular, his implementation performs an
all-nearest-neighbours computation on a point set uniform in a square
using bottom-up accesses in linear time, \ie, constant amortized time per
lookup.
He presents some theoretical arguments as to why things should turn out
this way.  In addition, Bentley's implementation takes only linear time to
compute a nearest-neighbour tour on such distributions.  Finding such a
tour is more involved than all-nearest-neighbours because, first, one
hides each point as it acquires a neighbour, and second, at the end of the
computation the unhidden points are likely to be far away both in the
Euclidean metric and in the tree.

In contrast, a top-down search takes at least logarithmic time on average:
one must traverse the path from the root to the bucket containing the
point.  In a tree whose arity is bounded by a constant, the average depth
of an item is at least logarithmic in the number of items.

The upshot is that we put a parent pointer in each node.

@<Other common fields@>=
struct E2_node_s *parent;

@ We need a way to hide entire subtrees at a time.  This is done by
setting the |hidden| field to a true value.  An invariant we must maintain
is that if a node is hidden, then its children (if any) are also hidden.

@<Other common short fields@>=
#if !defined(KD_NO_HIDDEN_BIT)
char hidden;
#endif

@ In a bottom-up search that hopes to visit a sub-logarithmic number of
nodes, we need to know when to stop climbing the tree.  This is done by
checking against a description of the bounding box of that subtree.  For
example, if we desire a nearest neighbour to $i$ that we know is within
$r$ units (say, because we already have another neighbour that is $r$
units away from $i$), then we stop once the ball of radius $r$ centred
around point $i$ is fully contained by the bounding box of the subtree.

Bentley's experiments show that things actually run faster if we only
@^Bentley, Jon Louis@>
occasionally compare to this box.  For speed, he suggests comparing only
at every third level in the tree.  In terms of defining our data
structure, if we compare only occasionally, then we should store only
occasionally as well.
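The bottom-up stopping rule just described can be sketched as a climb along parent pointers.  This is a hedged sketch under simplified assumptions: the |node_t| and |box_t| types here are stand-ins for the module's real structures, and every node is assumed to carry a box rather than only every few levels.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified node and box types for this sketch. */
typedef struct { double xmin, xmax, ymin, ymax; } box_t;
typedef struct node_s { struct node_s *parent; box_t bbox; } node_t;

/* True when the ball of radius r about (x,y) lies inside box b. */
static int ball_in_box(const box_t *b, double x, double y, double r) {
    return x - r >= b->xmin && x + r <= b->xmax &&
           y - r >= b->ymin && y + r <= b->ymax;
}

/* Climb from a bucket toward the root, stopping at the first subtree
   whose bounding box contains the query ball; no node outside that
   subtree can hold a closer neighbour, so the climb can stop there. */
static node_t *climb(node_t *start, double x, double y, double r) {
    node_t *p = start;
    while (p->parent != NULL && !ball_in_box(&p->bbox, x, y, r))
        p = p->parent;
    return p;
}
```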
One possible scheme is at each level to store a pointer to the
corresponding bounding box; a missing bounding box would be recorded as a
|NULL| pointer.  We can store a $k$-dimensional bounding box in $2k$
words, and a pointer is stored in one word.  Assuming that memory
allocator space overhead is negligible and that we store a bounding box
for every $s$ nodes (this is simpler than analysing every $s$ {\it
levels}---why?), this scheme uses $s+2k$ words where the obvious scheme
would use $2sk$.  For the 2-d case and one bounding box every three nodes,
the clever scheme uses 11 words (two |NULL| box pointers, one valid box
pointer, and one box structure) for every 24 words (three box structures)
in the original scheme.

This analysis assumes that the coordinate type and the pointer type take
the same amount of space.  Using |double| and ordinary \CEE/ pointers,
this is valid on most 32-bit architectures.  (Will Microsoft's |near| and
|far| pointers@^Microsoft@>
disappear by the time you read this?  Probably.)  I think things change on
64-bit architectures and higher, but I'm willing to sweat the difference.

So let's go with the clever scheme.  In particular, a pointer to a
possible bounding box gets placed in each node.  Question: would I save
any space if I placed them only in internal nodes?

@<Other common fields@>=
E2_box_t *bbox;

@ Of course, we need a definition of a 2-d box.

@<Early module type definitions@>=
typedef struct {
	double xmin,xmax,ymin,ymax;
} E2_box_t;

@ The space analysis we've done assumes that the per-box space overhead of
the memory allocator is negligible.  This isn't the case if we use the
general purpose |malloc| and |free| routines, and by extension, the
|new_of| routine that uses them.
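The word counts in the space analysis above can be checked with a little arithmetic.  This sketch (not part of the module) parameterizes the cost of a coordinate in words, because the $s+2k$ formula charges a coordinate one word while the 11-versus-24 example effectively charges a |double| two words:

```c
#include <assert.h>

/* Words used per s nodes under the pointer-per-node scheme: s box
   pointers (one word each) plus one box of 2k coordinates. */
static int clever_words(int s, int k, int coord_words) {
    return s + 2 * k * coord_words;
}

/* Words used per s nodes when every node stores its own box. */
static int obvious_words(int s, int k, int coord_words) {
    return s * 2 * k * coord_words;
}
```

With one-word coordinates this reproduces the $s+2k$ versus $2sk$ formulas; with two-word |double| coordinates it reproduces the 11-versus-24 figures.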
However, we can make the per-box space overhead as small as we like by
using the pool-oriented allocator of module \module{POOL} and picking a
suitably large minimum block size.

@d new_box() ((E2_box_t *)pool_alloc(box_pool))
@d free_box(P) pool_free(box_pool,(P))

@ Of course, we need the interface to \module{POOL}.

@<Module headers@>=
#include "pool.h"

@ I'll use a global variable to keep track of the number of levels to skip
between bounding boxes.  This allows us to experiment with different
values, for example by setting it via a command-line option.  The default
is Bentley's suggestion of every third level.
@^Bentley, Jon Louis@>

@<Global variables@>=
int kd_bbox_skip=3;

@ @<Exported variables@>=
extern int kd_bbox_skip;

@* The opportunity of equal elements.
Bentley experimented with various point-set distributions and discovered
@^Bentley, Jon Louis@>
that his implementation of $k$-d trees behaved pretty much the same way as
on sets where the points were drawn uniformly over the unit square.  When
he used a simple partitioning method, there was one glaring exception to
this rule: the ``spokes'' distribution forced his program to touch four
times as many tree nodes.  A 2-d spokes distribution has half the points
with $x=1/2$ and with $y$ chosen uniformly from $[0,1]$, and the other
half of the points with $y=1/2$ and $x$ chosen uniformly from $[0,1]$;
this results in a large ``plus'' sign: $+$.

In the $k$-d search tree paper, Bentley's fix (see section 7) was to use a
more elaborate partitioning algorithm based on a theorem of Bentley and
Shamos INSERT REFERENCE (and also described by Preparata and Shamos INSERT
REFERENCE) that guarantees the existence of a good so-called nearest
neighbour cut plane given that the point set obeys a certain sparseness
criterion.
Given these conditions on the input, the partitioning algorithm takes
$O(n \log n)$ time, for fixed $k$.  However, Bentley himself admits that,
although better than naive partitioning, this elaborate partitioning
scheme is not completely robust.  In fact, Papadimitriou and Bentley
INSERT REFERENCE have shown that some inputs have {\it no\/} good nearest
neighbour cut planes.

@ I propose a different solution to this problem.  Ironically, it is based
on observations made by Bentley and McIlroy in
@^Bentley, Jon Louis@>
@^McIlroy, M.~Douglas@>
``Engineering a Sort Function'', {\sl Software---Practice and Experience},
Vol 23(11), 1249--1265, November 1993.  There, they describe their
experience of writing a new library implementation of Quicksort.
@^Quicksort@>
(This is an excellent paper.  Go read it.)

Part of their work was to recognize the fact that it often pays to treat
specially those elements that compare equal to the pivot.  In an
implementation of Quicksort, if one partitions elements into three
classes---those less than, equal to, and greater than the pivot---then one
often saves many recursive calls if a significant fraction of the elements
compare equal to the pivot.  In fact, the spokes distribution in the $2$-d
partitioning problem exhibits exactly this kind of behaviour: in a given
dimension, half the points compare equal to the pivot.  So I propose the
same solution for the $k$-d partitioning problem as Bentley and McIlroy
describe for Quicksort: treat equal elements specially, by partitioning
into three classes.  Bentley and McIlroy call this \term{fat
partitioning}@^fat partitioning@>, and they observe that the task is
equivalent to Dijkstra's `Dutch National Flag' problem.
@^Dijkstra, Edsger W.@>
We'll hear more of this later when we actually do the partitioning.

To implement this, we add another child to each node.  (Is this a record?
Eight paragraphs for one variable declaration?)

@<Other internal node fields@>=
struct E2_node_s *eq_child;

@ Unfortunately, there is a small fly in the ointment: what if there are
many points with exactly the same coordinates?
@^degenerate inputs@>
If we don't check for this problem, then we will soon come to a point
where we endlessly recurse on the equal elements---there being no
others---trying to break them into buckets with no more than
|kd_bucket_cutoff| points each.  The solution is to never flatten a
dimension twice in the same path from a bucket to the root.  This is
implemented by passing a bit mask down the recursion tree.  If there are a
large number of points with exactly the same coordinates, then we get poor
performance on the queries.  But gee whiz, what are we supposed to do?
You deserve what you get if you don't preprocess the input to remove
duplicates.  So there!

@* Creating the tree.
Although we're not quite finished defining the data structure, we now know
enough about it to write most of the code that builds it.  We'll put in
placeholders to be filled in later when we introduce the necessary
concepts.  Here's the outline of the creation routine.

@<Subroutines@>=
void
E2_create(tsp_instance_t *tsp) {
	errorif(!E2_supports(tsp), "2-d trees may not be used for this instance");
	@<Allocate structures@>@;
	@<Build the tree@>@;
}

@ The first step is to allocate the space for the tree.  In general, the
buckets have an irregular number of items in them, so we can't allocate
all the tree nodes in advance: we don't even know how many we'll use.  So
we'll have to settle for creating the pool within which we'll be
allocating the tree nodes.
Let's allocate them in 500-node blocks.  We also need a pool for the
bounding boxes.

@d new_node() ((E2_node_t *)(pool_alloc(node_pool)))
@d free_node(P) pool_free(node_pool,(P))

@<Allocate structures@>=
node_pool = pool_create(sizeof(E2_node_t),500);
box_pool = pool_create(sizeof(E2_box_t),500);

@ Of course, we'll need to declare these variables.

@<Module variables@>=
static pool_t *node_pool, *box_pool;

@ We also need to allocate the permutation array.  We can allocate this
one all in one swoop because it has $n$ entries, where $n$ is the number
of cities.  Let's cheat a bit and initialize the array as well.  It starts
out with the identity permutation.

@<Allocate structures@>=
n = tsp->n;
perm = new_arr_of(int,n);
{int i; for (i=0;i<n;i++) perm[i]=i;}

@ @<Module variables@>=
static int n;

@ We need the interface to the ordinary memory allocator.

@<Module headers@>=
#include "error.h"
#include "memory.h"

@ While we're thinking about memory allocation, let's take care of
deallocation.

@<Subroutines@>=
void
E2_destroy(void) {
	pool_destroy(node_pool);
	pool_destroy(box_pool);
	box_pool = node_pool = NULL; /* Let's be defensive. */
	free_mem(perm);
	@<Other deallocation@>@;
}

@ We've used the constant |NULL|, which is written in ordinary \CEE/ text
as \hbox{|NU|}|LL|.  This constant is defined in the \file{stddef.h}
header file.  While we're including it, let's include some other standard
header files.

@<System headers@>=
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include "fixincludes.h"

@ The process of building the tree is much like executing a Quicksort
routine.  We pick a pivot element, and then partition the points into
three classes: points preceding the pivot go in the |lo_child| subtree,
points matching the pivot go in the |eq_child| subtree, and points
following the pivot go in the |hi_child| subtree.
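The three-way split just described is exactly Dijkstra's Dutch National Flag problem, and can be sketched as follows.  This is an illustrative stand-alone sketch partitioning raw coordinates, not the module's actual partitioning code, which works through |perm|.

```c
#include <assert.h>

/* Fat (three-way) partition of a[0..n-1] around pivot: on return,
   a[0..*lt-1] < pivot, a[*lt..*gt-1] == pivot, a[*gt..n-1] > pivot.
   This is Dijkstra's Dutch National Flag scheme. */
static void fat_partition(double *a, int n, double pivot, int *lt, int *gt) {
    int lo = 0, i = 0, hi = n;
    while (i < hi) {
        if (a[i] < pivot) {
            double t = a[lo]; a[lo] = a[i]; a[i] = t;  /* grow the < region */
            lo++; i++;
        } else if (a[i] > pivot) {
            hi--;
            double t = a[i]; a[i] = a[hi]; a[hi] = t;  /* grow the > region */
        } else {
            i++;                                       /* leave equals in place */
        }
    }
    *lt = lo;
    *gt = hi;
}
```

On a spokes-like input, where half the coordinates equal the pivot, the equal region absorbs those points in one pass instead of forcing further recursion.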
Then we recursively build the three subtrees.  However, the $k$-d
tree-building problem has the extra complication of picking the dimension
along which we partition, |cutdimen|.  Bentley
@^Bentley, Jon Louis@>
suggests partitioning along the dimension which has maximum spread.  We do
that here.  Note that the bookkeeping to determine spreads can be folded
into the partitioning at the previous level.
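Bentley's maximum-spread rule can be sketched as a direct scan over the points.  This sketch takes an $n$-by-$k$ row-major coordinate array for clarity; the module itself folds this bookkeeping into the previous level's partitioning rather than rescanning.

```c
#include <assert.h>

/* Pick the cut dimension by maximum spread: the dimension whose
   coordinates span the widest interval.  pts is an n-by-k array of
   coordinates stored row by row; n must be at least 1. */
static int max_spread_dimen(const double *pts, int n, int k) {
    int d, i, best = 0;
    double best_spread = -1.0;
    for (d = 0; d < k; d++) {
        double mn = pts[d], mx = pts[d];
        for (i = 1; i < n; i++) {
            double v = pts[i * k + d];
            if (v < mn) mn = v;
            if (v > mx) mx = v;
        }
        if (mx - mn > best_spread) {
            best_spread = mx - mn;
            best = d;
        }
    }
    return best;
}
```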
