📄 kdtree.w,v
字号:
decrement the |hi| index. To recover all the points in a bucket, wealso store a |hi_all| index, which is the value of |hi| when allthe points are visible.@@<Other external node fields@@>=int hi_all;@@ We need to initialize this field during tree creation.@@<Fill other bucket fields@@>=node->f.e.hi_all = hi;@@ In addition to moving the point to the end of its bucket, we need toensure the |hidden| fields of our ancestors are properly set.@@<Subroutines@@>=voidE2_hide(int c) { int ci; E2_node_t *node; errorif (c<0 || c>=n,"Invalid city %d to hide",c); node = E2_point_to_bucket[c]; @@<Make |perm[ci]==c|@@>@@; @@<Hide |ci|@@>@@; @@<If necessary, hide |node|'s ancestors@@>@@;}@@ Before we can hide city |c|, we must find it in its bucket. One option is to maintain an inverse of |perm|. However, this takesup linear extra space, and is a hassle to maintain.The option I've chosen here is to perform a sequential search through thebucket. For most buckets this is fast because, on average, this takes about|kd_bucket_cutoff/2| steps. With the default of 5 for |kd_bucket_cutoff|,this is 2.5 steps. That's small. I said `most buckets' because there is a fly in the ointment. In fact,this is the same fly and the same ointment that I mentioned earlier. When there aremany points with exactly the same coordinate, then the bucket storingthose points will be rather large. Again, degenerate inputs lead to@@^degenerate inputs@@>bad running times. I've decided to risk this tradeoff because I expectthese inputs to be very rare.@@^tradeoffs@@>@@<Make |perm[ci]==c|@@>={ int hi_all = node->f.e.hi_all; for ( ci=node->f.e.lo;ci<hi_all; ci++ ) { if ( perm[ci]==c ) break; } errorif(ci==hi_all,"Point %d not found in its bucket",c);}@@ For correctness' sake, we must check to see whether the city is alreadyhidden. (If the city is already hidden, then |hi| should not bedecremented.)Should we complain publicly if the city is already hidden? @@^public complaining@@>As a matter of defensive programming, the answer---currently---is yes.The algorithms that use this module shouldn't try to hide a hidden city.Complaining about it here would probably help me track down a bug.However, if this code is used in other contexts, then we may wantto eliminate this public complaining.@@<Hide |ci|@@>={ int t, hi = node->f.e.hi;if ( ci < hi ) { hi=--node->f.e.hi; t=perm[ci];@@+perm[ci]=perm[hi];@@+perm[hi]=t;} else { fprintf(stderr,"Hiding hidden city %d at perm[%d]\n",c,ci);}} @@ We've used |fprintf|, and so we need the interface to the standardI/O library. % While we're at it,@@<System headers@@>=#include <stdio.h>@@ The next task is to set the |hidden| fields in our ancestors properly.In particular, each node is an ancestor of itself. This simplifies the code nicely.We stop climbing on one of three conditions: when we run off the top of the tree, when we hit an ancestor that is already hidden, or when not all the nodes in the subtree rootedat that node are hidden.To simplify the loop test, we put it at the end of the loop, so it only applies to internal nodes.Remember that each internal node is ternary:none of its three children are null pointers, |NULL|, though they may be empty.@@<If necessary, hide |node|'s ancestors@@>=#if !defined(KD_NO_HIDDEN_BIT)if ( node->f.e.lo == node->f.e.hi && !node->hidden ) { do { node->hidden = 1; /* |printf("h");| */ node=node->parent; } while ( node && !node->hidden && node->f.i.lo_child->hidden && node->f.i.eq_child->hidden && node->f.i.hi_child->hidden );}#endif@@ The code for hiding a city is now complete. It's now timeto turn our attention to unhiding a city.Luckily for us, we get to reuse some of the code we've just seen, namelythe code that finds $\hbox{|perm|}^{-1}(c)$. The other sections are easy.@@<Subroutines@@>=voidE2_unhide(int c) { int ci; E2_node_t *node = E2_point_to_bucket[c]; @@<Make |perm[ci]==c|@@>@@; @@<Unhide |ci|@@>@@; @@<Unhide |node| and its ancestors@@>@@;}@@ As with hiding, we must first check to see if the city is already unhidden.We complain publicly if this is the case.@@^public complaining@@>@@<Unhide |ci|@@>={ int t, hi = node->f.e.hi;if ( ci >= hi ) { t=perm[ci];@@+perm[ci]=perm[hi];@@+perm[hi]=t; node->f.e.hi++;} else { fprintf(stderr,"Unhiding unhidden city %d at perm[%d]\n",c,ci);}} @@Unhiding ancestors is analogous hiding ancestors, but simpler.We climb up the tree, setting |hidden| fields to false, and stop whenwe either run off the top or reach an ancestor that is already unhidden.@@<Unhide |node| and its ancestors@@>=#if !defined(KD_NO_HIDDEN_BIT)while ( node && node->hidden ) { node->hidden = 0; node=node->parent;}#endif@@ Now we are ready to see how to hide and unhide all the points in the set.One option is to call the individual operation on every point in the set.This works, but has a high overhead. We can reduce the overhead by specially codingthese bulk operations. This is the course I've chosen.@@ We'll start with hiding. We use a recursive routine to unhide allthe subtrees. We use a helper routine that hides an entire subtree.@@<Subroutines@@>=void E2_hide_all(void) {#if !defined(KD_NO_HIDDEN_BIT) if ( E2_root && !E2_root->hidden ) E2_hide_all_helper(E2_root);#else if ( E2_root ) E2_hide_all_helper(E2_root);#endif}@@ The recursive routine sets the |hidden| field, and then splitsinto two cases.Inside a bucket, it moves the live-range index |hi| down to |lo|.At an internal node, it calls itself recursively.One might worry about overflowing the runtime stack. With a bit of care, we'll onlyoverflow the stack here if we've already overflowed it in building the tree itself. The `bit of care' is to process the largest segment last.Unfortunately, we've lost that information, and it's noteasily recoverable. Yikes! Oh well,let's recurse blindly.@@^stack overflow@@>An extra optimization is to only recurse down subtrees which arenot currently hidden.@@<Module subroutines@@>=static void E2_hide_all_helper(E2_node_t *p) {#if !defined(KD_NO_HIDDEN_BIT) p->hidden = 1;#endif if ( p->is_bucket ) p->f.e.hi = p->f.e.lo; else { #if !defined(KD_NO_HIDDEN_BIT) if(!p->f.i.lo_child->hidden)E2_hide_all_helper(p->f.i.lo_child); if(!p->f.i.eq_child->hidden)E2_hide_all_helper(p->f.i.eq_child); if(!p->f.i.hi_child->hidden)E2_hide_all_helper(p->f.i.hi_child);#else E2_hide_all_helper(p->f.i.lo_child); E2_hide_all_helper(p->f.i.eq_child); E2_hide_all_helper(p->f.i.hi_child);#endif }}@@ Unhiding is very similar to hiding.The only subtlety is that we have no option but to always recurse:there may be hidden cities in subtrees whose |hidden| bit is turned off.@@<Subroutines@@>=void E2_unhide_all(void) { if ( E2_root ) E2_unhide_all_helper(E2_root);}@@ This time, a bucket's live-range index |hi| gets set to |hi_all|,the value it has when all its points are unhidden. At an internal node,we have no option but to recurse into the children.@@<Module subroutines@@>=static void E2_unhide_all_helper(E2_node_t *p) { if ( p->is_bucket ) { p->f.e.hi = p->f.e.hi_all;#if !defined(KD_NO_HIDDEN_BIT) p->hidden = p->f.e.lo >= p->f.e.hi;#endif } else {#if !defined(KD_NO_HIDDEN_BIT) p->hidden = 0;#endif E2_unhide_all_helper(p->f.i.lo_child); E2_unhide_all_helper(p->f.i.eq_child); E2_unhide_all_helper(p->f.i.hi_child); }}@@* Nearest neighbours.Now that we've built the tree, we're ready to describe the first query:nearest neighbour search. Given a city $i$, we'd like to find anunhidden city that is nearest to $i$, excluding $i$ of course.We proceed in bottom-up fashion, searching from the bucket containing $i$ and upward through the ever-larger subtrees that contain that bucket.The idea is that the nearest neighbour is likely to be found in a partof the tree that is near $i$'s bucket. With a candidate $j$ in hand, we terminate the walk to the root once we can prove that the ball of radius $\cost(i,j)$ lies entirely within the currentsubtree. Here, of course, one can use whatever metric is in effect inthe current context to define ``ball''; we aren't restricted to the Euclideanmetric.I should say up front that most of this code is taken almost verbatim fromBentley's {\sl $K$-d Trees for Semidynamic Point Sets}, Sixth Annual @@^Bentley, Jon Louis@@>ACM Symposium on Computational Geometry, Berkeley, CA, June 1990, pp.~187--197.@@ We also need quadrant-based nearest neighbour search. Most of the code is the same, so I've folded in the generalizationsrequired.I borrow from and slightly extend ordinary mathematicalterminology: if city |i| liesat $(x,y)$, thenquadrant 0 is point $(x,y)$;quadrant 1 is all points $(x',y')$ with $x'>x$ and $y'\ge y$;quadrant 2 is all points $(x',y')$ with $x'\le x$ and $y'>y$;quadrant 3 is all points $(x',y')$ with $x'<x$ and $y'\le y$;quadrant 4 is all points $(x',y')$ with $x'\ge x$ and $y'<y$.This nomenclature is presented in module \module{NN}.For the sake of generality, the core searching code will be able to handleany combination of the five quadrants by using bit masks.For exaple, to specify searching in quadrants 0, 1 and 4, a maskof |0x13| will is used.@@<Module variables@@>=static int quadrant_mask;@@ We use a recursive helper routine to check subtrees. Its job is to search an entire subtree for near neighbours.|E2_rnn| maintains a priority queue |E2_nn_bin| of the nearest |E2_nn_bin_size|neighbours to the target city. The |E2_nn_bin| is reverse-ordered,meaning the farthest vertex is considered the {\it least\/} elementin the queue. To save time passing state around, we define some external state.The city we're trying to find neighbours for is |E2_nn_seed|;the best city found so far is |E2_nn_incumbent|.We try to find up to |E2_nn_num| nearest neighbours.Because we're using the Euclidean metric, we can take Sproull's advice (see Bentley (INSERT REFERENCE)) and @@^Bentley, Jon Louis@@>@@^Sproull@@>compare squared distances instead of actual distances; this saves a square root operation each time we determine an edge cost. Thesquared distance to the farthest city in the bin is stored in |E2_nn_dist_sq|.Occasionally, however, we test against a bounding box whose radiusis the actual distance between the seed and the farthest neighbour inthe bin. This distance is stored in |E2_nn_dist|.We will often compute distances to the seed city, so it will also be convenient to remember its coordinates. They are stored in |E2_nn_seed_x|and |E2_nn_seed_y|.@@<Module variables@@>=static int E2_nn_seed, E2_nn_incumbent, E2_nn_bin_want_size;static double E2_nn_dist, E2_nn_dist_sq, E2_nn_seed_x, E2_nn_seed_y;static pq_t *E2_nn_bin;static kd_bin_t *E2_nn_bin_work=NULL;@@ We need the priority queue type.@@<Module headers@@>=#include "pq.h"@@ The elements of the priority queue are of type |kd_bin_t|.Each element contains the name of the vertex, |city|, and thesquared distance from the target to that city, |cost_squared|.@@<Exported type definitions@@>=typedef struct { double cost_squared; int city;} kd_bin_t;@@@@<Exported macros@@>=#define kd_bin_city(bin_pointer) ((bin_pointer)->city)#define kd_bin_monotonic_len(bin_pointer) ((bin_pointer)->cost_squared)@@ We need to allocate the bin itself and a workspace too.@@<Allocate structures@@>=E2_nn_bin = pq_create(kd_bin_cmp_decreasing);@@ We need to define a comparison function |bin_cmp|for elements of type |kd_bin_t|. Remember that it should make higher costscompare as lower.@@<Subroutines@@>=intkd_bin_cmp_increasing(const void *a, const void *b){ double da = ((const kd_bin_t*)a)->cost_squared; double db = ((const kd_bin_t*)b)->cost_squared; if ( da > db ) return 1; if ( da < db ) return -1; return ((const kd_bin_t*)a)->city @@| -((const kd_bin_t*)b)->city; /* Bring duplicates together. */}intkd_bin_cmp_decreasing(const void *a, const void *b){ double da = ((const kd_bin_t*)a)->cost_squared; double db = ((const kd_bin_t*)b)->cost_squared; if ( da < db ) return 1; if ( da > db ) return -1; return ((const kd_bin_t*)a)->city @@| -((const kd_bin_t*)b)->city; /* Bring duplicates together. */}@@ We also need to deallocatethe work array and the queue itself.@@<Other deallocation@@>=pq_destroy(E2_nn_bin);@@ The recursive routine that checks subtrees is named |E2_rnn|.It splits naturally into two cases: internal versus external nodes.@@<Module subroutines@@>=static void E2_rnn(E2_node_t *p) { if ( p->is_bucket ) { @@<Search this bucket for a nearer neighbour@@>@@; } else { @@<Search the children for a nearer neighbour@@>@@; }}@@ Inside a bucket, we perform a sequential search.@@<Search this bucket for a nearer neighbour@@>={ int i, hi=p->f.e.hi;for ( i=p->f.e.lo ; i<hi ; i++ ) { int pi = perm[i]; @@<Update nearest neighbours if |pi| is nearer@@>@@;}}@@I avoid rounding in the computation of the actual distance becauseI want this code to be applicable to both |EUC_2D| and |CEIL_2D|cost functions. Rounding would have implications on the boundingbox test. We should be careful in this respect, and not roundingis a conservative choice: it doesn't needlessly throw bits away.@@^rounding@@>If you need further convincing, then observe that both the rounding and the ceiling functions are monotonic on the real line. That is, for any real numbers$\alpha$ and $\beta$,$\alpha \le \beta$ % \Rightarrow implies both$\lfloor \alpha + 0.5\rfloor\le \lfloor \beta + 0.5\rfloor$ and$\lce
⌨️ 快捷键说明
复制代码
Ctrl + C
搜索代码
Ctrl + F
全屏模式
F11
切换主题
Ctrl + Shift + D
显示快捷键
?
增大字号
Ctrl + =
减小字号
Ctrl + -