ann_test.cpp

来自「c++实现的KNN库:建立高维度的K-d tree,实现K邻域搜索」· C++ 代码 · 共 1,641 行 · 第 1/5 页
CPP
1,641 行
//----------------------------------------------------------------------//	File:			ann_test.cpp//	Programmer:		Sunil Arya and David Mount//	Description:	test program for ANN (approximate nearest neighbors)//  Last modified:	08/04/06 (Version 1.1.1)//----------------------------------------------------------------------// Copyright (c) 1997-2005 University of Maryland and Sunil Arya and// David Mount.  All Rights Reserved.// // This software and related documentation is part of the Approximate// Nearest Neighbor Library (ANN).  This software is provided under// the provisions of the Lesser GNU Public License (LGPL).  See the// file ../ReadMe.txt for further information.// // The University of Maryland (U.M.) and the authors make no// representations about the suitability or fitness of this software for// any purpose.  It is provided "as is" without express or implied// warranty.//----------------------------------------------------------------------// History://	Revision 0.1  03/04/98//		Initial release//	Revision 0.2  06/26/98//		Added CLOCKS_PER_SEC definition if needed//	Revision 1.0  04/01/05//		Added comments (from "#" to eol)//		Added clus_orth_flats and clus_ellipsoids distributions//		Fixed order of fair and midpt in split_table//		Added dump/load operations//		Cleaned up C++ for modern compilers//	Revision 1.1  05/03/05//		Added fixed radius kNN search//	Revision 1.1.1 08/04/06//		Added planted distribution//----------------------------------------------------------------------#include <ctime>						// clock#include <cmath>						// math routines#include <string>						// C string ops#include <fstream>						// file I/O#include <ANN/ANN.h>					// ANN declarations#include <ANN/ANNx.h>					// more ANN declarations#include <ANN/ANNperf.h>				// performance evaluation#include "rand.h"						// random point generation#ifndef CLOCKS_PER_SEC					// define clocks-per-second if needed  #define CLOCKS_PER_SEC 1000000#endifusing namespace std;					// make std:: available//----------------------------------------------------------------------// ann_test//// This program is a driver for testing and evaluating the ANN library// for computing approximate nearest neighbors.  It allows the user to// generate data and query sets of various sizes, dimensions, and// distributions, to build kd- and bbd-trees of various types, and then// run queries and outputting various performance statistics.//// Overview:// ---------// The test program is run as follows:// //		ann_test < test_input > test_output//// where the test_input file contains a list of directives as described// below.  Directives consist of a directive name, followed by list of// arguments (depending on the directive).  Arguments and directives are// separated by white space (blank, tab, and newline).  String arguments// are not quoted, and consist of a string of nonwhite chacters.  A// character "#" denotes a comment.  The following characters up to// the end of line are ignored.  Comments may only be inserted between// directives (not within the argument list of a directive).//// Basic operations:// -----------------// The test program can perform the following operations.  How these// operations are performed depends on the options which are described// later.////		Data Generation://		----------------//		read_data_pts <file>	Create a set of data points whose//								coordinates are input from file <file>.//		gen_data_pts			Create a set of data points whose//								coordinates are generated from the//								current point distribution.////		Building the tree://		------------------//		build_ann				Generate an approximate nearest neighbor//								structure for the current data set, using//								the selected splitting rules.  Any existing//								tree will be destroyed.////		Query Generation/Searching://		---------------------------//		read_query_pts <file>	Create a set of query points whose//								coordinates are input from file <file>.//		gen_query_pts			Create a set of query points whose//								coordinates are generated from the//								current point distribution.//		run_queries <string>	Apply nearest neighbor searching to the//								query points using the approximate nearest//								neighbor structure and the given search//								strategy.  Possible strategies are://									standard = standard kd-tree search//									priority = priority search////		Miscellaneous://		--------------//		output_label			Output a label to the output file.//		dump <file>				Dump the current structure to given file.//								(The dump format is explained further in//								the source file kd_tree.cc.)//	  	load <file>				Load a tree from a data file which was//								created by the dump operation.	Any//								existing tree will be destroyed.//// Options:// --------// How these operations are performed depends on a set of options.// If an option is not specified, a default value is used. An option// retains its value until it is set again.  String inputs are not// enclosed in quotes, and must contain no embedded white space (sorry,// this is C++'s convention).//// Options affecting search tree structure:// ----------------------------------------//		split_rule <type>		Type of splitting rule to use in building//								the search tree.  Choices are://									kd			= optimized kd-tree//									midpt		= midpoint split//									fair		= fair split//									sl_midpt	= sliding midpt split//									sl_fair		= sliding fair split//									suggest		= authors' choice for best//								The default is "suggest".  See the file//								kd_split.cc for more detailed information.////		shrink_rule <type>		Type of shrinking rule to use in building//								a bd-tree data structure.  If "none" is//								given, then no shrinking is performed and//								the result is a kd-tree.  Choices are://									none		= perform no shrinking//									simple		= simple shrinking//									centroid	= centroid shrinking//									suggest		= authors' choice for best//								The default is "none".  See the file//								bd_tree.cc for more information.//		bucket_size <int>		Bucket size, that is, the maximum number of//								points stored in each leaf node.//// Options affecting data and query point generation:// --------------------------------------------------//		dim <int>				Dimension of space.//		seed <int>				Seed for random number generation.//		data_size <int>			Number of data points.  When reading data//								points from a file, this indicates the//								maximum number of points for storage//								allocation. Default = 100.//		query_size <int>		Same as data_size for query points.//		std_dev <float>			Standard deviation (used in gauss,//								planted, and clustered distributions).//								This is the "small" distribution for//								clus_ellipsoids.  Default = 1.//		std_dev_lo <float>		Low and high standard deviations (used in//		std_dev_hi <float>		clus_ellipsoids).  Default = 1.//		corr_coef <float>		Correlation coefficient (used in co-gauss//								and co_lapace distributions). Default = 0.05.//		colors <int>			Number of color classes (clusters) (used//								in the clustered distributions).  Default = 5.//		new_clust				Once generated, cluster centers are not//								normally regenerated.  This is so that both//								query points and data points can be generated//								using the same set of clusters.  This option//								forces new cluster centers to be generated//								with the next generation of either data or//								query points.//		max_clus_dim <int>		Maximum dimension of clusters (used in//								clus_orth_flats and clus_ellipsoids).//								Default = 1.//		distribution <string>	Type of input distribution//									uniform		= uniform over cube [-1,1]^d.//									gauss		= Gaussian with mean 0//									laplace		= Laplacian, mean 0 and var 1//									co_gauss	= correlated Gaussian//									co_laplace	= correlated Laplacian//									clus_gauss	= clustered Gaussian//									clus_orth_flats = clusters of orth flats//									clus_ellipsoids = clusters of ellipsoids//									planted		= planted distribution//								See the file rand.cpp for further information.//// Options affecting nearest neighbor search:// ------------------------------------------//		epsilon <float>			Error bound for approx. near neigh. search.//		near_neigh <int>		Number of nearest neighbors to compute.//		max_pts_visit <int>		Maximum number of points to visit before//								terminating.  (Used in applications where//								real-time performance is important.)//								(Default = 0, which means no limit.)//		radius_bound <float>	Sets an upper bound on the nearest//								neighbor search radius.  If the bound is//								positive, then fixed-radius nearest//								neighbor searching is performed, and the//								count of the number of points in the//								range is returned.  If the bound is//								zero, then standard search is used.//								This can only be used with standard, not//								priority, search.  (Default = 0, which//								means standard search.)//// Options affection general program behavior:// -------------------------------------------//		stats <string>			Level of statistics output//									silent		 = no output,//									exec_time	+= execution time only//									prep_stats	+= preprocessing statistics//									query_stats += query performance stats//									query_res	+= results of queries//									show_pts	+= show the data points//									show_struct += print search structure//		validate <string>		Validate experiment and compute average//								error.  Since validation causes exact//								nearest neighbors to be computed by the//								brute force method, this can take a long//								time.  Valid arguments are://									on			= turn validation on//									off			= turn validation off//		true_near_neigh <int>	Number of true nearest neighbors to compute.//								When validating, we compute the difference//								in rank between each reported nearest neighbor//								and the true nearest neighbor of the same//								rank.  Thus it is necessary to compute a//								few more true nearest neighbors.  By default//								we compute 10 more than near_neigh.  With//								this option the exact number can be set.//								(Used only when validating.)//// Example:// --------//	output_label test_run_0		# output label for this run//	  validate off				# do not perform validation//	  dim 16					# points in dimension 16//	  stats query_stats			# output performance statistics for queries//	  seed 121212				# random number seed//	  data_size 1000//	  distribution uniform//	gen_data_pts				# 1000 uniform data points in dim 16//	  query_size 100//	  std_dev 0.05//	  distribution clus_gauss//	gen_query_pts				# 100 points in 10 clusters with std_dev 0.05//	  bucket_size 2//	  split_rule kd//	  shrink_rule none//	build_ann					# kd-tree, bucket size 2//	  epsilon 0.1//	  near_neigh 5//	  max_pts_visit 100			# stop search if more than 100 points seen//	run_queries standard		# run queries; 5 nearest neighbors, 10% error//	  data_size 500//	read_data_pts data.in		# read up to 500 points from file data.in//	  split_rule sl_midpt//	  shrink_rule simple//	build_ann					# bd-tree; simple shrink, sliding midpoint split//	  epsilon 0//	run_queries priority		# run same queries; 0 allowable error////------------------------------------------------------------------------//------------------------------------------------------------------------//	Constants//------------------------------------------------------------------------const int		STRING_LEN		= 500;			// max string lengthconst double	ERR				= 0.00001;		// epsilon (for float compares)//------------------------------------------------------------------------//	Enumerated values and conversions//------------------------------------------------------------------------typedef enum {DATA, QUERY} PtType;		// point types//------------------------------------------------------------------------//	Statistics output levels//------------------------------------------------------------------------typedef enum {					// stat levels		SILENT,							// no output		EXEC_TIME,						// just execution time		PREP_STATS,						// preprocessing info		QUERY_STATS,					// query performance		QUERY_RES,						// query results		SHOW_PTS,						// show data points		SHOW_STRUCT,					// show tree structure		N_STAT_LEVELS}					// number of levels		StatLev;const char stat_table[N_STAT_LEVELS][STRING_LEN] = {		"silent",						// SILENT		"exec_time",					// EXEC_TIME		"prep_stats",					// PREP_STATS		"query_stats",					// QUERY_STATS		"query_res",					// QUERY_RES		"show_pts",						// SHOW_PTS		"show_struct"};					// SHOW_STRUCT//------------------------------------------------------------------------//	Distributions//------------------------------------------------------------------------typedef enum {					// distributions		UNIFORM,						// uniform over cube [-1,1]^d.		GAUSS,							// Gaussian with mean 0		LAPLACE,						// Laplacian, mean 0 and var 1		CO_GAUSS,						// correlated Gaussian		CO_LAPLACE,						// correlated Laplacian		CLUS_GAUSS,						// clustered Gaussian		CLUS_ORTH_FLATS,				// clustered on orthog flats		CLUS_ELLIPSOIDS,				// clustered on ellipsoids		PLANTED,						// planted distribution		N_DISTRIBS}		Distrib;
ann_test.cpp - 源码说明

本页面展示了「c++实现的KNN库:建立高维度的K-d tree,实现K邻域搜索」中的 ann_test.cpp 源码文件，采用 C++ 编程语言编写，共 1,641 行代码。您可以在线阅读完整代码内容，也可以返回资源详情页下载完整源码包进行本地学习和开发。
虫虫下载站收录了大量与tree相关的技术资源，包括源代码、技术文档、电路图等，是电子工程师和嵌入式开发者的专业学习平台。
⌨️ 快捷键说明

复制代码Ctrl + C
搜索代码Ctrl + F
全屏模式F11
增大字号Ctrl + =
减小字号Ctrl + -
显示快捷键?