⭐ 欢迎来到虫虫下载站! | 📦 资源下载 📁 资源专辑 ℹ️ 关于我们
⭐ 虫虫下载站

📄 kctree.cpp

📁 高效的k-means算法实现
💻 CPP
📖 第 1 页 / 共 3 页
字号:
////	The key to pruning the set of candidates for each node is//	handled by two functions.  The function nearCand() finds the//	candidate that is nearest to the midpoint of the cell.  The//	function pruneTest() determines whether another candidate is//	close enough to the cell to be closer to some part of the cell//	than the nearest candidate.//----------------------------------------------------------------------void KCtree::getNeighbors(		// compute neighbors for centers    KMfilterCenters& ctrs)			// the centers{    initDistGlobals(ctrs);			// initialize globals    int *candIdx = new int[kcKCtrs];		// allocate center indices    for (int j = 0; j < kcKCtrs; j++) {		// initialize everything    	candIdx[j] = j;				// initialize indices    }    root->getNeighbors(candIdx, kcKCtrs);	// get neighbors for tree    delete [] candIdx;				// delete center indices    deleteDistGlobals();			// delete globals}//----------------------------------------------------------------------void KCsplit::getNeighbors(		// get neighbors for internal node    KMctrIdxArray	cands,			// candidate centers    int			kCands)			// number of centers{    if (kCands == 1) {				// only one cand left?						// post points as neighbors    	postNeigh(this, sum, sumSq, n_data, cands[0]);    }    else {    						// get closest cand to box	int cc = closestToBox(cands, kCands, bnd_box);	KMctrIdx closeCand = cands[cc];		// closest candidate index						// space for new candidates	KMctrIdxArray newCands = new KMctrIdx[kCands];	int newK = 0;				// number of new candidates	for (int j = 0; j < kCands; j++) {	    if (j == cc || !pruneTest(		// is candidate close enough?	    			kcCenters[cands[j]],	    			kcCenters[closeCand],				bnd_box)) {	    	newCands[newK++] = cands[j];	// yes, keep it	    }	}						// apply to children	child[KM_LO]->getNeighbors(newCands, newK);	child[KM_HI]->getNeighbors(newCands, newK);	delete [] newCands;			// delete new candidates    }}//----------------------------------------------------------------------void KCleaf::getNeighbors(		// get neighbors for leaf node    KMctrIdxArray	cands,			// candidate centers    int			kCands)			// number of centers{    if (kCands == 1) {				// only one cand left?						// post points as neighbors    	postNeigh(this, sum, sumSq, n_data, cands[0]);    }    else {					// find closest centers	for (int i = 0; i < n_data; i++) {	// for each point in bucket	    KMdist minDist = KM_DIST_INF;	// distance to nearest point	    int minK = 0;			// index of this point	    KMpoint thisPt = kcPoints[bkt[i]];	// this data point	    for (int j = 0; j < kCands; j++) {	// compute closest candidate		KMdist dist = kmDist(kcDim, kcCenters[cands[j]], thisPt);        	if (dist < minDist) {		// best so far?        	    minDist = dist;		// yes, save it		    minK = j;			// ...and its index		}	    }    	    postNeigh(this, kcPoints[bkt[i]], sumSq, 1, cands[minK]);	}    }}//----------------------------------------------------------------------// getAssignments //	This determines the assignments of the closest center to each of//	the data points.  It is basically a structural copy of the//	procedure getNeighbors, but rather than incrementing the various//	sums and sums of squares, it simply records the assignment of//	each data point to its closest center.  Unlike the filtering//	search, when only one candidate remains, it does not stop the//	search, but continues to traverse all the leaves descended from//	this node in order to perform the assignments.//----------------------------------------------------------------------void KCtree::getAssignments(		// compute assignments for points    KMfilterCenters&    ctrs,			// the current centers    KMctrIdxArray 	closeCtr,		// closest center per point    double*	 	sqDist)			// sq'd distance to center{    initDistGlobals(ctrs);			// initialize globals    int *candIdx = new int[kcKCtrs];		// allocate center indices    for (int j = 0; j < kcKCtrs; j++) {		// initialize everything    	candIdx[j] = j;				// initialize indices    }    						// search the tree    root->getAssignments(candIdx, kcKCtrs, closeCtr, sqDist);    delete [] candIdx;				// delete center indices    deleteDistGlobals();			// delete globals}//----------------------------------------------------------------------void KCsplit::getAssignments(		// get assignments for internal node    KMctrIdxArray	cands,			// candidate centers    int			kCands,			// number of centers    KMctrIdxArray 	closeCtr,		// closest center per point    double*	 	sqDist)			// sq'd distance to center{    if (kCands == 1) {				// only one cand left?						// no more pruning needed	child[KM_LO]->getAssignments(cands, kCands, closeCtr, sqDist);	child[KM_HI]->getAssignments(cands, kCands, closeCtr, sqDist);    }    else {    						// get closest cand to box	int cc = closestToBox(cands, kCands, bnd_box);	KMctrIdx closeCand = cands[cc];		// closest candidate index						// space for new candidates	KMctrIdxArray newCands = new KMctrIdx[kCands];	int newK = 0;				// number of new candidates	for (int j = 0; j < kCands; j++) {	    if (j == cc || !pruneTest(		// is candidate close enough?	    			kcCenters[cands[j]],	    			kcCenters[closeCand],				bnd_box)) {	    	newCands[newK++] = cands[j];	// yes, keep it	    }	}						// apply to children	child[KM_LO]->getAssignments(newCands, newK, closeCtr, sqDist);	child[KM_HI]->getAssignments(newCands, newK, closeCtr, sqDist);	delete [] newCands;			// delete new candidates    }}//----------------------------------------------------------------------void KCleaf::getAssignments(		// get assignments for leaf node    KMctrIdxArray	cands,			// candidate centers    int			kCands,			// number of centers    KMctrIdxArray 	closeCtr,		// closest center per point    double*	 	sqDist)			// sq'd distance to center{    for (int i = 0; i < n_data; i++) {		// for each point in bucket	KMdist minDist = KM_DIST_INF;		// distance to nearest point	int minK = 0;				// index of this point	KMpoint thisPt = kcPoints[bkt[i]];	// this data point	for (int j = 0; j < kCands; j++) {	// compute closest candidate	    KMdist dist = kmDist(kcDim, kcCenters[cands[j]], thisPt);	    if (dist < minDist) {		// best so far?		minDist = dist;			// yes, save it		minK = j;			// ...and its index	    }	}	if (closeCtr != NULL) closeCtr[bkt[i]] = cands[minK];	if (sqDist != NULL) sqDist[bkt[i]] = minDist;    }}//----------------------------------------------------------------------//  Local utilities//----------------------------------------------------------------------//----------------------------------------------------------------------//  closestToBox - compute the closest point to the box//	This procedure is given a list of candidates (cands), the number//	of candidates (kCands), and a cell (bnd_box), and returns the//	index (in cands) of the element of cands that is closest to the//	midpoint of the cell.  The global variable kcBoxMidpt is used to//	store the cell midpoint.//----------------------------------------------------------------------static int closestToBox(		// get closest point to box center    KMctrIdxArray	cands,			// candidates for closest    int			kCands,			// number of candidates    KMorthRect		&bnd_box)		// bounding box of cell{    for (int d = 0; d < kcDim; d++) {		// compute midpoint	kcBoxMidpt[d] = (bnd_box.lo[d] + bnd_box.hi[d])/2;    }    KMdist minDist = KM_DIST_INF;		// distance to nearest point    int minK = 0;				// index of this point    for (int j = 0; j < kCands; j++) {		// compute dist to each point        KMdist dist = kmDist(kcDim, kcCenters[cands[j]], kcBoxMidpt);        if (dist < minDist) {			// best so far?            minDist = dist;			// yes, save it	    minK = j;				// ...and its index	}    }    return minK;				// return closest index}//----------------------------------------------------------------------//  pruneTest - determine whether a point should be pruned//	This procedure is given a cell of the kc-tree (bnd_box or B),//	and candidate (cand or c) and the closest candidate to the//	cell (closeCand or c').  It determines whether the entire//	cell is closer to c' than it is to c.////	The procedure works by considering the relationship between//	two vectors.  Let (a.b) denote the dot product of vectors a//	and b.  Observe that a point p is closer c than to c' if and//	only if////		(p-c).(p-c) < (p-c').(p-c')//	//	after simple manipulations this is true if and only if////		(c-c').(c-c') < 2(p-c').(c-c').//	//	We want to know whether this relation is satisfied for any//	point p in B.  If so then it is satisfied for the point p//	in B that maximizes (p-c').(c-c').  Observe that p will be//	a vertex of B.  To determine p, we consider the sign of each//	coordinate of (c-c').  If the d-th coordinate is positive then//	we set p[d] = B.hi[d], and otherwise we set it to B.lo[d].////	NOTE: This procedure assumes that Euclidean distances are//	used.//----------------------------------------------------------------------static bool pruneTest(    KMcenter		cand,			// candidate to test    KMcenter		closeCand,		// closest candidate    KMorthRect		&bnd_box)		// bounding box{    double boxDot = 0;				// holds (p-c').(c-c')    double ccDot = 0;				// holds (c-c').(c-c')    for (int d = 0; d < kcDim; d++) {    	double ccComp = cand[d] - closeCand[d];	// one component c-c'	ccDot += ccComp * ccComp;		// increment dot product	if (ccComp > 0) {			// candidate on high side	   					// use high side of box	   boxDot += (bnd_box.hi[d] - closeCand[d]) * ccComp;	}	else {					// candidate on low side	   					// use low side of box	   boxDot += (bnd_box.lo[d] - closeCand[d]) * ccComp;	}    }    return (ccDot >= 2*boxDot);			// return final result}//----------------------------------------------------------------------// postNeigh - registers neighbors for a given candidate//	This procedure registers a set of points as neighbors//	of a given center (cand).  The points are represented by//	their sum and sum of squares.  A pointer to the//	node doing the posting is passed along, but it is used//	only if tracing.//----------------------------------------------------------------------static void postNeigh(    KCptr		p,			// the node posting    KMpoint		sum,			// the sum of coordinates    double		sumSq,			// the sum of squares    int			n_data,			// number of points    KMctrIdx		ctrIdx)			// center index{    for (int d = 0; d < kcDim; d++) {			// increment sum	kcSums[ctrIdx][d] += sum[d];    }    kcWeights[ctrIdx] += n_data;			// increment weight    kcSumSqs[ctrIdx] += sumSq;				// incr sum of squares}

⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -