
📄 knndd.m

📁 A data mining toolbox, latest version; I hope it is useful to people doing research in this area
💻 M
%KNNDD K-Nearest neighbour data description method.
%
%       W = KNNDD(A,FRACREJ,K,METHOD)
%
% Calculates the K-Nearest neighbour data description on dataset A.
% Three methods are defined to compute a distance to the dataset using
% the k-nearest neighbours:
%
% METHOD     does:
% 'kappa'      use distance to the k-th nearest neighbor
% 'delta'      distance to the average of the k-nn's
% 'gamma'      average distance to the k-nn's
%
% When no K is defined, it will be optimized using knn_optk; when it
% is zero or negative, sqrt(n) will be used.

% Copyright: D. Tax, davidt@ph.tn.tudelft.nl
% Faculty of Applied Physics, Delft University of Technology
% P.O. Box 5046, 2600 GA Delft, The Netherlands

function W = knndd(a,fracrej,k,method)

if nargin < 4, method = 'kappa'; end
if nargin < 3, k = []; end
if nargin < 2 | isempty(fracrej), fracrej = 0.05; end
if nargin < 1 | isempty(a) % empty knndd
    W = mapping(mfilename,{fracrej,k,method});
    W = setname(W,'K-Nearest neighbour data description');
    return
end

if ~ismapping(fracrej)           %training

    % some checking of datatypes and sizes:
    a = +target_class(a);  % make sure we have a OneClass dataset
    [m,d] = size(a);
    if (m<2)
        warning([mfilename ': Dataset contains less than 2 objects']);
    end
    if (k>=m)
        error(['More neighbors than training samples are requested! (max=',...
                num2str(m-1),')']);
    end
    if isa(k,'char')
        error('Argument k should define the number of neighbors');
    end

    % the most important thing:
    distmat = sqeucldistm(a,a);

    % if k is not defined, find the optimal k optimizing the loglikelihood:
    if isempty(k)
        k = knn_optk(distmat,d);
    else  %tricky, when k<=0 we use the default sqrt(n) solution...
        if (k<=0)
            k = round(sqrt(m));
        end
    end
    if (k<1)
        warning([mfilename ': K must be positive (>0)']);
    end
    [sD,I] = sort(distmat,2);

    % different treatment by different methods:
    % (column 1 of sD is each object's zero distance to itself, so the
    % neighbours are taken from columns 2..k+1 during training)
    switch method
    case 'kappa'
        fit = sD(:,k+1);
    case 'delta'
        nn = zeros(m,d);
        for i=2:k+1
            nn = nn + a(I(:,i),:);
        end
        nn = (+a - (nn/(k)));
        fit = sum(nn.*nn,2);
    case 'gamma'
        fit = mean(sD(:,(2:(k+1))),2);
    otherwise
        error([mfilename,': Unknown method']);
    end

    %now obtain the threshold:
    thresh = dd_threshold(fit,1-fracrej);

    %and save all useful data:
    W.x = +a;
    W.k = k;
    W.method = method;
    W.threshold = thresh;
    W.scale = mean(fit);
    W = mapping(mfilename,'trained',W,str2mat('target','outlier'),d,2);
    W = setname(W,'K-Nearest neighbour data description');

else                               %testing

    W = getdata(fracrej);  % unpack
    [m,d] = size(a);

    %compute:
    distmat = sqeucldistm(+a,W.x);    %dist between train and test
    [sD,I] = sort(distmat,2);

    % different treatment by different methods:
    % (test objects are not part of W.x, so columns 1..k are used here)
    switch W.method
    case 'kappa'
        ind = sD(:,W.k);
        %ind = sD(:,W.k+1);
    case 'delta'
        nn = zeros(m,d);
        %for i=1:W.k+1
        for i=1:W.k
            nn = nn + W.x(I(:,i),:);
        end
        nn = (+a - (nn/(W.k)));
        ind = sum(nn.*nn,2);
    case 'gamma'
        ind = mean(sD(:,(1:(W.k))),2);
    otherwise
        error([mfilename,': Unknown method']);
    end

    % store the results in the final dataset:
    out = -[ind repmat(W.threshold,[m,1])];
    W = setdat(a,out,fracrej);
end

return
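
The three METHOD options listed in the help text can be compared on a toy example. The snippet below is only a sketch and is not part of the toolbox: it uses base MATLAB alone, the data values and variable names (x, z, d_kappa, d_delta, d_gamma) are made up for illustration, and a hand-written squared-distance computation for 1-D data stands in for sqeucldistm. It mirrors the testing branch of knndd, which takes the first k sorted columns; the training branch skips column 1 because that column holds each object's zero distance to itself.

```matlab
% Toy 1-D data (hypothetical values, chosen only for illustration).
x = [0; 1; 3; 6];     % training objects, one per row
z = [2; 10];          % test objects, one per row
k = 2;                % number of nearest neighbours

% Squared Euclidean distance between every test and training object.
% Base-MATLAB stand-in for sqeucldistm, valid for 1-D data only.
D = (repmat(z, 1, numel(x)) - repmat(x', numel(z), 1)).^2;

% Sort each row so that column j holds the j-th nearest training object.
[sD, I] = sort(D, 2);

% 'kappa': squared distance to the k-th nearest neighbour.
d_kappa = sD(:, k);

% 'gamma': average of the squared distances to the k nearest neighbours.
d_gamma = mean(sD(:, 1:k), 2);

% 'delta': squared distance to the mean of the k nearest neighbours.
nn_mean = mean(x(I(:, 1:k)), 2);
d_delta = (z - nn_mean).^2;

disp([d_kappa d_delta d_gamma]);
```

For the test object 2, which lies exactly between its two nearest neighbours 1 and 3, 'delta' is 0 while 'kappa' and 'gamma' are both 1. For the test object 10, which lies outside the training data, all three values are large (49, 30.25 and 32.5). Because the distances are squared, the values are in squared units of the data.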
