📄 kmeans_ser.asv
function [dbelong, neach, clusters, niter, ST, DB] = ...
    kmeans_ser(indata, nc, np, incnp, eta, deta, maxEpoch, info, psec);

% The size of the dataset
[ndata, nvar] = size(indata);
if info >= 1
  disp(sprintf('Number of datapoints: %d', ndata));
  disp(sprintf('Number of variables: %d', nvar));
end

% Normalize data to unit standard deviation and zero mean
% using prestd.
[indata, datamean, datastd] = kmeans_norm(indata);

%%%%%%%%%% Parameters for k-means %%%%%%%%%%
% np == 0    : use all datapoints in every update
% 0 < np < 1 : use the fraction np of the datapoints
% np >= 1    : use np datapoints (np == 1 gives serial updating)
% np < 0     : serial updating
if np >= 0
  if np == 0
    npoints = ndata;
  elseif np < 1
    npoints = floor(np * ndata);
    if npoints < 1
      npoints = 1;
    end
  elseif np >= 1
    npoints = np;
    if npoints > ndata
      npoints = ndata;
    end
  end
else
  npoints = 1;
end

% The starting clusters
rand('state', sum(100*clock));
clusters = 0.1 * (rand(nc, nvar) - 0.5);  % Randomize the starting points.

% If in 2 dimensions then do some plotting
if nvar == 2 && info > 0
  class = ones(1, npoints);
  fig = kmeans_plotdata(indata, class);
end

% If info make a header
if info == 1
  fprintf(1, 'K DB ST quanterr\n');
elseif info == 2
  fprintf(1, 'Iter d_clust q_err eta npoints\n');
end

% Start the timer.
tic;

niter = 0;
go = 1;
qerr_old = 0.0;
cold = zeros(nc, nvar);
while go == 1
  niter = niter + 1;

  %%%%%%%%%%% More than 1 data points to update %%%%%%%%%%%
  if npoints > 1
    % Select the datapoints to use when updating the clusters
    if npoints == ndata
      data = indata;
    else
      rndidx = randperm(ndata);
      rndidx = rndidx(1:npoints);
      data = indata(rndidx, :);
    end

    % Compute the distance between each datapoint to the clusters
    dists = kmeans_dist(data, clusters);

    % Assign datapoints to clusters and compute updates
    [minval, minidx] = min(dists');
    for i = 1:nc
      idxtmp = find(minidx == i);
      class(idxtmp) = i;
      if ~isempty(idxtmp)
        cpoints = data(idxtmp, :);
        % Compute the mean of the datapoints belonging to cluster i.
        % Average along dimension 1 so that a cluster holding a single
        % point still yields a 1-by-nvar centroid.
        centroid = mean(cpoints, 1);
        update(i, :) = centroid - clusters(i, :);
      else
        update(i, :) = zeros(1, nvar);
      end
    end

    % Update the cluster centers
    clusters = clusters + eta * update;

  %%%%%%%%%%% Serial updating %%%%%%%%%%%
  else
    rndidx = randperm(ndata);

    % Loop over all datapoints
    for id = 1:ndata
      % Find nearest cluster
      tmpdata = indata(rndidx(id), :);
      cidx = kmeans_findnn(tmpdata, clusters);

      % Update the cluster center
      update_i = tmpdata - clusters(cidx, :);
      clusters(cidx, :) = clusters(cidx, :) + eta * update_i;
    end
  end

  % Compute changes: mean Euclidean displacement of the cluster centers
  dcluster = sum(sqrt(sum((clusters - cold).^2, 2))) / nc;
  cold = clusters;
  %dcluster = sum( sqrt( sum( (update.*update)') ) ) / nc;

  % Compute some clusters statistics
  [dbelong, neach, avedist, quanterr] = ...
      kmeans_stat(indata, clusters, nc);

  % Some plotting if 2 dimensions
  if nvar == 2 && info > 0
    kmeans_plotclusters(fig, clusters, indata, class);
  end

  % Stopping criteria
  if abs(quanterr - qerr_old) < 0.001 || niter >= maxEpoch
    go = 0;
  end
  qerr_old = quanterr;

  % Some info
  if info == 2
    fprintf(1, '%5d %6.3f %8.2f %7.4f %5d\n', ...
            niter, dcluster, quanterr, eta, npoints);
  end

  % Update the learning rate
  eta = eta * deta;

  % Update the number of datapoints to use
  if incnp > 0
    npoints = npoints + incnp;
    if npoints > ndata
      npoints = ndata;
    end
  end

  % This for visualization purposes
  pause(psec);
end

% Final plotting if in 2 dimensions
if nvar == 2 && info > 0
  close(fig);
  fig = kmeans_plotdata(indata, class);
  kmeans_plotclusters(fig, clusters, indata, class);
end

[ST, DB] = kmeans_val(indata, clusters, nc);
tid = toc;

if info >= 2
  fprintf(1, '*************** Final result ***************\n');
  fprintf(1, 'Final quantization error: %f\n', quanterr);
  fprintf(1, 'Davies-Bouldin index : %f\n', DB);
  fprintf(1, 'Siddheswar-Turi index : %f\n', ST);
  fprintf(1, 'Number of points in each cluster\n');
  disp([1:nc]);
  disp(neach);
  fprintf(1, 'Average distance to cluster centers\n');
  disp(avedist);
  fprintf(1, 'CPU time: %5.1f sec.\n', tid);
elseif info == 1
  fprintf(1, '%d %f %f %f\n', nc, DB, ST, quanterr);
end