%SOM_DEMO3 Self-organizing map visualization.

% Contributed to SOM Toolbox 2.0, February 11th, 2000 by Juha Vesanto
% http://www.cis.hut.fi/projects/somtoolbox/

% Version 1.0beta juuso 071197
% Version 2.0beta juuso 080200 070600

clf reset;
figure(gcf)
echo on

clc
% ==========================================================
% SOM_DEMO3 - VISUALIZATION
% ==========================================================

% som_show       - Visualize map.
% som_grid       - Visualization with free coordinates.
%
% som_show_add   - Add markers on som_show visualization.
% som_show_clear - Remove markers from som_show visualization.
% som_recolorbar - Refresh and rescale colorbars in som_show
%                  visualization.
%
% som_cplane     - Visualize component/color/U-matrix plane.
% som_pieplane   - Visualize prototype vectors as pie charts.
% som_barplane   - Visualize prototype vectors as bar charts.
% som_plotplane  - Visualize prototype vectors as line graphs.
%
% pcaproj        - Projection to principal component space.
% cca            - Projection with Curvilinear Component Analysis.
% sammon         - Projection with Sammon's mapping.
% som_umat       - Calculate U-matrix.
% som_colorcode  - Color coding for the map.
% som_normcolor  - RGB values of indexed colors.
% som_hits       - Hit histograms for the map.

% The basic functions for SOM visualization are SOM_SHOW and
% SOM_GRID. The SOM_SHOW has three auxiliary functions:
% SOM_SHOW_ADD, SOM_SHOW_CLEAR and SOM_RECOLORBAR which are used
% to add and remove markers and to control the colorbars.
% SOM_SHOW actually uses SOM_CPLANE to make the visualizations.
% Also SOM_{PIE,BAR,PLOT}PLANE can be used to visualize SOMs.

% The other functions listed above do not themselves visualize
% anything, but their results are used in the visualizations.

% There's an important limitation that visualization functions have:
% while the SOM Toolbox otherwise supports N-dimensional map grids,
% visualization only works for 1- and 2-dimensional map grids!!!

pause % Strike any key to create demo data and map...

clc
% DEMO DATA AND MAP
% =================

% The data set constructed for this demo consists of random vectors
% in three Gaussian kernels whose centers are at [0 0 0],
% [3 3 3] and [9 0 0]. The map is trained using default parameters.

D1 = randn(100,3);
D2 = randn(100,3) + 3;
D3 = randn(100,3); D3(:,1) = D3(:,1) + 9;

sD = som_data_struct([D1; D2; D3],'name','Demo3 data',...
                     'comp_names',{'X-coord','Y-coord','Z-coord'});
sM = som_make(sD);

% Since the data (and thus the prototypes of the map) are
% 3-dimensional, they can be directly plotted using PLOT3.
% Below, the data is plotted using red 'o's and the map
% prototype vectors with black '+'s.

plot3(sD.data(:,1),sD.data(:,2),sD.data(:,3),'ro',...
      sM.codebook(:,1),sM.codebook(:,2),sM.codebook(:,3),'k+')
rotate3d on

% From the visualization it is pretty easy to see what the data is
% like, and how the prototypes have been positioned. One can see
% that there are three clusters, and that there are some prototype
% vectors between the clusters, although there is actually no
% data there. The map units corresponding to these prototypes
% are called 'dead' or 'interpolative' map units.

pause % Strike any key to continue...

clc
% VISUALIZATION OF MULTIDIMENSIONAL DATA
% ======================================

% Usually visualization of data sets is not this straightforward,
% since the dimensionality is much higher than three.
% In principle, one can embed additional information in the
% visualization by using properties other than position, for
% example color, size or shape.

% Here the data set and map prototypes are plotted again, but
% information of the cluster is shown using color: red for the
% first cluster, green for the second and blue for the last.

plot3(sD.data(1:100,1),sD.data(1:100,2),sD.data(1:100,3),'ro',...
      sD.data(101:200,1),sD.data(101:200,2),sD.data(101:200,3),'go',...
      sD.data(201:300,1),sD.data(201:300,2),sD.data(201:300,3),'bo',...
      sM.codebook(:,1),sM.codebook(:,2),sM.codebook(:,3),'k+')
rotate3d on

% However, this works only for relatively small dimensionality, say
% less than 10. As more information is added this way, the
% visualization becomes harder and harder to understand. Also, not
% all properties are equal: the human visual system perceives
% colors differently from position, not to mention the complex
% rules governing perception of shape.

pause % Strike any key to learn about linking...

clc
% LINKING MULTIPLE VISUALIZATIONS
% ===============================

% The other option is to use *multiple visualizations*, so-called
% small multiples, instead of only one. The problem is then how to
% link these visualizations together: one should be able to identify
% the same object from the different visualizations.

% This could be done using, for example, color: each object has
% the same color in each visualization. Another option is to use
% similar position: each object has the same position in each
% small multiple.

% For example, here are four subplots, one for each component and
% one for cluster information, where color denotes the value and
% position is used for linking.
% The 2D-position is derived by projecting the data into the space
% spanned by the two eigenvectors with the largest eigenvalues.

[Pd,V,me] = pcaproj(sD.data,2);        % project the data
Pm        = pcaproj(sM.codebook,V,me); % project the prototypes
colormap(hot);                         % colormap used for values

echo off
for c=1:3
  subplot(2,2,c), cla, hold on
  som_grid('rect',[300 1],'coord',Pd,'Line','none',...
           'MarkerColor',som_normcolor(sD.data(:,c)));
  som_grid(sM,'Coord',Pm,'Line','none','marker','+');
  hold off, title(sD.comp_names{c}), xlabel('PC 1'), ylabel('PC 2');
end

subplot(2,2,4), cla
plot(Pd(1:100,1),Pd(1:100,2),'ro',...
     Pd(101:200,1),Pd(101:200,2),'go',...
     Pd(201:300,1),Pd(201:300,2),'bo',...
     Pm(:,1),Pm(:,2),'k+')
title('Cluster')
echo on

pause % Strike any key to use color for linking...

% Here is another example, where color is used for linking. On the
% top right triangle are the scatter plots of each variable without
% color coding, and on the bottom left triangle with the color
% coding. In the colored figures, each data sample can be
% identified by a unique color. Well, almost identified: there are
% quite a lot of samples with almost the same color. Color is not as
% precise a linking method as position.

echo off
Col = som_normcolor([1:300]',jet(300));
k=1;
for i=1:3
  for j=1:3
    % order the component indices so that i1 <= i2
    if i<j, i1=i; i2=j; else i1=j; i2=i; end
    if i<j
      % upper triangle: plain scatter plot
      subplot(3,3,k); cla
      plot(sD.data(:,i1),sD.data(:,i2),'ko')
      xlabel(sD.comp_names{i1}), ylabel(sD.comp_names{i2})
    elseif i>j
      % lower triangle: scatter plot with one color per sample
      subplot(3,3,k); cla
      som_grid('rect',[300 1],'coord',sD.data(:,[i1 i2]),...
               'Line','none','MarkerColor',Col);
      xlabel(sD.comp_names{i1}), ylabel(sD.comp_names{i2})
    end
    k=k+1;
  end
end
echo on

pause % Strike any key to learn about data visualization using SOM...

clc
% DATA VISUALIZATION USING SOM
% ============================

% The basic visualization functions and their usage have already
% been introduced in SOM_DEMO2. In this demo, a more structured
% presentation is given.

% Data visualization techniques using the SOM can be divided into
% three categories based on their goal:

%  1. visualization of clusters and shape of the data:
%     projections, U-matrices and other distance matrices
%
%  2. visualization of components / variables:
%     component planes, scatter plots
%
%  3. visualization of data projections:
%     hit histograms, response surfaces

pause % Strike any key to visualize clusters with distance matrices...

clf
clc
% 1. VISUALIZATION OF CLUSTERS: DISTANCE MATRICES
% ===============================================

% Distance matrices are typically used to show the cluster
% structure of the SOM. They show distances between neighboring
% units, and are thus closely related to single linkage clustering
% techniques. The most widely used distance matrix technique is
% the U-matrix.

% Here, the U-matrix of the map is shown (using all three
% components in the distance calculation):

colormap(1-gray)
som_show(sM,'umat','all');

pause % Strike any key to see more examples of distance matrices...

% The function SOM_UMAT can be used to calculate the U-matrix. The
% resulting matrix holds distances between neighboring map units,
% as well as the median distance from each map unit to its
% neighbors. These median distances corresponding to each map unit
% can be easily extracted.
% The result is a distance matrix using median distance.

U = som_umat(sM);
Um = U(1:2:size(U,1),1:2:size(U,2));

% A related technique is to assign colors to the map units such
% that similar map units get similar colors.

% Here, four clustering figures are shown:
%  - U-matrix
%  - median distance matrix (with grayscale)
%  - median distance matrix (with map unit size)
%  - similarity coloring, made by spreading a colormap
%    on top of the principal component projection of the
%    prototype vectors

subplot(2,2,1)
h=som_cplane([sM.topol.lattice,'U'],sM.topol.msize, U(:));
set(h,'Edgecolor','none'); title('U-matrix')

subplot(2,2,2)
h=som_cplane(sM, Um(:));
set(h,'Edgecolor','none'); title('D-matrix (grayscale)')

subplot(2,2,3)
som_cplane(sM,'none',1-Um(:)/max(Um(:)))
title('D-matrix (marker size)')

subplot(2,2,4)
C = som_colorcode(Pm);  % Pm is the PC-projection calculated earlier
som_cplane(sM,C)
title('Similarity coloring')

pause % Strike any key to visualize shape and clusters with projections...

clf
clc
% 1. VISUALIZATION OF CLUSTERS AND SHAPE: PROJECTIONS
% ===================================================

% In vector projection, a set of high-dimensional data samples is
% projected to a lower-dimensional space such that the distances
% between data sample pairs are preserved as well as possible.
% Depending on the technique, the projection may be either linear
% or non-linear, and it may place special emphasis on preserving
% local distances.

% For example SOM is a projection technique, since the prototypes
% have well-defined positions on the 2-dimensional map grid. SOM as
% a projection is however a very crude one. Other projection
% techniques include the principal component projection used
% earlier, Sammon's mapping and Curvilinear Component Analysis
% (to name a few). These have been implemented in functions
% PCAPROJ, SAMMON and CCA.
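
% As a rough illustration of what a linear projection does, the
% principal component projection can be sketched in plain MATLAB.
% This is only a sketch under simple assumptions, not the PCAPROJ
% implementation: the names X0, Vp, Ep, evals, order and Psketch
% are introduced here for illustration and are not used elsewhere
% in the demo. The signs of the resulting axes are arbitrary, so
% the result may be a mirror image of the PCAPROJ projection.

X0 = sD.data - repmat(mean(sD.data,1),size(sD.data,1),1); % center the data
[Vp,Ep] = eig(cov(X0));                   % eigenvectors of the covariance matrix
[evals,order] = sort(diag(Ep),'descend'); % largest eigenvalues first
Psketch = X0*Vp(:,order(1:2));            % coordinates on the two leading axes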

% Projecting the map prototype vectors and joining neighboring map
% units with lines gives the SOM its characteristic net-like look.
% The projection figures can be linked to the map planes using
% color coding.

% Here is the distance matrix, color coding, a projection without
% coloring and a projection with one. In the last projection,
% the size of interpolating map units has been set to zero.

subplot(2,2,1)
som_cplane(sM,Um(:));
title('Distance matrix')

subplot(2,2,2)
C = som_colorcode(sM,'rgb4');
som_cplane(sM,C);
title('Color code')

subplot(2,2,3)
som_grid(sM,'Coord',Pm,'Linecolor','k');
title('PC-projection')

subplot(2,2,4)
h = som_hits(sM,sD); s=6*(h>0);  % marker size 0 for units with no hits
som_grid(sM,'Coord',Pm,'MarkerColor',C,'Linecolor','k','MarkerSize',s);
title('Colored PC-projection')

pause % Strike any key to visualize component planes...

clf
clc
% 2. VISUALIZATION OF COMPONENTS: COMPONENT PLANES
% ================================================

% The component plane visualizations show what kind of values the
% prototype vectors of the map units have for different vector
% components.

% Here is the U-matrix and the three component planes of the map.

som_show(sM)