function [dr,wr]=lsa(X,k)
% LSA Latent Semantic Analysis of the document-word co-occurrence matrix X
% Input:
% X --- document-word co-occurrence matrix
% k --- number of topics (number of singular values to keep)
% Output:
% dr --- Spearman rank correlation of documents
% wr --- Spearman rank correlation of words
% About Spearman rank correlation:
% Spearman rank correlation measures how similarly two arrays (or the
% columns/rows of two matrices) order their elements.
% First compute the Spearman rank matrices/arrays of the original
% matrices/arrays (this is done by spearmanrankcollums.m and
% spearmanrankrows.m). Then use the following equation to compute the
% correlation between them:
% r=1-6*sum(d.^2)/(n*(n^2-1))
% (d = array1 - array2; for a matrix, array1 and array2 are replaced with
% its column/row components; n = length(array1) = length(array2))
% r lies in [-1,1]; the larger r is, the more strongly correlated the two
% arrays are.
% See details in spearmanrankcollums.m spearmanrankrows.m
% spearmanrcollums.m spearmanrrows.m
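%
% Worked example of the equation above. This is only an illustrative
% sketch: a and b are hypothetical rank arrays (not from the original
% code), and the equation is assumed to be applied to ranks without ties.
%   a = [1 2 3 4 5];
%   b = [2 1 4 3 5];
%   d = a - b;                        % d = [-1 1 -1 1 0], sum(d.^2) = 4
%   n = length(a);                    % n = 5, n*(n^2-1) = 120
%   r = 1 - 6*sum(d.^2)/(n*(n^2-1))   % r = 1 - 24/120 = 0.8
%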
% Reference: "Tang Ketan,LSA"
% By Tang Ketan, tkt@mail.ustc.edu.cn
% 2007/10/25
[m,n]=size(X);
[U,S,V]=svd(X); % singular value decomposition
% set all diagonal elements of S to zero except the first k
for i=k+1:min(size(S))
    S(i,i)=0;
end
Y=U*S*V'; % reconstruct the rank-k approximation of X
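% Note (an illustrative sketch, not part of the original code): assuming
% k <= min(m,n), the same reconstruction can be formed from the leading k
% singular triplets without using the zeroed entries of S:
%   Yk = U(:,1:k)*S(1:k,1:k)*V(:,1:k)';
% Yk is a hypothetical name used only here; it should agree with Y up to
% floating-point rounding.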
d = spearmanrankcollums(Y); % Spearman rank matrix of Y by columns, i.e. words array
w = spearmanrankrows(Y); % Spearman rank matrix of Y by rows, i.e. documents array
dr = spearmanrcollums(d);
wr = spearmanrrows(w);
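% Usage sketch. The matrix below is a hypothetical toy example, not data
% from the original; it assumes spearmanrankcollums.m, spearmanrankrows.m,
% spearmanrcollums.m and spearmanrrows.m are on the MATLAB path.
%   X = [2 0 1; 0 3 1; 1 1 0; 0 2 2];   % toy co-occurrence counts
%   [dr, wr] = lsa(X, 2);               % keep k = 2 topics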