lsa.m
function [dr,wr]=lsa(X,k)
% LSA  Latent Semantic Analysis of the document-word co-occurrence matrix X
% Input:
%   X  --- document-word co-occurrence matrix
%   k  --- number of topics (number of singular values to keep)
% Output:
%   dr --- Spearman rank correlation of documents
%   wr --- Spearman rank correlation of words
% About Spearman rank correlation:
%   Spearman rank correlation measures how similar two arrays (or the
%   columns/rows of a matrix) are in rank order. First compute the
%   Spearman rank matrices/arrays of the original matrices/arrays (this is
%   done by spearmanrankcollums.m and spearmanrankrows.m). Then use the
%   following equation to compute the correlation between them:
%       r = 1 - 6*sum(d.^2)/(n*(n^2-1))
%   where d = array1 - array2 (for a matrix, array1 and array2 are a pair
%   of its columns or rows) and n = length(array1) = length(array2).
%   r lies in [-1,1]; the larger r is, the more strongly correlated the
%   two arrays are.
% See details in spearmanrankcollums.m, spearmanrankrows.m,
% spearmanrcollums.m and spearmanrrows.m.
% Reference: "Tang Ketan,LSA"
% By Tang Ketan, tkt@mail.ustc.edu.cn
% 2007/10/25
[m,n]=size(X);
[U,S,V]=svd(X); % singular value decomposition of X
% set all diagonal elements of S to zero except the first k
for i=k+1:min(size(S))
    S(i,i)=0;
end
Y=U*S*V'; % rank-k reconstruction of X
d = spearmanrankcollums(Y); % Spearman rank matrix of Y, taken by columns
w = spearmanrankrows(Y); % Spearman rank matrix of Y, taken by rows
dr = spearmanrcollums(d); % Spearman rank correlation of documents
wr = spearmanrrows(w); % Spearman rank correlation of words
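
% ---------------------------------------------------------------------
% Illustrative sketch (not part of the original file): the Spearman
% formula from the header, applied to two rank arrays of equal length.
% The name spearman_r_sketch is hypothetical. The computation actually
% used by lsa lives in spearmanrcollums.m / spearmanrrows.m and may
% differ. Assumes the inputs are already ranks and contain no ties.
function r = spearman_r_sketch(rank1, rank2)
n = length(rank1);               % number of ranked items
d = rank1(:) - rank2(:);         % element-wise rank differences
r = 1 - 6*sum(d.^2)/(n*(n^2-1)); % 1 for identical ranks, -1 for reversed
% Example: spearman_r_sketch([1 2 3 4], [4 3 2 1]) returns -1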