p_spectrum.m

Source code from the book "Kernel Methods for Pattern Analysis"
function [result, K] = p_spectrum(s,t,p)
%P_SPECTRUM
%        -Finds the contiguous subsequence match count between strings s and t
%         by using a dynamic programming implementation,
%         where the length of the subsequence is p.
%         *(There is also a brute force implementation of this algorithm.
%           Type help p_spectrum_bf for info.)
%
%        -Calling the function with a single output returns the value K(s,t);
%         calling it as [result,K] = p_spectrum(s,t,p) also returns the matrix K.
%
%        -The following algorithm is used:
%         K[p](sa,t) = K[p](s,t) + [Summation of i from 1 to |t|] G[p-1](s,t(1:i-1)) [t(i) == a]
%           K[p](s,t) = 0 if |s| < p  or |t| < p
%         G[p](sa, tb) = G[p-1](s,t)[a==b]
%           G[0](s,t) = 1 for all s,t
%           G[p](s,t) = 0 if |s| == 0  or |t| == 0   (p > 0)
%
%        -Example: p_spectrum('abccc','abc', 3) returns a value of 1.
%            (Note that p_spectrum('abccc','abc',3)=p_spectrum('abc','abccc',3) since K(s,t,p) = K(t,s,p) ).
%        -Example: p_spectrum('a','a', 1) returns a value of 1.
%        -Example: p_spectrum('a','b', 1) returns a value of 0.
%        -Example: p_spectrum('ab','ab', 2) returns a value of 1.
%
%USAGE:   scalar = p_spectrum('string1','string2', p);    (where p is the length of the substring)
%
%         [scalar, matrix] = p_spectrum('string1','string2', p);
%
%For more information, visit http://www.kernel-methods.net/
%Written and tested in Matlab 6.0, Release 12.
%Copyright 2003, Manju M. Pai 4/2003
%manju@kernel-methods.net
%------------------------------------------------------------------------------------------

%Obtain lengths of strings
[num_rows_s, n] = size(s);
[num_rows_t, m] = size(t);

%Initially set every matrix index to -1 to show value has not yet been found
K = repmat(-1, [n, m]);                %The main kernel
G = repmat(-1, [n, m, p]);             %The suffix kernel

%Error checking statements:
%Make sure input vectors are horizontal.
if (num_rows_s ~= 1 | num_rows_t ~= 1)
    error('Error: s and t must be horizontal vectors.');
end;
%If p is not a positive number, program should quit due to faulty variable input.
if p <= 0 | ischar(p)
    error('Error: p needs to be a number greater than 0.');
end;
%End of error checking

%Fill in the rest of the matrix using the function p_spectrum_kernel()
for i=1:n
    for j=1:m
        [K(i,j), G] = p_spectrum_kernel(s(1:i), t(1:j), K, G, p);
    end;
end;
result = K(n,m);

%------------------------------------------------------------------------------------------
function [ans, G] = p_spectrum_kernel(sa, t, K, G, p)
%This function is called by p_spectrum(s,t,p).
%Type 'help p_spectrum' for a description of the program.
%
%------------------------------------------------------------------------------------------

%Obtain lengths of both strings
n = length(sa);
m = length(t);

%truncate last character of string and obtain length of new string
s = sa(1:n-1);
length_s = length(s);

%Start algorithm:
% 1) Split main algorithm into two parts:
% a) K(s,t)
if (length(s) < p) | (length(t) < p)
    %This is a base case where 0 is returned if either string is shorter than p
    ans = 0;
elseif( K( length(s), length(t) ) == -1 )
    % Value has not yet been calculated
    ans = p_spectrum_kernel(s, t, K, G, p);
else
    % Value has already been calculated
    ans = K( length(s), length(t) );
end;

% b) Summation of G[p-1](s,t(1:i-1))[t(i) == a] for i = 1:length(t)
%this is the letter (a) that was truncated off the string
letter = sa(n);
%We need this 'for' loop as a cursor that iterates through the t string.
pos_array = find(t(1:(m)) == letter);  %array which consists of all indices of t where t(i) == a
for index = 1:length(pos_array)
    i = pos_array(index);
    length_t = length(t(1:(i-1)));
    if ( (p-1) == 0 )
        %G[0] is 1 for all string pairs
        result = 1;
    elseif (length_s == 0 | length_t == 0)
        %This is a base case where 0 is returned if either string has length 0
        result = 0;
    elseif ( G( length_s, length_t, (p-1)) == -1 )
        % Value has not yet been calculated
        [result, G] = suffix_kernel(s, t(1:(i-1)), G, (p-1));
    else
        % Value has already been calculated
        result = G( length_s, length_t, (p-1));
    end;
    ans = ans + result;
end;
return
% End of algorithm

%------------------------------------------------------------------------------------------
function [ans, G] = suffix_kernel(sa, tb, G, p)
%This function is called by p_spectrum(s,t,p).
%Type 'help p_spectrum' for a description of the program.
%
%------------------------------------------------------------------------------------------

%Obtain lengths of both strings
n = length(sa);
m = length(tb);

%if last characters of both strings do not match, return 0
%(compared case-sensitively, consistent with the == test in p_spectrum_kernel)
if sa(n) ~= tb(m)
    ans = 0;
    return
end;

%truncate last character of strings
s = sa(1:n-1);
t = tb(1:m-1);

%Obtain lengths of truncated strings
length_s = length(s);
length_t = length(t);

%Start algorithm: G[p](sa,tb) = G[p-1](s,t)[a==b]
if ((p-1) == 0)
    %This is a base case where 1 is returned because G[p-1] = G[0]
    ans = 1;
elseif (length_s == 0) | (length_t == 0)
    %This is a base case where 0 is returned if either string has length 0
    ans = 0;
elseif( G( length_s, length_t, (p-1) ) == -1 )
    % Value has not yet been calculated
    [ans, G] = suffix_kernel(s, t, G, (p-1));
    G( length_s, length_t, (p-1)) = ans;
else
    % Value has already been calculated
    ans = G( length_s, length_t, (p-1) );
end;
return
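
The recurrence in the help text counts the pairs of positions (i, j) at which s and t share the same length-p substring. A direct way to sanity-check the dynamic programming result is to count those pairs by brute force. The sketch below is only an illustration under that reading of the kernel. The name p_spectrum_check is hypothetical, and this is not the p_spectrum_bf routine mentioned in the help text, whose source is not shown here. It assumes s and t are row vectors and p is a positive integer, and it compares characters case-sensitively.

```matlab
function count = p_spectrum_check(s, t, p)
%P_SPECTRUM_CHECK  Hypothetical brute-force cross-check for p_spectrum.
%   Counts the pairs (i,j) for which the length-p substring of s starting
%   at i equals the length-p substring of t starting at j.
%   Illustrative sketch only; save as p_spectrum_check.m to try it.

n = length(s);
m = length(t);
count = 0;

% If either string is shorter than p, the loop ranges are empty
% and count stays 0, matching the base case K[p](s,t) = 0.
for i = 1:(n - p + 1)
    for j = 1:(m - p + 1)
        % isequal compares the two substrings exactly (case-sensitive)
        if isequal(s(i:i+p-1), t(j:j+p-1))
            count = count + 1;
        end
    end
end
```

On the four examples listed in the help text, p_spectrum_check('abccc','abc',3), ('a','a',1), ('a','b',1) and ('ab','ab',2), this sketch gives 1, 1, 0 and 1. It costs on the order of |s|*|t|*p character comparisons, so it is suited to checking small inputs rather than replacing the dynamic programming version.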
