⭐ 欢迎来到虫虫下载站! | 📦 资源下载 📁 资源专辑 ℹ️ 关于我们
⭐ 虫虫下载站

📄 subseq_count.m

📁 《模式分析的核方法》一书中的源代码
💻 M
字号:
function [result, K] = subseq_count(s,t,p)%SUBSEQ_COUNT%         Finds the non-contiguous subsequence match count between strings s and t%         by implementing a dynamic matrix, where the length of the subsequence is p.%%         The following algorithm is used:%         K[0](x,y) = 1 for all x,y%         K[p](s,'empty string') = 0 for all p > 0, all s%         K[p](sa,t) = K[p](s,t) + [Summation from 1 to i]K[p-1](s,t(1:i-1) [t(i) == a]%%         Simply prompting the function will return the value K(s,t), however%         using the function as [result,K] = K(s,t) will also return the matrix K[p].%%         Example: subseq_count('abc','abbbbc', 3) returns a value of 4.%            (Note that subseq_count('abc','abbbbc')=subseq_count('abbbbc','abc') since K(s,t,p) = K(t,s,p) ).%         Example: subseq_count('a','a', 1) returns a value of 1.%         Example: subseq_count('a','a', 0) returns a value of 1 (for the empty string).%         Example: subseq_count('a','b', 1) returns a value of 0.%         Example: subseq_count('a','b', 0) returns a value of 1.%         Example: subseq_count('ab','ab', 2) returns a value of 1.%         %%%USAGE:   scalar = subseq_count('string1','string2', p);    (where p is the length of the subsequence)%%         [scalar, matrix] = subseq_count('string1,'string2', p);%%%For more information, visit http://www.kernel-methods.net/%%Written and tested in Matlab 6.0, Release 12.%Copyright 2003, Manju M. Pai 4/2003%manju@kernel-methods.net%------------------------------------------------------------------------------------------%Obtain lengths of stringsn = length(s);m = length(t);%Allow one extra row & column for empty string%Initially set every matrix index to -1 to show value has not yet been foundans = repmat(-1, [n+1, m+1, p]);%If p is equal to zero, just return 1 for the empty string and quit programif p == 0    result = 1;    K = repmat(1, [n+1, m+1]);    return;end;%Fill in the rest of the matrix using the function subseq_count(s,t)for h=1:p    for i=1:n+1        for j=1:m+1                  ans(i,j,h) = subseq_count_kernel(s(1:i-1), t(1:j-1), ans, h);        end;    end;end;result = ans(n+1,m+1,p);K = ans(:,:,p);%------------------------------------------------------------------------------------------function ans = subseq_count_kernel(sa, t, K, p)%This function is called by subseq_count(s,t,p).%Type 'help subseq_count' for a description of the program.%%------------------------------------------------------------------------------------------%Obtain lengths of both stringsn = length(sa);m = length(t);%Start algorithm: % 1) Base case:    % K[p](sa, 'empty string') == K[p]('empty string', t) == zero    if (m == 0) | (n == 0)      ans = 0;      return    end;    % 2) Split main algorithm into two parts:    % a) K[p](s,t)          %truncate last character of string      s = sa(1:n-1);            if( K( length(s)+1, length(t) + 1, p) == -1 )        % Value has not yet been calculated        ans = subseq_count_kernel(s,t,K,p);      else        % Value has already been calculated        ans = K( length(s)+1, length(t) + 1, p);      end;    % b) Summation of K[p-1](s,t(1:i-1)), where t(i) == a      %this is the letter (a) that was truncated off the string      letter = sa(n);      %We need this 'for' loop to go through all strings of t that end in a.      %(If you're wondering why there are alot of '+ 1' and '- 1' in the algorithm,      % its because the length of the rows and the columns are 1 bigger than the length      % of the strings because of the addition of the empty string to the matrix.)      pos_array = find(t == letter);  %array which consists of all indices of t where t(i) == a      for index = 1:length(pos_array)        i = pos_array(index);        if (p-1) == 0               %This is a base case where 1 should always be returned if p = 0;          ans = ans + 1;        elseif( K( length(s) + 1 , length(t(1:i-1)) + 1, p-1) == -1)          % Value has not yet been calculated          ans = ans + subseq_count_kernel( s , t(1:i-1), K, p-1 );        else          % Value has already been calculated.          ans = ans + K( length(s) + 1, length(t(1:i-1)) + 1, p-1 );        end;      end;         return% End of algorithm

⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -