📄 subseq_count.m
字号:
function [result, K] = subseq_count(s,t,p)%SUBSEQ_COUNT% Finds the non-contiguous subsequence match count between strings s and t% by implementing a dynamic matrix, where the length of the subsequence is p.%% The following algorithm is used:% K[0](x,y) = 1 for all x,y% K[p](s,'empty string') = 0 for all p > 0, all s% K[p](sa,t) = K[p](s,t) + [Summation from 1 to i]K[p-1](s,t(1:i-1) [t(i) == a]%% Simply prompting the function will return the value K(s,t), however% using the function as [result,K] = K(s,t) will also return the matrix K[p].%% Example: subseq_count('abc','abbbbc', 3) returns a value of 4.% (Note that subseq_count('abc','abbbbc')=subseq_count('abbbbc','abc') since K(s,t,p) = K(t,s,p) ).% Example: subseq_count('a','a', 1) returns a value of 1.% Example: subseq_count('a','a', 0) returns a value of 1 (for the empty string).% Example: subseq_count('a','b', 1) returns a value of 0.% Example: subseq_count('a','b', 0) returns a value of 1.% Example: subseq_count('ab','ab', 2) returns a value of 1.% %%%USAGE: scalar = subseq_count('string1','string2', p); (where p is the length of the subsequence)%% [scalar, matrix] = subseq_count('string1,'string2', p);%%%For more information, visit http://www.kernel-methods.net/%%Written and tested in Matlab 6.0, Release 12.%Copyright 2003, Manju M. Pai 4/2003%manju@kernel-methods.net%------------------------------------------------------------------------------------------%Obtain lengths of stringsn = length(s);m = length(t);%Allow one extra row & column for empty string%Initially set every matrix index to -1 to show value has not yet been foundans = repmat(-1, [n+1, m+1, p]);%If p is equal to zero, just return 1 for the empty string and quit programif p == 0 result = 1; K = repmat(1, [n+1, m+1]); return;end;%Fill in the rest of the matrix using the function subseq_count(s,t)for h=1:p for i=1:n+1 for j=1:m+1 ans(i,j,h) = subseq_count_kernel(s(1:i-1), t(1:j-1), ans, h); end; end;end;result = ans(n+1,m+1,p);K = ans(:,:,p);%------------------------------------------------------------------------------------------function ans = subseq_count_kernel(sa, t, K, p)%This function is called by subseq_count(s,t,p).%Type 'help subseq_count' for a description of the program.%%------------------------------------------------------------------------------------------%Obtain lengths of both stringsn = length(sa);m = length(t);%Start algorithm: % 1) Base case: % K[p](sa, 'empty string') == K[p]('empty string', t) == zero if (m == 0) | (n == 0) ans = 0; return end; % 2) Split main algorithm into two parts: % a) K[p](s,t) %truncate last character of string s = sa(1:n-1); if( K( length(s)+1, length(t) + 1, p) == -1 ) % Value has not yet been calculated ans = subseq_count_kernel(s,t,K,p); else % Value has already been calculated ans = K( length(s)+1, length(t) + 1, p); end; % b) Summation of K[p-1](s,t(1:i-1)), where t(i) == a %this is the letter (a) that was truncated off the string letter = sa(n); %We need this 'for' loop to go through all strings of t that end in a. %(If you're wondering why there are alot of '+ 1' and '- 1' in the algorithm, % its because the length of the rows and the columns are 1 bigger than the length % of the strings because of the addition of the empty string to the matrix.) pos_array = find(t == letter); %array which consists of all indices of t where t(i) == a for index = 1:length(pos_array) i = pos_array(index); if (p-1) == 0 %This is a base case where 1 should always be returned if p = 0; ans = ans + 1; elseif( K( length(s) + 1 , length(t(1:i-1)) + 1, p-1) == -1) % Value has not yet been calculated ans = ans + subseq_count_kernel( s , t(1:i-1), K, p-1 ); else % Value has already been calculated. ans = ans + K( length(s) + 1, length(t(1:i-1)) + 1, p-1 ); end; end; return% End of algorithm
⌨️ 快捷键说明
复制代码
Ctrl + C
搜索代码
Ctrl + F
全屏模式
F11
切换主题
Ctrl + Shift + D
显示快捷键
?
增大字号
Ctrl + =
减小字号
Ctrl + -