📄 kmer_gapped_kernel.m
字号:
function K = kmer_gapped_kernel(S,k,maxgaps,lambda)%function K = kmer_gapped_kernel(S,k,maxgaps,lambda)%% Computes the k-mer gapped kernel between the strings stored in S;% w gaps are allowed. The features are weighted by an% exponential lambda^-m, where m is the actual number of gaps for% that feature.%%INPUTS% S = a cell array containing all strings% k = an integer indicating the size of the k-mer v(not including the gaps)% maxgaps = an integer <k indicating the number of gaps allowed% lambda = the forget factor%%OUTPUTS% K = the kmer kernel evaluated between all strings%%%For more info, see www.kernel-methods.net%%Author: Tijl De Bie, march 2004, bug fixed october 2004 (thanks to Neil Lawrence).nS=length(S);alphabet=[];for j=1:nS for i=1:length(S{j}) alphabet=union(alphabet,S{j}(i)); endendna=length(alphabet);for j=1:nS ngaps{j}=zeros(length(S{j}),1);endK=zeros(nS,nS);for i=1:na i present=[]; clear newS newind newngaps ct=0; for j=1:nS newind_store=find(S{j}(1:end-k+1)==alphabet(i)); newngaps_store=zeros(size(newind_store)); if ~isempty(newind_store) ct=ct+1; present=[present ; j]; newS{ct}=S{j}; newind{ct}=newind_store; newngaps{ct}=newngaps_store; end end if ~isempty(present) K(present,present)=K(present,present)... +iterates(newS,newind,newind,1,k,newngaps,maxgaps,lambda,alphabet,alphabet(i)); endendfunction K=iterates(S,indstart,indend,it,maxit,ngaps,maxgaps,lambda,alphabet,seq)%function K=iterates(S,indstart,indend,it,maxit,ngaps,maxgaps,lambda,alphabet,seq)%% A help function for kmer_wildcard_kernel.m%%%Author: Tijl De Bie, march 2004.na=length(alphabet);nS=length(S);if it==maxit for i=1:nS number(i,1)=0; for j=1:length(indstart{i}) number(i,1) = number(i,1)... +align_seqs(S{i}(indstart{i}(j):... min(indstart{i}(j)+maxit+maxgaps-1,length(S{i}))),seq,lambda); end end K=number*number'; returnendK=zeros(nS,nS);for i=1:na ct=0; present=[]; clear newindstart newindend newS newgaps for j=1:nS newindstart_store=[]; newindend_store=[]; newngaps_store=[]; for k=1:length(indstart{j}) for offset=1:min(maxgaps-ngaps{j}(k)+1 , length(S{j})-indend{j}(k)) if S{j}(indend{j}(k)+offset)==alphabet(i) newindstart_store=[newindstart_store ; indstart{j}(k)]; newindend_store=[newindend_store ; indend{j}(k)+offset]; newngaps_store=[newngaps_store ; ngaps{j}(k)+offset-1]; break end end end if ~isempty(newindend_store) ct=ct+1; newindstart{ct}=newindstart_store; newindend{ct}=newindend_store; present=[present ; j]; newS{ct}=S{j}; newngaps{ct}=newngaps_store; end end if ct~=0 K(present,present)=K(present,present)... +iterates(newS,newindstart,newindend,it+1,maxit,newngaps,maxgaps,... lambda,alphabet,[seq alphabet(i)]); endendfunction align_seqs=align_seqs(s,t,lambda)% Compute alignment score of a short sequence t to a long one s, where gaps% are allowed only in t, and gaps are exponentially penalized with a factor% lambda.%Note: this can be made more efficient by computing only a thick diagonal%of the DP table!M=zeros(length(t)+1,length(s)+1);%for j=1:length(s)-length(t)+1% M(1,j)=1;%endM(1,1)=1;for j=2:length(s)+1 for i=2:length(t)+1 M(i,j)=(t(i-1)==s(j-1))*M(i-1,j-1)+((i==length(t)+1)*1+(i~=length(t)+1)*lambda)*M(i,j-1); endendalign_seqs=M(end,end);
⌨️ 快捷键说明
复制代码
Ctrl + C
搜索代码
Ctrl + F
全屏模式
F11
切换主题
Ctrl + Shift + D
显示快捷键
?
增大字号
Ctrl + =
减小字号
Ctrl + -