
📄 learn_hmm.m

📁 Text information extraction with hidden Markov models, implemented in MATLAB
function [LL, prior, transmat, obsmat, gamma] = learn_hmm(...
    data, prior, transmat, obsmat, max_iter, thresh, verbose, act)
% LEARN_HMM Find the ML parameters of an HMM with discrete outputs using EM.
%
% [LL, PRIOR, TRANSMAT, OBSMAT] = LEARN_HMM(DATA, PRIOR0, TRANSMAT0, OBSMAT0) computes ML
% estimates of the following parameters, where, for each time t, Q(t) is the hidden state, and
% Y(t) is the observation
%   prior(i) = Pr(Q(1) = i)
%   transmat(i,j) = Pr(Q(t+1)=j | Q(t)=i)
%   obsmat(i,o) = Pr(Y(t)=o | Q(t)=i)
% PRIOR0 is the initial estimate of PRIOR, etc.
% Row l of DATA is the observation sequence for example l. If the sequences are of
% different lengths, you can pass in a cell array, so DATA{l} is a vector.
% LL is the "learning curve": a vector of the log likelihood values at each iteration.
%
% There are several optional arguments, which should be passed in the following order
%   LEARN_HMM(DATA, PRIOR, TRANSMAT, OBSMAT, MAX_ITER, THRESH, VERBOSE)
% These have the following meanings
%   max_iter = max. num EM steps to take (default 10)
%   thresh = threshold for stopping EM (default 1e-4)
%   verbose = 0 to suppress the display of the log lik at each iteration (default 1).
%
% If the transition matrix is non-stationary (e.g., as in a POMDP),
% then TRANSMAT should be a cell array, where T{a}(i,j) = Pr(Q(t+1)=j|Q(t)=i,A(t)=a).
% Specify the sequence of As in the same form as DATA:
%   LEARN_HMM(DATA, PRIOR, TRANSMAT, OBSMAT, MAX_ITER, THRESH, VERBOSE, AS)
%
% For online EM, just pass in the current sliding window of data and set max_iter = 1.
% The smoothed window is returned as an optional final argument:
%   [LL, PRIOR, TRANSMAT, OBSMAT, GAMMA] = LEARN_HMM(...)
% This obviously only makes sense if there is a single sequence.
%
% Helpers called but not defined in this file: normalise, mk_stochastic,
% em_converged, mk_dhmm_obs_lik, forwards_backwards.

% Fill in defaults for omitted optional arguments.
% The 'var' flag restricts the check to workspace variables.
if ~exist('max_iter', 'var'), max_iter = 10; end
if ~exist('thresh', 'var'), thresh = 1e-4; end
if ~exist('verbose', 'var'), verbose = 1; end
if ~exist('act', 'var'), act = []; end

previous_loglik = -inf;
loglik = 0;
converged = 0;
num_iter = 1;
LL = [];

if ~iscell(data)
  data = num2cell(data, 2); % each row gets its own cell
end
numex = length(data);

while (num_iter <= max_iter) && ~converged
  % E step
  [loglik, exp_num_trans, exp_num_visits1, exp_num_emit, gamma] = ...
      compute_ess(prior, transmat, obsmat, data, act);
  if verbose, fprintf(1, 'iteration %d, loglik = %f\n', num_iter, loglik); end
  num_iter = num_iter + 1;

  % M step
  prior = normalise(exp_num_visits1);
  transmat = mk_stochastic(exp_num_trans);
  obsmat = mk_stochastic(exp_num_emit);

  converged = em_converged(loglik, previous_loglik, thresh);
  previous_loglik = loglik;
  LL = [LL loglik];
end

%%%%%%%%%%%

function [loglik, exp_num_trans, exp_num_visits1, exp_num_emit, gamma] = ...
    compute_ess(prior, transmat, obsmat, data, act)
%
% Compute the Expected Sufficient Statistics for a discrete Hidden Markov Model.
%
% Outputs:
% exp_num_trans(i,j) = sum_l sum_{t=2}^T Pr(X(t-1) = i, X(t) = j| Obs(l))
% exp_num_visits1(i) = sum_l Pr(X(1)=i | Obs(l))
% exp_num_emit(i,o) = sum_l sum_{t=1}^T Pr(X(t) = i, O(t)=o| Obs(l))
% where Obs(l) = O_1 .. O_T for sequence l.

obsmat1 = obsmat;
numex = length(data);
[S O] = size(obsmat);
exp_num_trans = zeros(S,S);
exp_num_visits1 = zeros(S,1);
exp_num_emit = zeros(S,O);
loglik = 0;

for ex=1:numex
  obs = data{ex};
  T = length(obs);
  olikseq = mk_dhmm_obs_lik(obs, obsmat, obsmat1);
  if isempty(act)
    [alpha, beta, gamma, xi, current_loglik] = forwards_backwards(prior, transmat, olikseq);
  else
    [alpha, beta, gamma, xi, current_loglik] = forwards_backwards(prior, transmat, olikseq, [], [], act{ex});
  end
  loglik = loglik + current_loglik;

  exp_num_trans = exp_num_trans + sum(xi,3);
  exp_num_visits1 = exp_num_visits1 + gamma(:,1);

  % Accumulate emission counts; loop over whichever is shorter, time steps or symbols.
  if T < O
    for t=1:T
      o = obs(t);
      exp_num_emit(:,o) = exp_num_emit(:,o) + gamma(:,t);
    end
  else
    for o=1:O
      ndx = find(obs==o);
      if ~isempty(ndx)
        exp_num_emit(:,o) = exp_num_emit(:,o) + sum(gamma(:, ndx), 2);
      end
    end
  end
end
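
The emission-count accumulation in compute_ess has two branches (loop over time steps when T < O, loop over symbols otherwise). The sketch below shows on a toy example that both branches give the same expected counts, and then row-normalises the result the way the M step appears to. The values of `obs` and `gamma` are invented for illustration and do not come from learn_hmm.m. The row normalisation is only an assumption about what mk_stochastic does, since that helper is not defined in this file and may also handle all-zero rows differently.

```matlab
% Hypothetical toy inputs: S = 2 hidden states, O = 3 symbols, T = 5 time steps.
S = 2; O = 3;
obs   = [1 3 3 2 1];               % observed symbol index at each time step
gamma = [0.9 0.2 0.4 0.5 0.7;      % gamma(i,t) = Pr(Q(t)=i | obs); each column sums to 1
         0.1 0.8 0.6 0.5 0.3];
T = length(obs);

% Strategy 1: loop over time steps (the T < O branch in compute_ess).
emit_by_time = zeros(S, O);
for t = 1:T
  o = obs(t);
  emit_by_time(:, o) = emit_by_time(:, o) + gamma(:, t);
end

% Strategy 2: loop over symbols (the T >= O branch in compute_ess).
emit_by_symbol = zeros(S, O);
for o = 1:O
  ndx = find(obs == o);
  if ~isempty(ndx)
    emit_by_symbol(:, o) = emit_by_symbol(:, o) + sum(gamma(:, ndx), 2);
  end
end

% The two strategies only differ in loop order, so the counts agree.
assert(max(abs(emit_by_time(:) - emit_by_symbol(:))) < 1e-12);

% Assumed M-step behaviour (stand-in for mk_stochastic): rescale each row to sum to 1.
row_sums = sum(emit_by_symbol, 2);
obsmat_new = emit_by_symbol ./ repmat(row_sums, 1, O);
disp(obsmat_new);
```

Choosing the branch by comparing T with O keeps the inner loop at min(T, O) iterations, which matters when sequences are long and the alphabet is small, or the reverse.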
