% train.tex
\[
\alpha_j(t) = \left[ \sum_{i=2}^{N-1} \alpha_i(t-1) a_{ij} \right]
   b_j(\bm{o}_t)
\]
with initial conditions given by
\[
\alpha_1(1) = 1
\]
\[
\alpha_j(1) = a_{1j} b_j(\bm{o}_1)
\]
for $1<j<N$ and final condition given by
\[
\alpha_N(T) = \sum_{i=2}^{N-1} \alpha_i(T) a_{iN}
\]
The backward probability $\beta_i(t)$ for $1<i<N$ and $T>t \geq 1$ is
calculated by the backward recursion
\[
\beta_i(t) = \sum_{j=2}^{N-1} a_{ij} b_j(\bm{o}_{t+1}) \beta_j(t+1)
\]
with initial conditions given by
\[
\beta_i(T) = a_{iN}
\]
for $1<i<N$ and final condition given by
\[
\beta_1(1) = \sum_{j=2}^{N-1} a_{1j} b_j(\bm{o}_1) \beta_j(1)
\]
In the case of embedded training, where the HMM spanning the observations
is a composite constructed by concatenating $Q$ subword models, it is
assumed that at time $t$ the $\alpha$ and $\beta$ values corresponding to
the entry and exit states of a HMM represent the forward and backward
probabilities at time $t-\Delta t$ and $t+\Delta t$, respectively, where
$\Delta t$ is small. The equations for calculating $\alpha$ and $\beta$
are then as follows. For the forward probability, the initial conditions
are established at time $t=1$ as follows
\[
\alpha^{(q)}_{1}(1) = \left\{ \begin{array}{cl}
   1 & \mbox{if $q=1$} \\
   \alpha^{(q-1)}_1(1) a^{(q-1)}_{1N_{q-1}} & \mbox{otherwise}
   \end{array} \right.
\]
\[
\alpha^{(q)}_{j}(1) = \alpha^{(q)}_{1}(1) a^{(q)}_{1j} b^{(q)}_j(\bm{o}_1)
\]
\[
\alpha^{(q)}_{N_q}(1) = \sum_{i=2}^{N_q-1} \alpha^{(q)}_{i}(1) a^{(q)}_{iN_q}
\]
where the superscript in parentheses refers to the index of the model in
the sequence of concatenated models. All unspecified values of $\alpha$
are zero.
For time $t > 1$,
\[
\alpha^{(q)}_{1}(t) = \left\{ \begin{array}{cl}
   0 & \mbox{if $q=1$} \\
   \alpha^{(q-1)}_{N_{q-1}}(t-1) + \alpha^{(q-1)}_1(t) a^{(q-1)}_{1N_{q-1}}
     & \mbox{otherwise}
   \end{array} \right.
\]
\[
\alpha^{(q)}_j(t) = \left[ \alpha^{(q)}_1(t) a^{(q)}_{1j} +
   \sum_{i=2}^{N_q-1} \alpha^{(q)}_{i}(t-1) a^{(q)}_{ij} \right]
   b^{(q)}_j(\bm{o}_t)
\]
\[
\alpha^{(q)}_{N_q}(t) = \sum_{i=2}^{N_q-1} \alpha^{(q)}_{i}(t) a^{(q)}_{iN_q}
\]
For the backward probability, the initial conditions are set at time
$t=T$ as follows
\[
\beta^{(q)}_{N_q}(T) = \left\{ \begin{array}{cl}
   1 & \mbox{if $q=Q$} \\
   \beta^{(q+1)}_{N_{q+1}}(T) a^{(q+1)}_{1N_{q+1}} & \mbox{otherwise}
   \end{array} \right.
\]
\[
\beta^{(q)}_i(T) = a^{(q)}_{iN_q} \beta^{(q)}_{N_q}(T)
\]
\[
\beta^{(q)}_1(T) = \sum^{N_q - 1}_{j=2} a^{(q)}_{1j} b^{(q)}_j(\bm{o}_T)
   \beta^{(q)}_j(T)
\]
where, once again, all unspecified $\beta$ values are zero. For
time $t<T$,
\[
\beta^{(q)}_{N_q}(t) = \left\{ \begin{array}{cl}
   0 & \mbox{if $q=Q$} \\
   \beta^{(q+1)}_1(t+1) + \beta^{(q+1)}_{N_{q+1}}(t) a^{(q+1)}_{1N_{q+1}}
     & \mbox{otherwise}
   \end{array} \right.
\]
\[
\beta^{(q)}_i(t) = a^{(q)}_{iN_q} \beta^{(q)}_{N_q}(t) +
   \sum_{j=2}^{N_q-1} a^{(q)}_{ij} b^{(q)}_j(\bm{o}_{t+1})
   \beta^{(q)}_{j}(t+1)
\]
\[
\beta^{(q)}_{1}(t) = \sum_{j=2}^{N_q-1} a^{(q)}_{1j} b^{(q)}_j(\bm{o}_t)
   \beta^{(q)}_{j}(t)
\]
The total probability $P = \mbox{prob}(\bm{O} | \lambda)$ can be computed
from either the forward or backward probabilities
\[
P = \alpha_N(T) = \beta_1(1)
\]

\subsection{Single Model Reestimation (\htool{HRest})}
\index{model training!isolated unit formulae}

In this style of model training, a set of training observations
$\bm{O}^r, \;\; 1 \leq r \leq R$, is used to estimate the parameters
of a single HMM.
The basic formula for the reestimation of the transition probabilities is
\newcommand{\albe}[1]{
   \sum_{r=1}^R \frac{1}{P_r} \sum_{t=1}^{T_r} \alpha^r_#1(t)\beta^r_#1(t)}
\[
\hat{a}_{ij} = \frac{
   \sum_{r=1}^R \frac{1}{P_r} \sum_{t=1}^{T_r-1}
      \alpha^r_i(t) a_{ij} b_j(\bm{o}^r_{t+1}) \beta^r_j(t+1)
   }{\albe{i}}
\]
where $1<i<N$ and $1<j<N$, and $P_r$ is the total probability
$P = \mbox{prob}(\bm{O}^r | \lambda)$ of the $r$'th observation. The
transitions from the non-emitting entry state are reestimated by
\[
\hat{a}_{1j} = \frac{1}{R} \sum_{r=1}^R \frac{1}{P_r}
   \alpha^r_j(1) \beta^r_j(1)
\]
where $1<j<N$, and the transitions from the emitting states to the final
non-emitting exit state are reestimated by
\[
\hat{a}_{iN} = \frac{
   \sum_{r=1}^R \frac{1}{P_r} \alpha^r_i(T_r) \beta^r_i(T_r)
   }{
   \sum_{r=1}^R \frac{1}{P_r} \sum_{t=1}^{T_r} \alpha^r_i(t)\beta^r_i(t)
   }
\]
where $1<i<N$.

For a HMM with $M_s$ mixture components in stream $s$, the means,
covariances and mixture weights for that stream are reestimated as
follows. Firstly, the probability of occupying the $m$'th mixture
component in stream $s$ at time $t$ for the $r$'th observation is
\[
L^r_{jsm}(t) = \frac{1}{P_r} U^r_j(t) c_{jsm} b_{jsm}(\bm{o}^r_{st})
   \beta^r_j(t) b^*_{js}(\bm{o}^r_t)
\]
where
\hequation{
U^r_j(t) = \left\{ \begin{array}{cl}
   a_{1j} & \mbox{if $t=1$} \\
   \sum^{N-1}_{i=2} \alpha^r_i(t-1) a_{ij} & \mbox{otherwise}
   \end{array} \right.
}{urjt}
and
\[
b^*_{js}(\bm{o}^r_t) = \prod_{k \neq s} b_{jk}(\bm{o}^r_{kt})
\]
For single Gaussian streams, the probability of mixture component
occupancy is equal to the probability of state occupancy, and hence in
this case it is more efficient to use
\[
L^r_{jsm}(t) = L^r_{j}(t) = \frac{1}{P_r} \alpha^r_j(t) \beta^r_j(t)
\]
Given the above definitions, the re-estimation formulae may now be
expressed in terms of $L^r_{jsm}(t)$ as follows
\newcommand{\liksum}[1]{
   \sum_{r=1}^R \sum_{t=1}^{T_r} L^r_{#1}(t)}
\[
\hat{\bm{\mu}}_{jsm} = \frac{\liksum{jsm} \bm{o}^r_{st}}{\liksum{jsm}}
\]
\hequation{
\hat{\bm{\Sigma}}_{jsm}
= \frac{ \liksum{jsm} (\bm{o}^r_{st} - \hat{\bm{\mu}}_{jsm})
      (\bm{o}^r_{st} - \hat{\bm{\mu}}_{jsm})'
   }{\liksum{jsm}}
}{sigjsm}
\[
\hat{c}_{jsm} = \frac{\liksum{jsm}}{\liksum{j}}
\]

\subsection{Embedded Model Reestimation (\htool{HERest})}
\index{model training!embedded subword formulae}

The re-estimation formulae for the embedded model case have to be
modified to take account of the fact that the entry states can be
occupied at any time as a result of transitions out of the previous
model. The basic formula for the re-estimation of the transition
probabilities is
\newcommand{\albeq}[1]{
   \sum_{r=1}^R \frac{1}{P_r} \sum_{t=1}^{T_r}
      \alpha^{(q)r}_#1(t)\beta^{(q)r}_#1(t)}
\[
\hat{a}^{(q)}_{ij} = \frac{
   \sum_{r=1}^R \frac{1}{P_r} \sum_{t=1}^{T_r-1}
      \alpha^{(q)r}_i(t) a^{(q)}_{ij} b^{(q)}_j(\bm{o}^r_{t+1})
      \beta^{(q)r}_j(t+1)
   }{\albeq{i}}
\]
The transitions from the non-emitting entry states into the HMM are
re-estimated by
\[
\hat{a}^{(q)}_{1j} = \frac{
   \sum_{r=1}^R \frac{1}{P_r} \sum_{t=1}^{T_r-1}
      \alpha^{(q)r}_1(t) a^{(q)}_{1j} b^{(q)}_j(\bm{o}^r_{t})
      \beta^{(q)r}_j(t)
   }{\albeq{1} + \alpha^{(q)r}_{1}(t) a^{(q)}_{1N_q} \beta^{(q+1)r}_1(t)}
\]
and the transitions out of the HMM into the non-emitting exit states
are re-estimated by
\[
\hat{a}^{(q)}_{iN_q} = \frac{
   \sum_{r=1}^R \frac{1}{P_r} \sum_{t=1}^{T_r-1}
      \alpha^{(q)r}_i(t) a^{(q)}_{iN_q} \beta^{(q)r}_{N_q}(t)
   }{\albeq{i}}
\]
Finally, the direct transitions from non-emitting entry to non-emitting
exit states are re-estimated by
\[
\hat{a}^{(q)}_{1N_q} = \frac{
   \sum_{r=1}^R \frac{1}{P_r} \sum_{t=1}^{T_r-1}
      \alpha^{(q)r}_1(t) a^{(q)}_{1N_q} \beta^{(q+1)r}_1(t)
   }{\albeq{1} + \alpha^{(q)r}_{1}(t) a^{(q)}_{1N_q} \beta^{(q+1)r}_1(t)}
\]
The re-estimation formulae for the output distributions are the same as
for the single model case except for the obvious additional subscript
for $q$.
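To make the single-model (\htool{HRest}) updates concrete, the sketch
below performs one Baum-Welch pass over the transition matrix, together
with the mean and covariance update for one state driven by the
occupancies $L^r_j(t)$. This is illustrative code under our own naming
(single stream, single Gaussian, 0-based indices), not HTK's
implementation, and it works with raw rather than scaled probabilities:

```python
import numpy as np

def reestimate_transitions(a, obs_probs):
    """One Baum-Welch pass over R observations for an HTK-style HMM
    (states 0 and N-1 non-emitting).  obs_probs[r][j, t] holds
    b_j(o^r_t).  Returns the updated transition matrix a_hat."""
    N = a.shape[0]
    R = len(obs_probs)
    num = np.zeros((N, N))   # transition-probability numerators
    den = np.zeros(N)        # sum_r (1/P_r) sum_t alpha_i(t) beta_i(t)
    entry = np.zeros(N)      # numerator for the entry transitions a_{1j}
    for b in obs_probs:
        T = b.shape[1]
        al = np.zeros((N, T)); be = np.zeros((N, T))
        al[1:-1, 0] = a[0, 1:-1] * b[1:-1, 0]
        for t in range(1, T):               # forward recursion
            al[1:-1, t] = (al[1:-1, t-1] @ a[1:-1, 1:-1]) * b[1:-1, t]
        P = al[1:-1, -1] @ a[1:-1, -1]
        be[1:-1, -1] = a[1:-1, -1]
        for t in range(T - 2, -1, -1):      # backward recursion
            be[1:-1, t] = (a[1:-1, 1:-1] * b[1:-1, t+1]) @ be[1:-1, t+1]
        for t in range(T - 1):
            # alpha_i(t) a_ij b_j(o_{t+1}) beta_j(t+1), accumulated over t
            num[1:-1, 1:-1] += np.outer(al[1:-1, t],
                                        b[1:-1, t+1] * be[1:-1, t+1]) \
                               * a[1:-1, 1:-1] / P
        num[1:-1, -1] += al[1:-1, -1] * a[1:-1, -1] / P   # exits at t = T_r
        den[1:-1] += (al[1:-1] * be[1:-1]).sum(axis=1) / P
        entry[1:-1] += al[1:-1, 0] * be[1:-1, 0] / P
    a_hat = np.zeros_like(a)
    a_hat[0, 1:-1] = entry[1:-1] / R
    a_hat[1:-1, 1:] = num[1:-1, 1:] / den[1:-1, None]
    return a_hat

def reestimate_gaussian(L, O):
    """Mean/covariance update for one state: L[t] is the occupancy
    L_j(t), O[t] the observation vector o_t."""
    mu = (L[:, None] * O).sum(axis=0) / L.sum()
    d = O - mu
    Sigma = np.einsum('t,ti,tj->ij', L, d, d) / L.sum()
    return mu, Sigma
```

A handy sanity check, implied by the formulae above, is that every
re-estimated transition row (entry row and each emitting row) sums to
one.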
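These embedded-model updates consume the composite-model probabilities
$\alpha^{(q)r}_j(t)$ and $\beta^{(q)r}_j(t)$ defined in the previous
section. A minimal sketch of those recursions (again our own illustrative
code, 0-based indices, one observation sequence, raw probabilities) is:

```python
import numpy as np

def embedded_forward_backward(models, b):
    """models: list of Q transition matrices a^{(q)} (states 0 and
    N_q-1 non-emitting); b[q][j, t] holds b^{(q)}_j(o_t).
    Returns (alphas, betas, P) following the composite-model recursions,
    with entry/exit values interpreted at t -/+ dt as in the text."""
    Q, T = len(models), b[0].shape[1]
    alphas = [np.zeros((a.shape[0], T)) for a in models]
    betas = [np.zeros((a.shape[0], T)) for a in models]
    for q, a in enumerate(models):          # forward, t = 0
        N, al = a.shape[0], alphas[q]
        al[0, 0] = 1.0 if q == 0 else alphas[q-1][0, 0] * models[q-1][0, -1]
        al[1:N-1, 0] = al[0, 0] * a[0, 1:N-1] * b[q][1:N-1, 0]
        al[N-1, 0] = al[1:N-1, 0] @ a[1:N-1, N-1]
    for t in range(1, T):                   # forward, t > 0
        for q, a in enumerate(models):
            N, al = a.shape[0], alphas[q]
            if q > 0:                       # entry fed by previous model
                al[0, t] = (alphas[q-1][-1, t-1]
                            + alphas[q-1][0, t] * models[q-1][0, -1])
            al[1:N-1, t] = (al[0, t] * a[0, 1:N-1]
                            + al[1:N-1, t-1] @ a[1:N-1, 1:N-1]) * b[q][1:N-1, t]
            al[N-1, t] = al[1:N-1, t] @ a[1:N-1, N-1]
    for q in reversed(range(Q)):            # backward, t = T-1
        a = models[q]; N = a.shape[0]; be = betas[q]
        be[N-1, -1] = (1.0 if q == Q - 1
                       else betas[q+1][-1, -1] * models[q+1][0, -1])
        be[1:N-1, -1] = a[1:N-1, N-1] * be[N-1, -1]
        be[0, -1] = (a[0, 1:N-1] * b[q][1:N-1, -1]) @ be[1:N-1, -1]
    for t in range(T - 2, -1, -1):          # backward, t < T-1
        for q in reversed(range(Q)):
            a = models[q]; N = a.shape[0]; be = betas[q]
            if q < Q - 1:                   # exit fed by next model
                be[N-1, t] = (betas[q+1][0, t+1]
                              + betas[q+1][-1, t] * models[q+1][0, -1])
            be[1:N-1, t] = (a[1:N-1, N-1] * be[N-1, t]
                            + (a[1:N-1, 1:N-1] * b[q][1:N-1, t+1])
                              @ be[1:N-1, t+1])
            be[0, t] = (a[0, 1:N-1] * b[q][1:N-1, t]) @ be[1:N-1, t]
    return alphas, betas, alphas[-1][-1, -1]
```

The per-frame occupancy identity $\sum_{q,j} \alpha^{(q)}_j(t)
\beta^{(q)}_j(t) = P$ makes a convenient consistency check on such an
implementation.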
However, the probability calculations must now allow for transitions
from the entry states by changing $U^r_j(t)$ in equation~\ref{e:urjt} to
\[
U^{(q)r}_j(t) = \left\{ \begin{array}{cl}
   \alpha^{(q)r}_1(t) a^{(q)}_{1j} & \mbox{if $t=1$} \\
   \alpha^{(q)r}_1(t) a^{(q)}_{1j} +
      \sum^{N_q-1}_{i=2} \alpha^{(q)r}_i(t-1) a^{(q)}_{ij}
     & \mbox{otherwise}
   \end{array} \right.
\]

%%% Local Variables:
%%% mode: plain-tex
%%% TeX-master: "htkbook"
%%% End: