$\bullet$ state 3: Gaussian ${\cal N}_{/i/}$ \\
$\bullet$ state 4: Gaussian ${\cal N}_{/y/}$ \\
$\bullet$ state 5: final state
\end{tabular} &
\[ \left[\begin{array}{lllll}
0.0 & \bf 1.0 & 0.0 & 0.0 & 0.0 \\
0.0 & \bf 0.5 & \bf 0.5 & 0.0 & 0.0 \\
0.0 & 0.0 & \bf 0.5 & \bf 0.5 & 0.0 \\
0.0 & 0.0 & 0.0 & \bf 0.5 & \bf 0.5 \\
0.0 & 0.0 & 0.0 & 0.0 & \bf 1.0
\end{array}\right] \] &
\input{hmm3.pstex_t} \\
%
\begin{tabular}{>{\hspace{-2em}}l}
{\bf HMM4}\,: \\
$\bullet$ state 1: initial state \\
$\bullet$ state 2: Gaussian ${\cal N}_{/a/}$ \\
$\bullet$ state 3: Gaussian ${\cal N}_{/i/}$ \\
$\bullet$ state 4: Gaussian ${\cal N}_{/y/}$ \\
$\bullet$ state 5: final state
\end{tabular} &
\[ \left[\begin{array}{lllll}
0.0 & \bf 1.0 & 0.0 & 0.0 & 0.0 \\
0.0 & \bf 0.95 & \bf 0.05 & 0.0 & 0.0 \\
0.0 & 0.0 & \bf 0.95 & \bf 0.05 & 0.0 \\
0.0 & 0.0 & 0.0 & \bf 0.95 & \bf 0.05 \\
0.0 & 0.0 & 0.0 & 0.0 & \bf 1.0
\end{array}\right] \] &
\input{hmm4.pstex_t} \\
%
\begin{tabular}{>{\hspace{-2em}}l}
{\bf HMM5}\,: \\
$\bullet$ state 1: initial state \\
$\bullet$ state 2: Gaussian ${\cal N}_{/y/}$ \\
$\bullet$ state 3: Gaussian ${\cal N}_{/i/}$ \\
$\bullet$ state 4: Gaussian ${\cal N}_{/a/}$ \\
$\bullet$ state 5: final state
\end{tabular} &
\[ \left[\begin{array}{lllll}
0.0 & \bf 1.0 & 0.0 & 0.0 & 0.0 \\
0.0 & \bf 0.95 & \bf 0.05 & 0.0 & 0.0 \\
0.0 & 0.0 & \bf 0.95 & \bf 0.05 & 0.0 \\
0.0 & 0.0 & 0.0 & \bf 0.95 & \bf 0.05 \\
0.0 & 0.0 & 0.0 & 0.0 & \bf 1.0
\end{array}\right] \] &
\input{hmm5.pstex_t} \\
%
\begin{tabular}{>{\hspace{-2em}}l}
{\bf HMM6}\,: \\
$\bullet$ state 1: initial state \\
$\bullet$ state 2: Gaussian ${\cal N}_{/a/}$ \\
$\bullet$ state 3: Gaussian ${\cal N}_{/i/}$ \\
$\bullet$ state 4: Gaussian ${\cal N}_{/e/}$ \\
$\bullet$ state 5: final state
\end{tabular} &
\[ \left[\begin{array}{lllll}
0.0 & \bf 1.0 & 0.0 & 0.0 & 0.0 \\
0.0 & \bf 0.95 & \bf 0.05 & 0.0 & 0.0 \\
0.0 & 0.0 & \bf 0.95 & \bf 0.05 & 0.0 \\
0.0 & 0.0 & 0.0 & \bf 0.95 & \bf 0.05 \\
0.0 & 0.0 & 0.0 & 0.0 & \bf 1.0
\end{array}\right] \] &
\input{hmm6.pstex_t} \\
\end{tabular}
\caption{\label{tab:models}List of the Markov models used in the experiments.}
\end{table}

The parameters of the densities and of the Markov models are stored in the
file \com{data.mat}. A Markov model named, e.g., \com{hmm1} is stored as an
object with fields \com{hmm1.means}, \com{hmm1.vars} and \com{hmm1.trans},
and corresponds to the model HMM1 of table~\ref{tab:models}. The
\com{means} field contains a list of mean vectors; the \com{vars} field
contains a list of variance matrices; the \com{trans} field contains the
transition matrix. E.g., to access the mean of the $3^{rd}$ state of
\com{hmm1}, use\,: \\
%
\mat{hmm1.means\{3\}}
%
The initial and final states are characterized by an empty mean and variance
value.

\subsubsection*{Preliminary Matlab commands\,:}

Before carrying out the experiments, execute the following commands\,: \\
\mat{colordef none; \% Set a black background for the figures}
\mat{load data; \% Load the experimental data}
\mat{whos \% View the loaded variables}
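To get familiar with the way the models are stored, it may help to inspect one
of them from the Matlab prompt. The following commands are only a minimal
sketch\,: they merely read the fields described above, and the exact display
depends on the contents of \com{data.mat}. \\
\mat{hmm1.trans \% Transition matrix of HMM1 (one row per state)}
\mat{hmm1.means\{3\} \% Mean vector of the Gaussian attached to state 3}
\mat{hmm1.vars\{3\} \% Variance matrix of the same state}
\mat{hmm1.means\{1\} \% Empty: state 1 is the non-emitting initial state}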
\pagebreak
%%%%%%%%%%%%%%%%%%
\section{Generating samples from Hidden Markov Models}
\label{sec:generating}
%%%%%%%%%%%%%%%%%%

\subsubsection*{Experiment\,:}

Generate a sample $X$ coming from the Hidden Markov Models HMM1, HMM2 and
HMM4. Use the function \com{genhmm} (\com{>> help genhmm}) to do several
draws with each of these models. View the resulting samples and state
sequences with the help of the functions \com{plotseq} and \com{plotseq2}.

\subsubsection*{Example\,:}

Do a draw\,: \\
\mat{[X,stateSeq] = genhmm(hmm1);}

\noindent Use the functions \com{plotseq} and \com{plotseq2} to picture the
obtained 2-dimensional data. In the resulting views, the obtained sequences
are represented by a yellow line where each point is overlaid with a
colored dot. The different colors indicate the state from which any
particular point has been drawn. \\
\mat{figure; plotseq(X,stateSeq); \% View of both dimensions as separate sequences}
This view highlights the notion of a sequence of states associated with a
sequence of sample points. \\
\mat{figure; plotseq2(X,stateSeq,hmm1); \% 2D view of the resulting sequence,}
\mbox{\hspace{46ex}} \com{\% with the location of the Gaussian states} \\
This view highlights the spatial distribution of the sample points.

\noindent Draw several new samples with the same parameters and visualize them\,: \\
\mat{clf; [X,stateSeq] = genhmm(hmm1); plotseq(X,stateSeq);}
(To be repeated several times.)

\noindent Repeat with another model\,: \\
\mat{[X,stateSeq] = genhmm(hmm2); plotseq(X,stateSeq);}
and re-iterate the experiment. Also re-iterate with model HMM4.

\subsubsection*{Questions\,:}

\begin{enumerate}
\item How can you verify that a transition matrix is valid~?
\item What is the effect of the different transition matrices on the
sequences obtained during the current experiment~? Hence, what is the role
of the transition probabilities in the Markovian modeling framework~?
\item What would happen if we didn't have a final state~?
\item In the case of HMMs with plain Gaussian emission probabilities, what
quantities should be present in the complete parameter set $\Theta$ that
specifies a particular model~?
If the model is ergodic with $N$ states (including the initial and final),
and represents data of dimension $D$, what is the total number of
parameters in $\Theta$~?
\item Which type of HMM (ergodic or left-right) would you use to model
words~?
\end{enumerate}

\subsubsection*{Answers\,:}

\expl{
\begin{enumerate}
\item In a transition matrix, the element of row $i$ and column $j$
specifies the probability of going from state $i$ to state $j$. Hence, the
values on row $i$ specify the probabilities of all the possible transitions
that start from state $i$. This set of transitions must be a {\em complete
set of discrete events}. Hence, the terms of the $i^{th}$ row of the
matrix must sum up to $1$. Consequently, the sum of all the elements of the
matrix is equal to the number of states in the HMM. (A numeric check of
this property is sketched right after this block.)
\end{enumerate}
\hfill (Answers continue on the next page...)}
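Before moving to the next answer, the row-sum property of answer 1 can be
verified numerically. This is only a sketch, assuming \com{data.mat} is
still loaded\,: \\
\mat{sum(hmm1.trans, 2) \% Row sums: each entry must equal 1}
\mat{sum(hmm1.trans(:)) \% Total sum: must equal the number of states}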
\subsubsection*{Answers (continued)\,:}

\expl{
\begin{enumerate}
\setcounter{enumi}{1}
\item The transition matrix of HMM1 indicates that the probability of
staying in a particular state is close to the probability of transiting to
another state. Hence, it allows for frequent jumps from one state to any
other state. The observation variable therefore frequently jumps from one
``phoneme'' to any other, forming sharply changing sequences like
/a,i,a,y,y,i,a,y,y,$\cdots$/.
\tab Alternatively, the transition matrix of HMM2 specifies high
probabilities of staying in a particular state. Hence, it allows for more
``stable'' sequences, like /a,a,a,y,y,y,i,i,i,i,i,y,y,$\cdots$/.
\tab Finally, the transition matrix of HMM4 also governs the order in which
the states are visited\,: the given probabilities force the observation
variable to go through /a/, then to go through /i/, and finally to stay in
/y/, e.g. /a,a,a,a,i,i,i,y,y,y,y,$\cdots$/.
\tab Hence, the role of the transition probabilities is to {\em introduce a
temporal (or spatial) structure in the modeling of random sequences}.
\tab Furthermore, the obtained sequences have variable lengths\,: the
transition probabilities implicitly model a variability in the duration of
the sequences. As a matter of fact, different speakers or different
speaking conditions introduce a variability in the phoneme or word
durations. In this respect, HMMs are particularly well adapted to speech
modeling.
\item If we didn't have a final state, the observation variable would
wander from state to state indefinitely, and the model would necessarily
correspond to sequences of infinite length.
\item In the case of HMMs with Gaussian emission probabilities, the
parameter set $\Theta$ comprises\,:
\begin{itemize}
\item the transition probabilities $a_{ij}$;
\item the parameters of the Gaussian densities characterizing each state,
i.e. the means $\mu_i$ and the variances $\Sigma_i$.
\end{itemize}
The initial state distribution is sometimes modeled as an additional
parameter instead of being represented in the transition matrix.
In the case of an ergodic HMM with $N$ states (hence $N-2$ emitting states)
and Gaussian emission probabilities, we have\,:
\begin{itemize}
\item $(N-2) \times (N-2)$ transitions, plus $(N-2)$ initial state
probabilities and $(N-2)$ probabilities to go to the final state;
\item $(N-2)$ emitting states where each pdf is characterized by a
$D$-dimensional mean and a $D \times D$ covariance matrix.
\end{itemize}
Hence, in this case, the total number of parameters is $(N-2) \times \left(
N + D \times (D+1) \right)$. Note that this number grows quadratically with
the number of states and with the dimension of the data. (A numeric
illustration is given right after this block.)
\item Words are made of ordered sequences of phonemes\,: /h/ is followed by
/e/ and then by /l/ in the word ``hello''. Each phoneme can in turn be
considered as a particular random process (possibly Gaussian). This
structure can be adequately modeled by a left-right HMM.
\tab In ``real world'' speech recognition, the phonemes themselves are often
modeled as left-right HMMs rather than plain Gaussian densities (e.g. to
model separately the attack, then the stable part of the phoneme and
finally the ``end'' of it). Words are then represented by large HMMs made
of concatenations of smaller phonetic HMMs.
\end{enumerate}}
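As a quick numeric illustration of the count given in answer 4, take the
dimensions used throughout this lab, i.e. a model with $N=5$ states and
$D=2$ dimensional data (the values below are only an example)\,: \\
\mat{N = 5; D = 2; \% 5 states (initial + 3 emitting + final), 2-D data}
\mat{(N-2) * (N + D*(D+1)) \% Total number of parameters, here 33}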
\vfill
\pagebreak
%%%%%%%%%%%%%%%%%%
\section{Pattern recognition with HMMs}
%%%%%%%%%%%%%%%%%%

%%%%%%%%%
\subsection{Likelihood of a sequence given a HMM}
%%%%%%%%%

In section~\ref{sec:generating}, we have generated some stochastic
observation sequences from various HMMs. Now, it is useful to study the
reverse problem, namely\,: given a new observation sequence and a set of
models, which model best explains the sequence, or in other terms, which
model gives the highest likelihood to the data~?

To solve this problem, it is necessary to compute $p(X|\Theta)$, i.e. the
likelihood of an observation sequence given a model.

\subsubsection*{Useful formulas and definitions\,:}

\begin{itemize}
\item[-] {\em Probability of a state sequence}\,: the probability of a
state sequence $Q=\{q_1,\cdots,q_T\}$ coming from a HMM with parameters
$\Theta$ corresponds to the product of the transition probabilities from
one state to the next\,:
\[
P(Q|\Theta) = \prod_{t=1}^{T-1} a_{q_t q_{t+1}}
= a_{q_1 q_2} \cdot a_{q_2 q_3} \cdots a_{q_{T-1} q_T}
\]
%
\item[-] {\em Likelihood of an observation sequence given a state
sequence}, or {\em likelihood of an observation sequence along a single
path}\,: given an observation sequence $X=\{x_1,x_2,\cdots,x_T\}$ and a
state sequence $Q=\{q_1,\cdots,q_T\}$ (of the same length) determined from
a HMM with parameters $\Theta$, the likelihood of $X$ along the path $Q$ is
equal to\,:
\[
p(X|Q,\Theta) = \prod_{i=1}^T p(x_i|q_i,\Theta)
= b_{q_1}(x_1) \cdot b_{q_2}(x_2) \cdots b_{q_T}(x_T)
\]
i.e. it is the product of the emission probabilities computed along the
considered path.
In the previous lab, we learned how to compute the likelihood of a single
observation with respect to a Gaussian model. This method can be applied
here, for each term $x_i$, if the states contain Gaussian pdfs.
%
\item[-] {\em Joint likelihood of an observation sequence $X$ and a path
$Q$}\,: it is the probability that $X$ and $Q$ occur simultaneously,
$p(X,Q|\Theta)$, and it decomposes into the product of the two quantities
defined previously\,:
\[
p(X,Q|\Theta) = p(X|Q,\Theta) P(Q|\Theta) \mbox{\hspace{3em}(Bayes)}
\]
%
\item[-] {\em Likelihood of a sequence with respect to a HMM}\,: the