model trained over morphological features. We extend this model to
address the problem of poor generalization by decomposing the
surface-form translation process into two parallel processes:
translation of lemmas and translation of morphological features.
Lemmas and morphological features are then recombined in a generation
process to create target surface forms. Details of both these models
and the experiments we conducted with them are described below.

For both sets of experiments the {\tt EuroMini} data was preprocessed
using FreeLing~\cite{Atserias:2006} for both part-of-speech tagging
and morphological analysis.

\subsection{Explicit Agreement and Coherence Models}

In these experiments, we applied factored models as an extension to
standard phrase-based MT. Standard surface-to-surface translation is
performed, but we add stochastic constraints on possible target
hypotheses to enforce (in a soft way) agreement relations. This is
done by generating a latent factor (in this case morphological
features) from each target word that is hypothesized during decoding.
Hypotheses are then scored with both standard MT model components and
language models trained over agreement features.
Figure~\ref{fig:latent-factor-check} shows the configuration of the
two models described below: Verb/Noun/Preposition (VNP) and
Noun/Determiner/Adjective (NDA).

\begin{figure}[t]
\begin{center}
\includegraphics[width=4cm]{wade-latent-factors}
\caption{Latent Factor Checking using Agreement Features}
\label{fig:latent-factor-check}
\end{center}
\end{figure}

This configuration was also used with POS tags to ensure long-term
coherence. Table~\ref{tab:LM-models} summarizes the different models
used to explicitly check for agreement and to ensure coherence.

\begin{table}[h]
\begin{center}
\begin{tabular}{|l|l|}
\hline
\bf Problems Addressed & \bf Model Type \\ \hline \hline
Explicit Agreement & -- LMs over verbs + subjects \\
 & -- LMs over nouns + determiners + adjectives \\ \hline
Long-Span Coherence & -- LMs over POS tags \\ \hline
\end{tabular}
\end{center}
\caption{LM-based Agreement and Coherence Models}
\label{tab:LM-models}
\end{table}

\subsubsection{Verb/Subject and Noun-Phrase Agreement Models}

In a first set of experiments we used features derived from a
morphological analysis of the training data to create n-gram language
models. We produced two models for experimentation. In one, NDA,
number and gender features were generated for each noun, determiner
and adjective in a hypothesis during decoding. Non-NDA words
deterministically generated ``don't care'' features.
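To make the factor generation concrete, the following is a minimal
Python sketch of how NDA latent factors might be produced. The
function names, factor encoding, and example analyses are our own
illustration under simplifying assumptions, not the implementation
used in these experiments.

\begin{verbatim}
# Illustrative sketch of NDA latent-factor generation.  Each target
# word is assumed to arrive with a coarse POS and a feature dict
# (e.g., from a FreeLing-style analysis).
NDA_POS = {"noun", "determiner", "adjective"}

def nda_factor(pos, feats):
    """Map one analyzed word to its NDA agreement factor; words
    outside the N/D/A classes deterministically get 'don't care'."""
    if pos not in NDA_POS:
        return "X-X"                        # "don't care" factor
    gender = feats.get("gender", "none")    # masc|fem|common|none
    number = feats.get("number", "none")    # sing|plural|invariable|none
    return gender + "-" + number

# "las casas blancas" -> fem-plural fem-plural fem-plural, so an
# n-gram LM over these factors rewards agreeing hypotheses.
words = [("las", "determiner", {"gender": "fem", "number": "plural"}),
         ("casas", "noun", {"gender": "fem", "number": "plural"}),
         ("blancas", "adjective", {"gender": "fem", "number": "plural"}),
         ("hoy", "adverb", {})]
print([nda_factor(p, f) for _, p, f in words])
# ['fem-plural', 'fem-plural', 'fem-plural', 'X-X']
\end{verbatim}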
In a second model, VNP, we chose the features required for Spanish
verb--subject agreement, in addition to the identity of prepositions
in hypothesized sentences. The inclusion of prepositions potentially
allows us to model the selection relationship between verbs and
prepositions (though this is not strictly an agreement phenomenon).
Table~\ref{tab:nda-vnp} shows the features used for both these models
and their possible values.

\begin{table}[h]
\begin{center}
\begin{tabular}{|l|}
\hline
\bf NDA Features \\ \hline \hline
{\bf Gender:} {\it masc, fem, common, none} \\ \hline
{\bf Number:} {\it sing, plural, invariable, none} \\ \hline \hline
\bf VNP Features \\ \hline \hline
{\bf Number:} {\it sing, plural, invariable, none} \\ \hline
{\bf Person:} {\it 1p, 2p, 3p, none} \\ \hline
{\bf Prep-ID:} {\it preposition, none} \\ \hline
\end{tabular}
\end{center}
\caption{Latent Features used for the NDA and VNP Models}
\label{tab:nda-vnp}
\end{table}

Since ``don't care'' values can intervene between words carrying NDA
and VNP features, we also experimented with language models that skip
words lacking features of interest. This effectively increases the
context length of our n-gram models and should yield more robust
estimation. This is shown schematically in
Figure~\ref{fig:skipped-lm-nda-vnp}; factors marked with ``X'' are not
scored by the VNP or NDA language models.

\begin{figure}[t]
\begin{center}
\includegraphics[width=10cm]{wade-skipped}
\caption{Skipping ``Don't Care'' Factors in the NDA and VNP Language
Models}
\label{fig:skipped-lm-nda-vnp}
\end{center}
\end{figure}

Results for models trained on the {\tt EuroMini} corpus using the
standard 60k-word 3-gram language model and evaluated against the 2005
ACL Workshop Shared Task are shown in Table~\ref{tab:nda-vnp-perf}.
Both the NDA and VNP models improve performance over the baseline
system. We ran additional experiments that incorporated all
morphological features from our analyzer, and this too improved the
performance of the system, though the inclusion of part-of-speech
information did not. The use of all morphological features closes the
gap between the baseline system and its large-LM counterpart by 74\%,
suggesting that better target language modeling could compensate for
the sparsity of target-language data.

\begin{table}[h]
\begin{center}
\begin{tabular}{|l|r|}
\hline
\bf Model & \bf BLEU \\ \hline \hline
{\bf Baseline} & 23.41 \\ \hline
{\bf Baseline + 950k LM} & 25.10 \\ \hline
{\it NDA} & 24.47 \\ \hline
{\it VNP} & 24.33 \\ \hline
{\bf BOTH} & {\bf 24.54} \\ \hline \hline
{\it NDA w/skipping} & 24.03 \\ \hline
{\it VNP w/skipping} & 24.16 \\ \hline \hline
{\bf All Morph Features} & {\bf 24.66} \\ \hline
{\it All Morph Features + POS Tag} & 24.25 \\ \hline
\end{tabular}
\end{center}
\caption{Performance of the NDA and VNP Models}
\label{tab:nda-vnp-perf}
\end{table}

Unfortunately, the use of skipping models did not improve performance.
Further experiments will be necessary, as interactions with pruning
during decoding may have limited the performance of these systems.
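The scoring change behind the skipping variant is small; the sketch
below, building on the factor representation sketched earlier, shows
one plausible reading. The {\tt lm\_logprob} argument stands in for
any n-gram LM query function and is an assumption of ours, not an
actual API from these experiments.

\begin{verbatim}
# Sketch of a "skipping" agreement LM: 'don't care' factors are
# removed before scoring, so informative factors that were separated
# by skipped words become adjacent and share n-gram context.
def skipped_lm_score(factors, lm_logprob, order=3):
    """Sum n-gram log-probs over factors, skipping 'X-X' tokens.

    lm_logprob(token, history_tuple) -> float is a stand-in for a
    real n-gram LM query (e.g., a KenLM/SRILM wrapper).
    """
    informative = [f for f in factors if f != "X-X"]
    score = 0.0
    for i, f in enumerate(informative):
        history = tuple(informative[max(0, i - order + 1):i])
        score += lm_logprob(f, history)
    return score
\end{verbatim}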
\subsubsection{Lemma-based Models for Translation}

The agreement models described above use a factored approach to add
statistical constraints on target-language sequences. We attempted to
extend this approach by improving the underlying translation modeling.
To do this, we created a parallel translation model in which source
words are decomposed into base lemmas and morphological features. Both
lemmas and morphological features are translated from the source to
the target language. Target words are then re-composed through a
statistically trained generation process on the target side. This
process is shown schematically in Figure~\ref{fig:parallel-trans}.
Language models are then applied to both morphological features and
surface forms to constrain the output.

\begin{figure}[t]
\begin{center}
\includegraphics[width=10cm]{wade-parallel-trans}
\caption{A Factored Model for Parallel Translation of Lemmas and
Morphology}
\label{fig:parallel-trans}
\end{center}
\end{figure}

Table~\ref{tab:lemma+morph-results} shows the performance of this
system on the 2005 ACL Workshop Shared Task when a large language
model was used (trained with 950k sentences). As these experiments
show, the lemma+morphology model can lead to improvements through
better translation modeling, with a further gain from the addition of
a morphology LM.

\begin{table}[h]
\begin{center}
\begin{tabular}{|l|r|}
\hline
\bf Model & \bf BLEU \\ \hline \hline
{\it Baseline} & 25.10 \\ \hline
{\it Lemma + Morph} & 25.93 \\
{\it Lemma + Morph plus Morph LM} & 26.11 \\ \hline
\end{tabular}
\end{center}
\caption{Results from Lemma+Morphology Models}
\label{tab:lemma+morph-results}
\end{table}
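The target-side generation step can be pictured as a table, estimated
over the analyzed target side of the training data, that maps a
(lemma, morphology) pair back to a surface form. The sketch below is a
toy rendering of that recombination under our own simplifying
assumptions; the table entries and names are illustrative, not the
trained generation model from these experiments.

\begin{verbatim}
# Toy generation step: choose the most frequent surface form seen
# with a given (lemma, morphology) pair, backing off to the bare
# lemma when the pair was never observed in training.
from collections import Counter, defaultdict

gen_table = defaultdict(Counter)   # (lemma, morph) -> surface counts
gen_table[("casa", "fem-plural")]["casas"] += 10
gen_table[("blanco", "fem-plural")]["blancas"] += 7

def generate_surface(lemma, morph):
    """Re-compose a target surface form from translated factors."""
    candidates = gen_table.get((lemma, morph))
    if not candidates:
        return lemma               # back off to the lemma itself
    return candidates.most_common(1)[0][0]

# Lemma and morphology are translated independently, then recombined:
print(generate_surface("casa", "fem-plural"))   # -> casas
print(generate_surface("hoy", "X-X"))           # -> hoy (backoff)
\end{verbatim}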