Departamento de
Traducción e Interpretación


Tema:   Automática.
Autor:   Sankaran, Baskaran
Año:   2013
Título:   Improvements in Hierarchical Phrase-based Statistical Machine Translation
Lugar:   Burnaby
Editorial/Revista:   Simon Fraser University
Idioma:   Inglés
Tipo:   Tesis
ISBN/ISSN/DOI:   ISBN: 9780499241436.
Resumen:   Hierarchical phrase-based translation (Hiero) is a statistical machine translation (SMT) model that encodes translation as a synchronous context-free grammar derivation between source and target language strings (Chiang, 2005; Chiang, 2007). Hiero models are more powerful than phrase-based models in capturing complex source-target reordering as well as discontiguous phrases, while being easier to estimate and decode with compared to their full syntax-based counterparts. In this thesis, we propose improvements to two broad aspects of the Hiero translation pipeline: i) learning Hiero translation model and estimating their parameters and ii) parameter tuning for discriminative log-linear models that are used to decode with such features. We use our own open-source implementation of Hiero called 'Kriya' (Sankaran et al., 2012b) for all the experiments in this thesis. This thesis contains the following specific contributions: (1) We propose a Bayesian model for learning Hiero grammars as an alternative to the heuristic method usually used in Hiero. Our model learns a peaked distribution of grammars, which consistently performs better than the heuristically extracted grammars across several language pairs (Sankaran et al., 2013a). (2) We propose a novel 'unified-cascade framework' for jointly learning alignments and the Hiero translation rules by removing the disconnect between the alignments and extracted synchronous context-free grammar. This is the first time a joint training framework is being proposed for Hiero, where we iterate the two step inference so that it learns in alternate iterations the phrase alignments and then the Hiero rules that are consistent with alignments. (3) We extend our Bayesian model for extracting 'compact' Hiero translation rules using arity-1 grammars, resulting in up to 57% reduction in model size while retaining the translation performance (Sankaran et al., 2011; Sankaran et al., 2012a). (4) We propose several novel approaches for parameter tuning of discriminative log-linear models for SMT which can be used for jointly optimizing towards multiple evaluation metrics. We show that our methods for multi-objective tuning for SMT yield substantial gains in translation quality measured through automatic as well as human evaluations (Sankaran et al., 2013b; Duh et al., 2013). [Source: Author]
2001-2021 Universidad de Alicante DOI: 10.14198/bitra
Comentarios o sugerencias
La versión española de esta página es obra de Javier Franco
Nueva búsqueda
European Society for Translation Studies Ministerio de Educación Ivitra : Institut Virtual Internacional de Traducció asociación ibérica de estudios de traducción e interpretación