Departamento de
Traducción e Interpretación


Subject:   Machine translation. Quality. Problem.
Author:   Henriquez Q., Carlos Alberto
Year:   2014
Title:   Improving statistical machine translation through adaptation and learning
Place:   Barcelona
Publisher/Journal:   Universitat Politècnica de Catalunya
Pages:   109
Language:   English.
Type:   Thesis.
Availability:   Open access.
Contents:   1. State of the Art; 2. Translation-based Word Alignment; 3. Derived Units.
Abstract:   This thesis proposes a new method to improve a Statistical Machine Translation (SMT) system using post-editions of translation outputs. The strategy is related to domain adaptation, where the in-domain data correspond to post-editions coming from real users of the SMT system. The method compares the post-editions with the translation output in order to automatically detect where the decoder made a mistake and learn from it. Once the errors have been detected, a word alignment is computed between input and post-edition to extract translation units, which are then incorporated into the baseline system to fix those errors in future translations. Results show statistically significant improvements with a post-edited collection that is only 0.5% of the size of the training material. A qualitative analysis is also carried out to validate these results. Improvements are mostly lexical and word-reordering corrections, followed by morphological corrections. The strategy, which introduces the concepts of Augmented Corpus, similarity function and Derived Units, is tested with two SMT paradigms (N-gram-based and Phrase-based), two language pairs (Catalan-Spanish and English-Spanish) and in different domain adaptation scenarios, including an open-world domain where the system was adapted to requests from any domain collected from real users over the internet, all giving similar results. The results obtained are part of project FAUST (Feedback Analysis for User adaptive Statistical Translation), a project from the Seventh Framework Programme of the European Commission. [Source: Author]
Acknowledgements:   Record supplied by the Departament de Traducció i Interpretació i Estudis de l'Àsia Oriental (Universitat Autònoma de Barcelona).
2001-2021 Universidad de Alicante DOI: 10.14198/bitra
The Spanish version of this page is the work of Javier Franco