Department of
Translation and Interpreting


Subject:   Machine translation.
Author:   Malik, Muhammad Ghulam Abbas
Year:   2010
Title:   Méthodes et outils pour les problèmes faibles de traduction [Methods and Tools for Weak Problems of Translation]
Place:   Grenoble
Publisher/Journal:   Université de Grenoble
Pages:   262
Language:   French.
Type:   Thesis.
Availability:   Open access
Abstract:   Given a source language L1 and a target language L2, a written translation unit S in L1 of n words may have an exponential number N = O(k^n) of valid translations T1, ..., TN. We are interested in the case where N is very small because of the proximity of the written forms of L1 and L2. Our domain of investigation is the class of pairs of language and writing system combinations (Li-Wi, Lj-Wj) such that there may be only one or a very small number of valid translations for any given S of Li written in Wi. The problem of translating a Hindi/Urdu sentence written in Urdu into an equivalent one in Devanagari falls in this class. We call the problem of translation for such a pair a weak translation problem. We have designed and experimented with methods of increasing complexity for solving instances of this problem, from simple finite-state transduction to the transformation of charts of partial syntax trees, with or without the inclusion of empirical (mainly probabilistic) methods. That leads to the identification of the translation difficulty of a (Li-Wi, Lj-Wj) pair as the degree of complexity of the translation methods achieving a desired goal (such as less than 15% error rate). Considering transliteration or transcription as a special case of translation, we have developed a method based on the definition of a universal intermediate transcription (UIT) for given groups of Li-Wi couples and used UIT as a phonetico-graphemic pivot. For handling interdialectal translation into languages with rich flexional morphology, we propose to perform a limited on-demand surface analysis into partial syntax trees and to use it to update and propagate features such as gender and number and to handle word boundary phenomena. Besides large-scale experiments, this work has led to the production of linguistic resources such as parallel and tagged corpora and of running systems, all freely available on the Web.
They include monolingual corpora, lexicons, morphological analyzers with limited vocabulary, phrase structure grammars of Hindi, Punjabi and Urdu, online web-services for transliteration between Hindi & Urdu, Punjabi (Shahmukhi) & Punjabi (Gurmukhi), etc. An interesting perspective is to apply our techniques to distant L-W pairs, for which they could efficiently produce active learning presentations in the form of multiple pidgin outputs. [Source: Author]
2001-2019 Universidad de Alicante DOI: 10.14198/bitra
The Spanish version of this page is the work of Javier Franco