Subject:   Automatic. Corpus. Terminology. Technical. Genre.
Author:   Clouet, Elizaveta Loginova
Year:   2014
Title:   Traitement automatique des termes composés: segmentation, traduction et variation [Automatic management of compound terms: segmentation, translation and variation]
Place:   Nantes
Publisher/Journal:   Université de Nantes
Pages:   160
Language:   French.
Type:   Thesis.
Availability:   Open access
Abstract:   The number of specialized terms in documents grows continuously, at a pace that terminology standardization organizations struggle to follow. Methods for building bilingual term lexicons from text corpora offer a solution. Our thesis addresses this topic: bilingual lexicon acquisition from comparable corpora. Compound terms (terms that contain several roots but form a single graphical unit) are challenging for natural language processing applications. Because of their graphical form, they are often handled in the same way as single-word terms, which prevents their semantic complexity from being captured. Our involvement in an evaluation of automatic terminology extraction allowed us to check our hypothesis: compound terms need specific processing in a multilingual context. We proposed a method for compound term recognition and splitting that combines language-independent and language-specific features. It obtained results comparable with those of state-of-the-art methods, was validated on a sample of languages from several families (Germanic, Slavic and Romance), and was adapted to specialized domains (tested on two domains: wind energy and breast cancer). We used the resulting segmentations for the compositional translation of compound terms and for the recognition of their multi-word variants in specialized texts. These two experiments show that compound splitting benefits the bilingual term lexicon acquisition task. [Source: Author]
