Departamento de
Traducción e Interpretación


Subject:   Corpus. Research.
Author:   Fleuri, Lilian Jurkevicz
Year:   2013
Title:   Uma proposta de sistematização metodológica para compilação de corpus paralelo bilíngue e de pequena dimensão [A proposal for the methodological systematization of the compilation of small bilingual parallel corpora]
Place:   Florianópolis (Santa Catarina)
Publisher/Journal:   Universidade Federal de Santa Catarina - UFSC
Pages:   425
Language:   Portuguese
Type:   Thesis
Availability:   Open access
Abstract:   Within the context of research in the CORDIALL and TRACOR projects, developed at UFMG and UFSC (Brazil), this PhD thesis presents a methodological proposal for corpus compilation, based on the profiles and needs of 20 Master's theses at the interface of Translation Studies, Corpus Linguistics, and Systemic-Functional Linguistics, developed in Brazil between 2003 and 2010. The methodological and theoretical concepts that guide this thesis are drawn from the Corpus-Based Method in Corpus Linguistics (Barnbrook, 1996; Kennedy, 1998; Bowker, 2001; Mason, 2008) and from Corpus-Based Translation Studies (cf. Baker, 1995; Olohan, 2004; Vasconcellos, 2009; Assis, 2012; Feitosa, 2005; Fernandes, 2006). The methods followed in this research consist of: collecting and describing the 20 Master's theses mentioned above; studying their methods; creating a fast and economical methodological proposal for corpus compilation, using the programming features of a word processor and a spreadsheet application; and testing this method in a Pilot Study. The analysis of the methodological profile of these 20 Master's theses identifies the following processes in their compilation of small bilingual parallel corpora: (i) corpus preparation for semi-automatic alignment; (ii) alignment; (iii) corpus annotation and annotation editing; and (iv) data quantification. Nevertheless, the analysis points to methodological inconsistencies in the corpus compilation processes, which can impair the investigation itself or its continuation in further studies. The inconsistencies concern: (i) the large amount of time spent on the compilation processes; (ii) the high number of stages involved in a single process; (iii) the high number of transitions between different software programs; and (iv) the large number of documents produced.
Based on these results, this thesis proposes to resolve such inconsistencies by creating an efficient method of corpus compilation that aims to: (i) speed up the compilation processes; (ii) reduce the number of stages involved in each process; (iii) reduce the number of software programs accessed during corpus compilation; (iv) reduce the number of different documents produced; and (v) make corpus annotation more flexible. The proposal is developed in MS Office software (MS Word and MS Excel). Templates with macros and formulas are created and tested in a Pilot Study, whose results are compared with the corpus compilation results in Fleuri (2006). The methodological proposal proves to speed up the processes of (i) corpus preparation for alignment, (ii) alignment, and (iii) data quantification, and to make corpus annotation more flexible and the data display more organized (in tables and graphs). Compared with Fleuri (2006), the Pilot Study reduced the total time involved in corpus compilation to 1/4; the total number of stages to 1/2; the total number of transitions between different software programs to 1/5; and the number of documents produced to less than 1/2. [Source: Author]
Acknowledgements:   Record supplied by Katia Aily Franco de Camargo (Universidade Federal do Rio Grande do Norte, UFRN).
