On identification of bilingual lexical bundles for translation purposes
Łukasz Grabowski
Abstract
Grounded in phraseology and corpus linguistics, this paper explores the use of bilingual lexical bundles to improve the naturalness and textual fit of translated texts. More specifically, the study attempts to identify lexical bundles, that is, recurrent sequences of 3–7 words, with similar discourse functions in a purpose-designed comparable corpus of English and Polish patient information leaflets, with 100 text samples in each language. Because of cross-linguistic differences, we additionally apply a number of formal criteria in order to filter out the bundles in each subcorpus. The results show that bilingual lexical bundles that have overlapping discourse functions and are extracted from comparable corpora hold unexplored potential for machine translation, computer-assisted translation and bilingual lexicography.
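As a rough illustration of what identifying "recurrent sequences of 3–7 words" can involve, the following minimal Python sketch counts word n-grams across a set of text samples and keeps those that recur. It is not the chapter's actual procedure: the function name `extract_bundles`, the whitespace tokenization, and the thresholds `min_freq` and `min_texts` are assumptions made for illustration, and the formal filtering criteria mentioned in the abstract are not reproduced here.

```python
from collections import Counter


def extract_bundles(texts, min_n=3, max_n=7, min_freq=2, min_texts=2):
    """Return {ngram: (frequency, number_of_texts)} for recurrent word sequences.

    Hypothetical helper: the thresholds and the naive whitespace
    tokenization are illustrative assumptions, not values from the chapter.
    """
    freq = Counter()        # total occurrences of each n-gram
    text_range = Counter()  # number of distinct texts containing each n-gram
    for text in texts:
        tokens = text.lower().split()
        seen_in_text = set()
        for n in range(min_n, max_n + 1):
            for i in range(len(tokens) - n + 1):
                gram = tuple(tokens[i:i + n])
                freq[gram] += 1
                seen_in_text.add(gram)
        text_range.update(seen_in_text)
    # Keep only sequences that are frequent enough and spread across texts.
    return {
        gram: (freq[gram], text_range[gram])
        for gram in freq
        if freq[gram] >= min_freq and text_range[gram] >= min_texts
    }


# Invented sample sentences, standing in for leaflet text samples.
samples = [
    "ask your doctor or pharmacist for advice",
    "please ask your doctor or pharmacist",
    "keep out of the reach of children",
]
bundles = extract_bundles(samples)
for gram, (count, spread) in sorted(bundles.items()):
    print(" ".join(gram), count, spread)
```

Running the same function separately on the English and the Polish subcorpus would yield two candidate lists; pairing bundles by discourse function across the two languages is a further, largely manual step that this sketch does not attempt.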
Chapters in this book
- Prelim pages
- Table of contents
- About the editors
- Multiword units in machine translation and translation technology
Part 1. Multiword units in machine translation
- Analysing linguistic information about word combinations for a Spanish-Basque rule-based machine translation system
- How do students cope with machine translation output of multiword units? An exploratory study
- Aligning verb + noun collocations to improve a French-Romanian FSMT system
Part 2. Multiword units in multilingual NLP applications
- Multiword expressions in multilingual information extraction
- A multilingual gold standard for translation spotting of German compounds and their corresponding multiword units in English, French, Italian and Spanish
- Dutch compound splitting for bilingual terminology extraction
Part 3. Identification and translation of multiword units
- A flexible framework for collocation retrieval and translation from parallel and comparable corpora
- On identification of bilingual lexical bundles for translation purposes
- The quest for Croatian idioms as multiword units
- Corpus analysis of Croatian constructions with the verb doći ‘to come’
- Anaphora resolution, collocations and translation
- Index