Chapter 3. Word alignment in a parallel corpus of Old English prose
Javier Martín Arista
Abstract
This chapter proposes a model of syntactic annotation for the Parallel Corpus of Old English Prose, an aligned corpus of Old English and Present Day English texts. The research focuses on areas of syntactic divergence between the aligned texts. Syntactic divergence is described in terms of four types of alignment asymmetry (markedness, constituency, order, and configuration) and is represented by means of two components: a structural description and a dependency tree. The main conclusion is that these two components constitute a historical micro-grammar that identifies stability and change with respect to specific categories and constructions.
Chapters in this book
- Prelim pages
- Table of contents
- Corpus resources and tools
Part I. Corpus resources and tools
- Chapter 1. Now what?
- Chapter 2. ZHEN
- Chapter 3. Word alignment in a parallel corpus of Old English prose
- Chapter 4. Semantic textual similarity based on deep learning
- Chapter 5. TAligner 3.0
- Chapter 6. Developing a corpus-informed tool for Spanish professionals writing specialised texts in English
Part II. Corpus-based studies and explorations
- Chapter 7. English and Spanish discourse markers in translation
- Chapter 8. The discourse markers well and so and their equivalents in the Portuguese and Turkish subparts of the TED-MDB corpus
- Chapter 9. Variation of evidential values in discourse domains
- Chapter 10. The translation for dubbing of Westerns in Spain
- Chapter 11. Generic analysis of mobile application reviews in English and Spanish
- Chapter 12. Exploring variation in translation with probabilistic language models
- Chapter 13. Binomial adverbs in Germanic and Romance Languages
- Index