Describing a translational corpus
Michael P. Oakes
Abstract
A single corpus can be described in several ways. We consider how the frequencies of linguistic features may be quantified: in terms of their “average” occurrence, their dispersion among text segments, and whether they follow the familiar “bell curve” characteristic of a normal distribution. We describe how to determine the corpus size needed to measure these quantities with a required degree of confidence. We consider “aboutness”: the extent to which individual linguistic features characterise the corpus as a whole. We describe vocabulary richness, the extent to which the author of a text constantly brings in new vocabulary, and collocations: groups of words that are found together more often than one would expect by chance.
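As a rough illustration of the kinds of measures the abstract names, the sketch below computes a word's mean frequency and spread across text segments, a type-token ratio as one simple proxy for vocabulary richness, and pointwise mutual information (PMI) as one common collocation score. The choice of these particular measures, the function names, and the toy data are assumptions made for illustration; the chapter itself may use different statistics.

```python
import math
from collections import Counter
from statistics import mean, pstdev


def frequency_profile(segments, word):
    """Mean and population standard deviation of a word's count per segment.

    `segments` is a list of token lists (hypothetical input format).
    """
    counts = [segment.count(word) for segment in segments]
    return mean(counts), pstdev(counts)


def type_token_ratio(tokens):
    """Distinct word types divided by total tokens (a simple richness proxy)."""
    return len(set(tokens)) / len(tokens)


def pmi(tokens, w1, w2):
    """Pointwise mutual information of the adjacent pair (w1, w2), in bits.

    Returns -inf when the pair never occurs, since log(0) is undefined.
    """
    n = len(tokens)
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    joint = bigrams[(w1, w2)]
    if joint == 0:
        return float("-inf")
    p_joint = joint / (n - 1)  # there are n - 1 adjacent pairs
    p1 = unigrams[w1] / n
    p2 = unigrams[w2] / n
    return math.log2(p_joint / (p1 * p2))


if __name__ == "__main__":
    # Toy data, invented for this example only.
    segments = [["the", "cat", "sat"], ["the", "the", "dog"], ["a", "dog", "ran"]]
    print(frequency_profile(segments, "the"))
    tokens = ["new", "york", "new", "york", "city"]
    print(type_token_ratio(tokens))
    print(pmi(tokens, "new", "york"))
```

A positive PMI means the pair co-occurs more often than the individual word frequencies would predict under independence, which matches the abstract's informal definition of a collocation. The type-token ratio is sensitive to text length, so it is only meaningful when comparing samples of similar size.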
Chapters in this book
- Prelim pages i
- Table of contents v
- Preface vii
- List of contributors ix
Part I. Theoretical exploration
- Explicit and tacit 3
- Regression analysis in translation studies 35
- Hypothesis testing in corpus-based literary translation studies 53
Part II. Essential corpus statistics
- Compiling a Norwegian-Spanish parallel corpus 75
- Describing a translational corpus 115
- Clustering a translational corpus 149
Part III. Quantitative exploration of literary translation
- A Corpus study of early English translations of Cao Xueqin’s Hongloumeng 177
- Determining translation invariant characteristics of James Joyce’s Dubliners 209
- The great mystery of the (almost) invisible translator 231
Part IV. Quantitative exploration of translation lexis
- Translation and scientific terminology 251
- The games translators play 275
- Multivariate analyses of affix productivity in translated English 301
- Lexical lectometry in corpus-based translation studies 325
- Appendices 347
- Index 357