John Benjamins Publishing Company

Chapter

Describing a translational corpus

Abstract

A single corpus can be described in several ways. We consider how the frequencies of linguistic features may be quantified: by their “average” occurrence, by their dispersion across text segments, and by whether they follow the familiar “bell curve” of a normal distribution. We describe how to determine the corpus size needed to measure these properties with a given degree of confidence. We consider “aboutness”: the extent to which individual linguistic features characterise the corpus as a whole. Finally, we describe vocabulary richness, the extent to which the author of a text keeps introducing new vocabulary, and collocations: groups of words that occur together more often than one would expect by chance.

Chapters in this book

Prelim pages i
Table of contents v
Preface vii
List of contributors ix
Part I. Theoretical exploration
Explicit and tacit 3
Regression analysis in translation studies 35
Hypothesis testing in corpus-based literary translation studies 53
Part II. Essential corpus statistics
Compiling a Norwegian-Spanish parallel corpus 75
Describing a translational corpus 115
Clustering a translational corpus 149
Part III. Quantitative exploration of literary translation
A Corpus study of early English translations of Cao Xueqin’s Hongloumeng 177
Determining translation invariant characteristics of James Joyce’s Dubliners 209
The great mystery of the (almost) invisible translator 231
Part IV. Quantitative exploration of translation lexis
Translation and scientific terminology 251
The games translators play 275
Multivariate analyses of affix productivity in translated English 301
Lexical lectometry in corpus-based translation studies 325
Appendices 347
Index 357

Quantitative Methods in Corpus-Based Translation Studies

This chapter is in the book Quantitative Methods in Corpus-Based Translation Studies

https://doi.org/10.1075/scl.51.05oak

Downloaded on 5.4.2026 from https://www.degruyterbrill.com/document/doi/10.1075/scl.51.05oak/html