
III. Existing corpora

This chapter is in the book Volume 1
20. Well-known and influential corpora

1. Introduction
2. National corpora
3. Monitor corpora
4. Corpora of the Brown family
5. Synchronic corpora
6. Diachronic corpora
7. Spoken corpora
8. Academic and professional English corpora
9. Parsed corpora
10. Developmental and learner corpora
11. Multilingual corpora
12. Non-English monolingual corpora
13. Well-known distributors of corpus resources
14. Conclusion
15. Appendix: URLs
16. Literature

1. Introduction

As corpus building is an activity that takes time and costs money, readers may wish to use ready-made corpora to carry out their work. However, as a corpus is always designed for a particular purpose, the usefulness of a ready-made corpus must be judged with regard to the purpose to which a user intends to put it. There are thousands of corpora in the world, but most of them are created for specific research projects and are not publicly available. This article introduces well-known and influential corpora, which are grouped in terms of their primary uses so that readers will find it easier to choose corpus resources suitable for their particular research questions. Note, however, that overlaps are inevitable in our classification. It is used in this article simply to give a better account of the primary uses of the relevant corpora. The higher number of English corpora covered here might reflect the fact that English was the forerunner in corpus research, though, as we will see shortly, many other languages are catching up. Information on the web site addresses for the corpora discussed in this article is given in the appendix.

2. National corpora

National corpora are normally general reference corpora which are supposed to represent the national language of a country. They are balanced with regard to genres and domains that typically represent the language under consideration. While an ideal national corpus should cover proportionally both written and spoken language, most exist-
© 2008 Walter de Gruyter GmbH & Co. KG, Genthiner Str. 13, 10785 Berlin.

Downloaded on 18.9.2025 from https://www.degruyterbrill.com/document/doi/10.1515/9783110211429.1.383/html