Book
Programming for Corpus Linguistics
How to Do Text Analysis with Java
Oliver Mason
Language: English
Published/Copyright: 2000
About this book
The ability to program a computer has become increasingly important in work that involves corpora. Specialised research needs can no longer be met by available software, and purchasing customised programs is usually not an option. This book enables the researcher to write programs for text and corpus processing. Useful techniques are illustrated with the popular programming language Java, which is very well suited for handling textual data, and at the same time easy to learn.
Key Features
- a general introduction to programming for readers with a linguistic background
- a practical introduction to corpus linguistics for readers with a programming background who are new to corpus processing
- a guide to relevant aspects of Java which will be useful for text processing
- a variety of sample programs which are in themselves useful tools for corpus research
Topics
- Frontmatter (p. i)
- Contents (p. v)
Part I. Programming and Corpus Linguistics
- 1. Introduction (p. 3)
- 2. Introduction to Basic Programming Concepts (p. 13)
- 3. Basic Corpus Concepts (p. 31)
- 4. Basic Java Programming (p. 47)
- 5. The Java Class Library (p. 61)
- 6. Input/Output (p. 97)
- 7. Processing Plain Text (p. 133)
- 8. Dealing with Annotations (p. 153)
Part II. Language Processing Examples
- 9. Stemming (p. 179)
- 10. Part of Speech Tagging (p. 195)
- 11. Collocation Analysis (p. 213)
Part III. Appendices
- 12. Appendix (p. 231)
- Index (p. 237)
Publishing information
eBook published on: February 15, 2022
eBook ISBN: 9781474470780
Pages and Images/Illustrations in book
Main content: 256
Keywords for this book
Language & Linguistics
Audience(s) for this book
College/higher education