Using Corpora in the Calculation of Language Relationships
Anke Lüdeling
Abstract
This paper explores the relationships between traditional genetic trees of languages and (unrooted) similarity trees. Genetic trees are typically based on lexical and phonological data and ignore information from other levels of analysis. Similarity trees can be computed directly from corpus data. Since a corpus can be seen as a set of corresponding strings (text, annotation tags), string comparison methods from bioinformatics can be used to compute 'distance scores' between languages. Similarity trees based on phonological and syntactic features are presented for five historical varieties of German, and the scope and limits of such surface-based methods in the calculation of language relationships are discussed.
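To make the idea of a string-based 'distance score' concrete, here is a minimal sketch in Python. It is not the method used in the paper: it assumes plain Levenshtein (edit) distance, normalized by sequence length, as a stand-in for the string comparison methods from bioinformatics that the abstract mentions. The variety names and tag sequences are invented toy data, and the function names `edit_distance`, `normalized_distance`, and `pairwise_distances` are hypothetical.

```python
from itertools import combinations


def edit_distance(a, b):
    """Levenshtein distance between two sequences (strings or lists of tags)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def normalized_distance(a, b):
    """Edit distance divided by the longer length, giving a score in [0, 1]."""
    longest = max(len(a), len(b))
    return edit_distance(a, b) / longest if longest else 0.0


def pairwise_distances(samples):
    """Map each unordered pair of variety names to its distance score."""
    return {
        (p, q): normalized_distance(samples[p], samples[q])
        for p, q in combinations(sorted(samples), 2)
    }


# Invented toy data: one part-of-speech tag sequence per hypothetical variety.
toy_samples = {
    "variety_A": ["DET", "NOUN", "VERB", "ADV"],
    "variety_B": ["DET", "NOUN", "ADV", "VERB"],
    "variety_C": ["NOUN", "VERB", "DET", "NOUN"],
}

for pair, score in pairwise_distances(toy_samples).items():
    print(pair, round(score, 2))
```

A table of pairwise scores like this one is the kind of input from which an unrooted similarity tree could be built by a clustering or tree-building procedure; that step is left out of the sketch.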
© 2014 by Walter de Gruyter Berlin/Boston
Articles in the same Issue
- Masthead
- Contents
- Editorial
- Introduction
- Pedagogical Applications of Corpora: Some Reflections on the Current Scope and a Wish List for Future Developments
- How Reliable are the Results? Comparing Corpus-Based Studies of the Present Perfect
- Distributional Data and Grammatical Structures: The Case of So-Called 'Subject Extraposition'
- The Distribution of Also and Too: A Preliminary Corpus Study
- How Random is a Corpus? The Library Metaphor
- Some Proposals towards a More Rigorous Corpus Linguistics
- Corpora and (the Need for) Other Methods in a Study of Lancashire Dialect
- Using Corpora in the Calculation of Language Relationships
- Book Review
- The Authors of This Issue