Explorations in automated language classification

Eric W. Holman; Søren Wichmann; Cecil H. Brown; Viveka Velupillai; André Müller; Dik Bakker

doi:10.1515/FLIN.2008.331


Published/Copyright: February 16, 2009


From the journal Volume 42 Issue 3-4

An earlier paper, to which some authors of the present paper have contributed (Brown et al. 2008), describes a method for automating language classification based on the 100-item referent list of Swadesh (1955). Here we discuss a refinement of the method, involving calculation of relative stabilities of list items and reduction of the list to a shorter one by eliminating least stable items. The result is a 40-item referent list. The method for determining stabilities is explained, as well as a method for comparing the classificatory performance of different-sized reduced lists with that of the full 100-item list. A statistical investigation of the relationship of lexical similarity of languages to their geographical proximity is presented. Finally, we test the possibility that information involving typological features of languages can be combined with lexical data to enhance classificatory accuracy.

Keywords: language classification; lexicostatistics; word stabilities; Swadesh list

Max Planck Institute for Evolutionary Anthropology, Deutscher Platz 6, D–04103 Leipzig, Germany. e-mail: wichmann@eva.mpg.de

Received: 2007-11-25

Accepted: 2008-02-26

Published Online: 2009-02-16

Published in Print: 2008-November


https://doi.org/10.1515/FLIN.2008.331
