Sampling for variety

Matti Miestamo, Dik Bakker and Antti Arppe
Published/Copyright: 27 September 2016

Abstract

Variety sampling aims at capturing as much of the world’s linguistic variety as possible. The article discusses and compares two sampling methods designed for variety sampling: the Diversity Value method, in which sample languages are picked according to the diversity found in family trees, and the Genus-Macroarea method, in which genealogical stratification is primarily based on genera and areal stratification pays attention to the proportional representation of the genealogical diversity of macroareas. The pros and cons of the methods are discussed, some additional features are introduced to the Genus-Macroarea method, and the ability of both methods to capture crosslinguistic variety is tested with computerized simulations drawing on data in The world atlas of language structures database.
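
The idea of proportional areal representation mentioned above can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that each macroarea's share of the sample is proportional to its number of genera, and it uses largest-remainder rounding so that the quotas add up to the sample size. The function name `allocate_by_genera` and the genus counts are hypothetical placeholders, not figures from the article.

```python
from math import floor


def allocate_by_genera(genera_per_area, sample_size):
    """Split sample_size across areas in proportion to their genus counts.

    Largest-remainder rounding guarantees the quotas sum to sample_size.
    This is an illustrative assumption, not the article's exact procedure.
    """
    total = sum(genera_per_area.values())
    exact = {area: sample_size * n / total for area, n in genera_per_area.items()}
    quotas = {area: floor(share) for area, share in exact.items()}
    leftover = sample_size - sum(quotas.values())
    # Give the remaining slots to the areas with the largest fractional parts.
    by_remainder = sorted(exact, key=lambda a: exact[a] - quotas[a], reverse=True)
    for area in by_remainder[:leftover]:
        quotas[area] += 1
    return quotas


# Hypothetical genus counts, for illustration only.
genera = {"Area A": 50, "Area B": 30, "Area C": 20}
print(allocate_by_genera(genera, 7))
```

Largest-remainder rounding is only one of several ways to turn proportional shares into whole numbers of languages; any scheme that keeps the total fixed would serve the same illustrative purpose.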

Acknowledgements

We are grateful to Kaius Sinnemäki and the three anonymous reviewers for their valuable comments on earlier versions of this article. We also wish to thank the audience at the ALT 9 conference in Hong Kong in 2011.

Abbreviations

COMP: completeness
CS: core sample
DV: diversity value
E13: Ethnologue, 13th edn. (Grimes (ed.) 1996)
E15: Ethnologue, 15th edn. (Gordon (ed.) 2005)
E18: Ethnologue, 18th edn. (Lewis et al. (eds.) 2015)
ES: extended sample
GLOT: Glottolog (Hammarström et al. 2015)
GM: genus-macroarea
GS: genus sample
PS: primary sample
RS: restricted sample
SAT: saturation
WALS: World atlas of language structures (Haspelmath et al. (eds.) 2005; Dryer & Haspelmath (eds.) 2013)

Appendix: DV samples discussed in Section 6

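Table A-1: Numbers of languages included in DV samples based on E15.

Sample size 50 100 150 200 250 300 350 400 450 500 550 600 650 700 750 800 850 900
Africa 4 4 20 35 47 59 73 84 97 109 120 132 143 156 166 179 189 200
  Afro-Asiatic 1 1 4 7 9 12 15 18 21 23 26 29 31 34 36 39 41 44
  Khoisan 1 1 1 1 1 1 2 2 2 3 3 3 4 4 4 5 5 5
  Niger-Congo 1 1 12 21 29 36 43 49 56 63 69 76 82 89 95 102 108 114
  Nilo-Saharan 1 1 3 6 8 10 13 15 18 20 22 24 26 29 31 33 35 37
Eurasia 8 11 15 19 26 31 37 42 47 52 58 64 68 76 81 87 93 97
  Altaic 1 1 1 2 2 3 4 5 5 6 7 7 8 9 9 10 11 12
  Andamanese 0 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2
  Basque 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Chukotko-Kamchatkan 0 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2
  Dravidian 1 1 1 1 2 3 3 4 4 5 5 6 6 7 8 8 9 9
  Indo-European 1 1 4 7 11 14 17 20 23 26 29 32 34 38 40 43 46 48
  Japanese 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2
  Kartvelian 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2
  North Caucasian 1 1 1 1 2 2 3 3 4 4 5 5 5 6 6 7 8 8
  Uralic 1 1 1 1 2 2 3 3 4 4 5 6 6 7 7 8 8 9
  Yeniseian 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Yukaghir 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
Southeast Asia & Oceania 4 6 27 44 58 72 86 99 114 128 141 156 168 182 196 208 222 235
  Austro-Asiatic 1 1 3 4 6 8 9 11 13 15 17 19 20 22 24 25 27 28
  Austronesian 1 1 18 31 40 50 59 67 77 86 94 103 112 121 130 138 147 156
  Cant 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Hmong-Mien 0 1 1 1 1 1 1 2 2 2 3 3 3 3 4 4 4 5
  Sino-Tibetan 1 1 3 6 8 10 13 14 17 19 21 24 26 28 30 32 34 36
  Tai-Kadai 1 1 1 1 2 2 3 4 4 5 5 6 6 7 7 8 9 9
Australia & New Guinea 10 15 23 33 43 52 64 70 83 90 101 110 120 127 137 146 155 167
  Amto-Musan 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Australian 1 1 4 7 10 12 16 18 22 24 27 29 32 34 37 39 42 45
  Bayono-Awbono 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  East Bird’s Head 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2
  East Papuan 1 1 1 1 1 2 2 2 3 3 3 4 4 4 4 5 5 6
  Geelvink Bay 0 1 1 1 1 1 1 2 2 2 3 3 3 3 4 4 4 4
  Harakmbet 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Kwomtari-Baibai 0 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2
  Left May 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Lower Mamberamo 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Sepik-Ramu 1 1 1 2 3 4 5 5 6 7 8 9 10 11 12 12 13 14
  Sko 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2
  Torricelli 1 1 1 1 2 2 3 3 4 4 5 5 6 6 7 7 8 9
  Trans-New Guinea 1 1 6 12 17 22 27 30 36 39 44 48 52 56 60 65 69 73
  West Papuan 1 1 1 1 1 1 2 2 2 3 3 3 4 4 4 5 5 5
North America 11 27 27 29 32 38 39 49 50 57 61 65 74 77 83 88 95 101
  Algic 1 1 1 1 2 2 2 3 3 4 4 5 5 5 6 6 7 7
  Caddoan 0 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2
  Chimakuan 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Chumash 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Coahuiltecan 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Eskimo-Aleut 0 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2
  Gulf 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Hokan 1 1 1 2 2 3 3 4 4 5 6 6 7 7 8 9 9 10
  Huavean 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Iroquoian 1 1 1 1 1 1 1 2 2 2 3 3 3 3 4 4 4 4
  Keres 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Kiowa-Tanoan 0 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2
  Mayan 1 1 1 1 2 2 3 4 4 5 5 6 6 7 7 8 9 9
  Mixe-Zoque 0 1 1 1 1 1 1 1 1 2 2 2 2 2 2 3 3 3
  Muskogean 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2
  Na-Dene 1 1 1 1 1 2 2 3 3 4 4 4 5 5 6 6 7 7
  Oto-Manguean 1 1 1 1 2 3 3 4 4 5 5 6 7 7 8 8 9 10
  Penutian 0 1 1 2 2 3 3 4 5 5 6 7 7 8 8 9 10 10
  Salishan 0 1 1 1 1 2 2 3 3 3 4 4 4 5 5 6 6 7
  Siouan 0 1 1 1 1 1 1 2 2 2 2 2 3 3 3 3 4 4
  Subtiaba-Tlapanec 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Tarascan 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Totonacan 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2
  Uto-Aztecan 0 1 1 1 1 2 2 3 3 4 4 4 5 5 6 6 7 7
  Wakashan 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2
  Witotoan 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2
  Yuki 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
South America 13 35 36 38 42 46 49 54 57 62 67 71 75 80 85 90 94 98
  Alacalufan 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Arauan 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Araucanian 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Arawakan 1 1 1 1 1 1 2 2 2 3 3 3 4 4 4 5 5 5
  Arutani-Sape 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Aymaran 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Barbacoan 0 1 1 1 1 1 1 1 1 2 2 2 2 2 3 3 3 3
  Cahuapanan 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Carib 1 1 1 1 1 2 2 3 3 3 4 4 4 5 5 6 6 7
  Chapacura-Wanham 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2
  Chibchan 1 1 1 1 2 2 3 3 4 4 5 5 6 6 7 7 8 8
  Choco 1 1 1 1 1 2 2 2 3 3 3 4 4 4 5 5 6 6
  Chon 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Guahiban 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Hibito-Cholon 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Jivaroan 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Katukinan 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Lule-Vilela 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Macro-Ge 1 1 1 2 3 4 4 5 6 7 8 9 9 10 11 12 13 13
  Maku 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Mascoian 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Mataco-Guaicuru 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2
  Misumalpan 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Mura 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Nambiquaran 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Panoan 1 1 1 1 2 2 2 3 3 4 4 5 5 6 6 6 7 7
  Peba-Yaguan 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Quechuan 0 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2
  Salivan 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Tacanan 0 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2
  Tucanoan 1 1 1 1 1 2 2 3 3 3 4 4 4 5 5 6 6 6
  Tupi 0 1 1 2 3 3 4 5 5 6 7 8 8 9 10 11 11 12
  Uru-Chipaya 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Yanomam 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Zamucoan 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Zaparoan 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
Language isolate 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
Unclassified 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1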
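
A simple consistency check on Table A-1 is that, for every sample size, the six macroarea totals plus the "Language isolate" and "Unclassified" rows should add up to the sample size itself. The sketch below runs that check in Python. The figures are the top-level rows of Table A-1 as read from the table; the names `SIZES`, `top_level_rows` and `column_sums` are illustrative choices, not part of the article.

```python
# Sample sizes heading the 18 columns of Table A-1: 50, 100, ..., 900.
SIZES = list(range(50, 901, 50))

# Top-level rows of Table A-1 (DV samples based on E15), as read from the table.
top_level_rows = {
    "Africa": [4, 4, 20, 35, 47, 59, 73, 84, 97, 109, 120, 132, 143, 156, 166, 179, 189, 200],
    "Eurasia": [8, 11, 15, 19, 26, 31, 37, 42, 47, 52, 58, 64, 68, 76, 81, 87, 93, 97],
    "Southeast Asia & Oceania": [4, 6, 27, 44, 58, 72, 86, 99, 114, 128, 141, 156, 168, 182, 196, 208, 222, 235],
    "Australia & New Guinea": [10, 15, 23, 33, 43, 52, 64, 70, 83, 90, 101, 110, 120, 127, 137, 146, 155, 167],
    "North America": [11, 27, 27, 29, 32, 38, 39, 49, 50, 57, 61, 65, 74, 77, 83, 88, 95, 101],
    "South America": [13, 35, 36, 38, 42, 46, 49, 54, 57, 62, 67, 71, 75, 80, 85, 90, 94, 98],
    "Language isolate": [0] + [1] * 17,
    "Unclassified": [0] + [1] * 17,
}


def column_sums(rows):
    """Return the column-wise sums of equally long rows of numbers."""
    return [sum(column) for column in zip(*rows)]


# Collect every column whose total differs from its nominal sample size.
mismatches = [
    (size, total)
    for size, total in zip(SIZES, column_sums(top_level_rows.values()))
    if size != total
]
print(mismatches)  # an empty list means every column adds up to its sample size
```

The same helper can be applied to the family rows of a single macroarea to confirm that they sum to that macroarea's total.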
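Table A-2: Numbers of languages included in DV samples based on GLOT.

Sample size 50 100 150 200 250 300 350 400 450 500 550 600 650 700 750 800 850 900
Africa 7 18 17 31 38 56 73 87 100 115 128 142 155 167 180 191 206 219
  Afro-Asiatic 1 1 1 1 1 5 7 9 12 16 19 23 25 29 32 34 38 40
  Atlantic-Congo 1 1 1 1 6 20 35 45 55 63 71 80 88 96 104 111 119 126
  Central Sudanic 1 1 1 1 1 1 1 2 2 3 4 4 5 5 6 6 7 7
  Daju 0 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Dizoid 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Dogon 0 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 3
  Eastern Jebel 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Furan 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Gonga-Gimojan 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 3 3 3
  Heiban 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2
  Ijoid 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Kadugli-Krongo 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Katla-Tima 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Khoe-Kwadi 0 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2
  Koman 0 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Kresh-Aja 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Kuliak 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Kxa 0 1 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Maban 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Mande 1 1 1 1 1 1 1 2 2 3 3 3 4 4 5 5 6 6
  Mao 0 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Narrow Talodi 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Nilotic 0 1 1 1 1 1 1 1 1 2 2 2 3 3 3 4 4 4
  Nubian 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2
  Nyimang 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Rashad 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Saharan 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Songhay 0 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  South Omotic 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Surmic 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2
  Tama 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Temein 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Tuu 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
Eurasia 6 9 12 16 16 23 28 36 44 50 56 64 71 78 81 91 95 103
  Abkhaz-Adyge 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Chukotko-Kamchatkan 0 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Dravidian 0 1 1 1 1 1 1 1 2 2 3 3 4 4 4 5 5 6
  Great Andamanese 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Hurro-Urartian 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Indo-European 1 1 1 1 1 8 13 20 27 31 36 41 46 52 55 60 64 68
  Japonic 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2
  Jarawa-Onge 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Kartvelian 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Mongolic 1 0 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 3
  Nakh-Daghestanian 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 3 3 3
  Tungusic 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2
  Turkic 0 1 1 1 1 1 1 1 1 2 2 2 3 3 3 4 4 4
  Uralic 1 1 1 1 1 1 1 2 2 3 3 4 4 5 5 6 6 7
  Yeniseian 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Yukaghir 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
Southeast Asia & Oceania 4 4 5 5 10 28 50 67 87 101 116 130 144 160 174 188 203 215
  Austroasiatic 1 1 1 1 1 3 4 5 7 9 10 12 13 16 18 19 22 23
  Austronesian 1 1 1 1 5 15 29 38 48 55 62 69 76 84 90 97 104 110
  Hmong-Mien 0 0 1 1 1 1 1 1 1 1 2 2 2 2 3 3 3 3
  Sino-Tibetan 1 1 1 1 2 8 14 21 28 33 38 43 48 53 57 62 67 71
  Tai-Kadai 1 1 1 1 1 1 2 2 3 3 4 4 5 5 6 7 7 8
Australia & New Guinea 13 32 65 77 100 106 111 116 121 128 137 144 153 163 172 181 189 198
  Alor-Pantar 1 0 1 1 1 1 1 1 1 1 2 2 2 2 3 3 3 4
  Amto-Musan 0 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Angan 0 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 3
  Arafundi 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Awin-Pa 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Baibai-Fas 0 0 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Baining 0 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Bayono-Awbono 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Border 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Bosavi 0 0 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Bulaka River 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Bunaban 0 1 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Dagan 1 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Doso-Turumsa 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  East Bird’s Head 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  East Kutubu 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  East Strickland 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  East Timor-Bunaq 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Eastern Daly 0 1 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Eastern Trans-Fly 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Garrwan 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Geelvink Bay 0 1 1 1 1 1 1 1 1 1 2 2 2 2 2 3 3 3
  Giimbiyu 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Goilalan 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Gunwinyguan 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2
  Harakmbut 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Hatam-Mansim 0 0 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Inanwatan 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Inland Gulf 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Iwaidjan Proper 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Jarrakan 0 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Kaure-Narau 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Kayagar 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Kiwaian 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2
  Koiarian 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Kolopom 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Konda-Yahadian 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Kwalean 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Kwerbic 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Kwomtari 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Lakes Plain 0 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2
  Left May 0 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Lepki-Murkim 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Limilngan 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Lower Sepik-Ramu 1 1 1 1 1 1 1 1 1 2 2 2 3 3 3 4 4 4
  Mailuan 0 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2
  Mairasi 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Mangarrayi-Maran 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Maningrida 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Manubaran 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Marind 0 0 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Marrku-Wurrugu 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Maybrat 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Mirndi 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Mombum 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Mongol-Langam 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Monumbo 0 1 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Morehead-Wasur 0 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2
  Namla-Tofanma 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Ndu 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2
  Nimboran 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  North Bougainville 0 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  North Halmahera 0 0 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2
  North-Eastern Tasmanian 0 0 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Northern Daly 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Nuclear Eleman 0 0 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Nuclear Torricelli 1 1 1 1 1 1 2 3 3 4 5 5 6 7 8 8 9 10
  Nuclear Trans New Guinea 1 1 1 1 1 4 6 8 10 13 16 19 21 25 27 30 33 34
  Nyulnyulan 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Pahoturi 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Pama-Nyungan 1 1 1 1 1 4 6 8 10 12 15 18 21 24 26 29 32 33
  Pauwasi 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Piawi 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Senagi 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Sentani 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Sepik 1 0 1 1 1 1 1 1 2 2 2 3 3 3 4 4 4 5
  Sko 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Somahai 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  South Bird’s Head 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  South Bougainville 0 0 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  South-Eastern Tasmanian 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Southern Daly 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Suki-Gogodala 0 0 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Tangkic 1 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Taulil-Butam 0 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Teberan 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Tirio 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Tor-Orya 0 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 3
  Turama-Kikori 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Umbugarla 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Walio 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  West Bird’s Head 0 0 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  West Bomberai 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Western Daly 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Western Tasmanian 0 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Worrorran 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Yangmanic 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Yareban 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Yawa 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Yuat-Maramba 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2
North America 8 19 24 33 40 40 40 43 44 48 50 53 56 58 63 66 68 72
  Algic 1 1 1 1 1 1 1 2 2 3 3 4 4 5 5 6 6 7
  Athapaskan-Eyak-Tlingit 1 1 1 1 1 1 1 1 1 2 2 2 3 3 3 4 4 4
  Caddoan 0 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Chimakuan 0 0 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Chinookan 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Chumashan 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Cochimi-Yuman 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Coosan 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Eskimo-Aleut 0 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2
  Haida 0 0 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Huavean 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Iroquoian 1 0 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2
  Jicaquean 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Kalapuyan 0 0 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Keresan 0 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Kiowa-Tanoan 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Lencan 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Maiduan 0 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Mayan 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 3 3 3
  Misumalpan 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Miwok-Costanoan 0 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Mixe-Zoque 1 0 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 3
  Muskogean 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Otomanguean 0 1 1 1 1 1 1 2 2 3 3 4 4 5 6 6 7 7
  Palaihnihan 0 1 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Pomoan 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Sahaptian 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Salishan 0 1 1 1 1 1 1 1 2 2 2 3 3 3 4 4 4 5
  Shastan 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Siouan 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2
  Tarascan 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Tequistlatecan 0 0 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Totonacan 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Tsimshian 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Uto-Aztecan 1 1 1 1 1 1 1 2 2 3 4 4 5 5 6 6 7 7
  Wakashan 0 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Wintuan 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Xincan 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Yokutsan 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Yuki-Wappo 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
South America 12 16 27 37 44 45 46 49 52 56 61 65 69 72 78 81 87 91
  Araucanian 0 0 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Arawakan 1 1 1 1 1 2 2 3 4 5 6 7 8 9 10 10 12 13
  Arawan 0 0 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2
  Aymara 0 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Barbacoan 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Boran 1 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Bororoan 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Cahuapanan 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Cariban 1 1 1 1 1 1 1 2 2 3 3 4 4 5 6 6 7 7
  Chapacuran 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2
  Charruan 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Chibchan 0 1 1 1 1 1 1 1 1 1 2 2 2 2 3 3 3 3
  Chiquitano 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Chocoan 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 3
  Chonan 0 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Guahibo 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Guaicuruan 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Hibito-Cholon 0 0 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Huarpean 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Huitotoan 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Jivaroan 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Kakua-Nukak 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Kamakanan 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Kariri 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Katukinan 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Kawesqar 1 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Lengua-Mascoy 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Matacoan 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Nadahup 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Nambiquaran 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Nuclear Macro-Je 0 1 1 1 1 1 1 1 1 2 2 2 3 3 3 3 4 4
  Panoan 1 1 1 1 1 1 1 1 2 2 3 3 3 4 4 4 5 5
  Peba-Yagua 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Puri 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Quechuan 1 1 1 1 1 1 1 1 2 2 2 3 3 3 4 4 4 5
  Saliban 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Tacanan 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Ticuna-Yuri 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Tucanoan 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 3 3 3
  Tupian 1 1 1 1 1 1 2 3 3 4 5 6 7 7 8 9 10 11
  Uru-Chipaya 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Yanomam 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Zamucoan 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  Zaparoan 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
Language isolate 0 1 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1
Unclassified 0 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1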

References

Bakker, Dik. 2011. Language sampling. In Jae Jung Song (ed.), The Oxford handbook of linguistic typology, 100–127. Oxford: Oxford University Press.

Bell, Alan. 1978. Language samples. In Joseph H. Greenberg (ed.), Universals of human language, Vol. 1: Method & theory, 123–156. Stanford: Stanford University Press.

Bickel, Balthasar. 2007. Typology in the 21st century: Major current developments. Linguistic Typology 11. 239–251. doi:10.1515/LINGTY.2007.018

Bickel, Balthasar. 2008. A refined sampling procedure for genealogical control. Language Typology and Universals 61. 221–233. doi:10.1524/stuf.2008.0022

Bickel, Balthasar & Johanna Nichols. 2013. The Autotyp genealogy and geography database 2013 release. http://www.autotyp.uzh.ch/available.html

Bybee, Joan, Revere Perkins & William Pagliuca. 1994. The evolution of grammar: Tense, aspect and modality in the languages of the world. Chicago: University of Chicago Press.

Campbell, Lyle. 1997. American Indian languages: The historical linguistics of Native America. Oxford: Oxford University Press. doi:10.1093/oso/9780195094275.001.0001

Croft, William. 2003. Typology and universals. 2nd edn. Cambridge: Cambridge University Press.

Cysouw, Michael. 2011. Understanding transition probabilities. Linguistic Typology 15. 415–431. doi:10.1515/lity.2011.028

Dahl, Östen. 2008. An exercise in a posteriori language sampling. Language Typology and Universals 61. 208–220. doi:10.1524/stuf.2008.0021

Dryer, Matthew S. 1989. Large linguistic areas and language sampling. Studies in Language 13. 257–292. doi:10.1075/sl.13.2.03dry

Dryer, Matthew S. 1992. The Greenbergian word order correlations. Language 68. 81–138. doi:10.1353/lan.1992.0028

Dryer, Matthew S. 2000. Counting genera vs. counting languages. Linguistic Typology 4. 334–350.

Dryer, Matthew S. 2005a. Genealogical language list. In Haspelmath et al. (eds.) 2005, 584–644. Updates to the classification available in the online version of 2008 at http://blog.wals.info/errata-in-the-printed-edition-of-2005/

Dryer, Matthew S. 2005b. Order of subject, object and verb. In Haspelmath et al. (eds.) 2005, 330–333.

Dryer, Matthew S. 2013. Genealogical language list. In Dryer & Haspelmath (eds.) 2013. http://wals.info/languoid/genealogy

Dryer, Matthew S. & Martin Haspelmath (eds.). 2013. The world atlas of language structures online. Leipzig: Max-Planck-Institut für evolutionäre Anthropologie. http://wals.info/

Gordon, Raymond G., Jr. (ed.). 2005. Ethnologue: Languages of the world. 15th edn. Dallas: SIL International. http://archive.ethnologue.com/15/web.asp

Grimes, Barbara F. (ed.). 1996. Ethnologue: Languages of the world. 13th edn. Dallas: Summer Institute of Linguistics.

Grimes, Joseph E. & Barbara F. Grimes. 1996. Ethnologue: Language family index to the thirteenth edition of the Ethnologue. Dallas: Summer Institute of Linguistics.

Hammarström, Harald. 2009. Sampling and genealogical coverage in WALS. Linguistic Typology 13. 105–119. doi:10.1515/LITY.2009.006

Hammarström, Harald & Mark Donohue. 2014. Some principles on the use of macro-areas in typological comparison. Language Dynamics and Change 4. 167–187. doi:10.1163/22105832-00401001

Hammarström, Harald, Robert Forkel, Martin Haspelmath & Sebastian Bank. 2015. Glottolog 2.4. Leipzig: Max-Planck-Institut für evolutionäre Anthropologie. http://glottolog.org

Haspelmath, Martin, Matthew Dryer, David Gil & Bernard Comrie (eds.). 2005. The world atlas of language structures. Oxford: Oxford University Press.

Henriksen, Carol & Johan van der Auwera. 1994. The Germanic languages. In Ekkehard König & Johan van der Auwera (eds.), The Germanic languages, 1–18. London: Routledge.

Himmelmann, Nikolaus P. 2000. Towards a typology of typologies. Sprachtypologie und Universalienforschung 53. 5–12. doi:10.1524/stuf.2000.53.1.5

Janhunen, Juha. 2009. Proto-Uralic – what, where, and when? In Jussi Ylikoski (ed.), The quasquicentennial of the Finno-Ugrian Society (Mémoires de la Société Finno-Ougrienne 258), 57–78. Helsinki: Finno-Ugrian Society.

Koptjevskaja-Tamm, Maria & Bernhard Wälchli. 2001. The Circum-Baltic languages: An areal-typological approach. In Östen Dahl & Maria Koptjevskaja-Tamm (eds.), Circum-Baltic languages, Vol. 2: Grammar and typology, 615–750. Amsterdam: Benjamins. doi:10.1075/slcs.55.15kop

Levinson, Stephen C., Simon J. Greenhill, Russell D. Gray & Michael Dunn. 2011. Universal typological dependencies should be detectable in the history of language families. Linguistic Typology 15. 509–534. doi:10.1515/lity.2011.034

Lewis, M. Paul, Gary F. Simons & Charles D. Fennig (eds.). 2015. Ethnologue: Languages of the world. 18th edn. Dallas: SIL International. http://www.ethnologue.com/

Maslova, Elena. 2000. A dynamic approach to the verification of distributional universals. Linguistic Typology 4. 307–333. doi:10.1515/lity.2000.4.3.307

Miestamo, Matti. 2003. Clausal negation: A typological study. Helsinki: Helsingin yliopisto doctoral dissertation.

Miestamo, Matti. 2005. Standard negation: The negation of declarative verbal main clauses in a typological perspective. Berlin: Mouton de Gruyter. doi:10.1515/9783110197631

Miestamo, Matti. 2009. Implicational hierarchies and grammatical complexity. In Geoffrey Sampson, David Gil & Peter Trudgill (eds.), Language complexity as an evolving variable, 80–97. Oxford: Oxford University Press. doi:10.1093/oso/9780199545216.003.0006

Murdock, George Peter. 1968. World sampling provinces. Ethnology 7. 305–326. doi:10.2307/3772896

Nichols, Johanna. 1992. Linguistic diversity in space and time. Chicago: University of Chicago Press. doi:10.7208/chicago/9780226580593.001.0001

Perkins, Revere D. 1989. Statistical techniques for determining language sample size. Studies in Language 13. 293–315. doi:10.1075/sl.13.2.04per

Perkins, Revere D. 1992. Deixis, grammar, and culture. Amsterdam: Benjamins. doi:10.1075/tsl.24

Perkins, Revere D. 2000. The view from hologeistic linguistics (Commentary on Maslova 2000). Linguistic Typology 4. 350–353. doi:10.1515/lity.2000.4.3.334

Rankin, Robert L. 1993. On Siouan chronology. Paper presented at the Annual Meeting of the American Anthropological Association, Washington, DC.

Rijkhoff, Jan. 2009. On the (un)suitability of semantic categories. Linguistic Typology 13. 95–104. doi:10.1515/LITY.2009.005

Rijkhoff, Jan & Dik Bakker. 1998. Language sampling. Linguistic Typology 2. 263–314. doi:10.1515/lity.1998.2.3.263

Rijkhoff, Jan, Dik Bakker, Kees Hengeveld & Peter Kahrel. 1993. A method of language sampling. Studies in Language 17. 169–203. doi:10.1075/sl.17.1.07rij

Ruhlen, Merritt. 1991. A guide to the world’s languages, Vol. 1: Classification, with a postscript on recent developments. Stanford: Stanford University Press. Originally published in 1987 without postscript.

Stassen, Leon. 1997. Intransitive predication. Oxford: Oxford University Press. doi:10.1093/oso/9780198236931.001.0001

Stolz, Thomas & Traude Gugeler. 2000. Comitative typology. Sprachtypologie und Universalienforschung 53. 53–61. doi:10.1524/stuf.2000.53.1.53

Tomlin, Russell S. 1986. Basic word order: Functional principles. London: Croom Helm.

Voegelin, Charles F. & Florence M. Voegelin. 1977. Classification and index of the world’s languages. New York: Elsevier.

Wichmann, Søren & David Kamholz. 2008. A stability metric for typological features. Sprachtypologie und Universalienforschung 61. 251–262. doi:10.1524/stuf.2008.0024

Received: 2013-6-16
Revised: 2015-12-30
Published Online: 2016-9-27
Published in Print: 2016-10-1

©2016 by De Gruyter Mouton

Downloaded on 22.11.2025 from https://www.degruyterbrill.com/document/doi/10.1515/lingty-2016-0006/html?lang=de