A network analysis of the semantic evolution of ‘fruit’ and ‘stone’ in Tibeto-Burman languages

Yu Li; Mengyang Qiu

doi:10.1515/psicl-2024-0024

Article Open Access

Published/Copyright: May 23, 2025

From the journal Poznan Studies in Contemporary Linguistics Volume 61 Issue 2

Abstract

The lexemes ‘fruit’ and ‘stone’ are known as the origins of the numeral classifiers for small round objects in many Tibeto-Burman languages. This paper employs a correlation-based network construction method to investigate the colexification networks of the two concepts in 58 and 68 Tibeto-Burman languages, respectively. A total of 104 concepts colexified with ‘fruit’ and 99 concepts colexified with ‘stone’ are organized into macro semantic classes. Semantic networks based on the similarities in colexification patterns of concepts, as well as language networks based on the similarities in colexification patterns of languages, are constructed for ‘fruit’ and ‘stone’, respectively. The results indicate that classifiers for small round objects that evolved from either ‘fruit’ or ‘stone’ are directly colexified with class terms in compound nouns denoting varieties of fruits/stones and the shape class of small round objects, indicating that they are diachronically related. However, ‘fruit’ and ‘stone’ differ significantly in their modes of deriving a classifier. Moreover, languages that have developed classifiers from ‘fruit’ are mostly from the Ngwi subgroup, whereas languages whose classifiers are colexified with ‘stone’ evolved independently.

Keywords: Tibeto-Burman; numeral classifier; semantic evolution; network; colexification

1 Introduction: the etymology of classifiers for small round objects

Languages from the Tibeto-Burman (TB) family vary in the presence/absence and the degree of grammaticalization of numeral classifier systems (Jiang 2009). Some subgroups of TB languages are known for the scarcity of numeral classifiers (e.g. Bodish and Kuki-Chin-Naga), while other subgroups (e.g. Karenic, Baic, Burmic) have full-fledged classifier systems. Numeral classifiers were not part of the proto-Sino-Tibetan language, but evolved individually in quite a few of the languages in the family (LaPolla 2017: 46, cf. Xu 1987, 1989; Dai 1994, 1997a, 1997b). Classifiers are not reconstructed for Proto-Tibeto-Burman (PTB) either (Matisoff 2003). It is thus important to know when and how numeral classifiers developed in this family, as well as what caused the differences in the degree of grammaticalization.

As in many nearby East and Southeast Asian languages, Tibeto-Burman numeral classifiers are mostly derived from nouns (Aikhenvald 2000, 2022; Bisang 1996; DeLancey 1986). One highly frequent type of classifier, attested in almost every classifier language of this family, is the widespread classifier for small round or 3-dimensional objects (hereafter 3D-classifier), which derives from different etymological sources. Despite the absence of a proto-form in PTB, this classifier has been hypothesized to exist in many proto-languages of different Tibeto-Burman subgroups (e.g. Wood 2008; Ebert 1994; Kazuyuki 2009; Malla 1990; Genetti 2007, 2017; Sun 1993; Post and Sun 2017; Bradley 1979, 2012; Hansson 2017; Luangthongkum 2013, inter alia).

The etymology of 3D-classifiers in TB is complex. They may have evolved from the concepts of ‘fruit’ (Proto-Bodo-Garo and Proto-Ngwi, cf. Wood 2008; Bradley 1979), ‘stone’ (Proto-Bodo-Garo and Proto-Karen, cf. Wood 2008; Luangthongkum 2013), ‘egg/testicle’ (Classic Newar and Burmic, cf. Kazuyuki 2009; Bradley 2012), ‘round things’ (Proto-Burmic, cf. Bradley 2012), and the affix meaning ‘mother/female’ (Ngwi, Zhang 2016), among others. In the grammaticalization of classifiers, nominal compounding is an important historical stage (Aikhenvald 2022; Bisang 1993; DeLancey 1986). Among those concepts, lexemes with the lexical meanings ‘fruit’ and ‘stone’ are identified as two common sources of 3D-classifiers crosslinguistically, including in many Southeast Asian languages, and ‘fruit’ may further develop into a general classifier that classifies all inanimate nouns (Aikhenvald 2000, 2021).

Matisoff (2003) has reconstructed two etyma for the lexeme ‘fruit’ in Proto-Tibeto-Burman (PTB), namely *sey ‘fruit/rose/round object’ (#1019^[1]) and *b-ras ‘rice/fruit/bear fruit/round object’ (#2071). Both etyma also mean ‘round objects’. *sey is the dominant proto-form for ‘fruit’ throughout the Tibeto-Burman family, while *b-ras is only found in Tibetan, Northern and Central Chin,^[2] and a few Bodo and rGyalrong languages.

The morpheme ‘fruit’ often appears as a “class term” (CT) (DeLancey 1986) in compound nouns denoting various kinds of fruits. Fruit names containing a ‘fruit’ CT mostly denote fruits with a round shape, including apple, fig, grape, lime, mango, melon, peach, pear, persimmon, banana, pomegranate, tangerine, and nuts, among others. Table 1 displays languages that employ the CT derived from the PTB *sey ‘fruit’ in compounds denoting fruits.

Table 1:

CT ‘fruit’ < PTB *sey ‘fruit/rose/round object’.

Subgroup	Language	Word form	Gloss	Source
Ngwi	Lisu	si ³⁵ sɯ ³¹	‘fruit’	Bradley (2012)
		tɕhe̱ ³¹ le̱ ³¹ sɯ ³¹	‘grape’
		sɯ ³¹ lɯ̱ ³³ bɯ ³³	‘persimmon’
Burmish	Longchuan Achang	ʂə ³¹	‘fruit’	Huang (1992); Hill and List (2017)
Burmish	Longchuan Achang	ʂə ³¹ om ³¹	‘peach’	Huang (1992); Hill and List (2017)
Deng	Yidu	ɹuŋ ⁵⁵ ɕi ⁵⁵	‘fruit’	Huang (1992)
		ɑ ³¹ jim ⁵⁵ ɕi ⁵⁵	‘grape’
		ɑ ⁵⁵ mu ⁵⁵ ɕi ⁵⁵	‘peach’
Kiranti	Bahing	siː *tsi*	‘fruit’	Michailovsky (1989)
		gramu *tsi*	‘banana’
		khɔmal *tsi*	‘peach’
Jingpho	Jingpho	si ³¹ ;nam ³¹ si ³¹	‘fruit’	Huang (1992)
Jingpho	Jingpho	să ⁵⁵ pjiʔ ⁵⁵ si ³¹	‘grape’	Huang (1992)
Bodo-Garo	Garo	bi- te	‘fruit’	Burling (2003)
		te •-rik	‘banana’
		te •-ga-chu	‘mango’

The etymon *sey is also frequently found in compounds referring to seeds/grains, body parts, small animals, and other small inanimate objects with the round shape. Table 2 presents languages employing the PTB etymon *sey ‘fruit’ in compounds referring to small and round objects.

Table 2:

CT ‘small round object’ < PTB *sey ‘fruit/rose/round object’.

Subgroup	Language	Word form	Gloss	Source
Ngwi	Lüchun Hani	si ³¹	‘fruit’	Huang (1992)
		tshe ⁵⁵ si ³¹	‘rice (unhusked)’
		ɣø ³¹ si ³¹	‘kidney’
		phe ⁵⁵ si ³¹	‘button’
Kiranti	Bahing	si	‘fruit’	Michailovsky (1989)
		nœgat si	‘ear’
		nam si	‘grain’
		mo si	‘hail’
Burmish	Rangoon Burmese	ɑ ⁵³ *tθi* ⁵⁵	‘fruit’	Huang (1992)
Burmish	Rangoon Burmese	mo ⁵⁵ *tθi* ⁵⁵	‘hail’	Huang (1992)
Nungic	Dulong	ɕiŋ ⁵⁵ ɕi ⁵⁵ ;ɕi ⁵³	‘fruit’	Huang (1992)
Nungic	Dulong	tɯ³¹ ɕi⁵⁵	‘gall bladder’	Huang (1992)
Jingpho	Jingpho	si ³¹ ;nam ³¹ si ³¹	‘fruit’	Huang (1992)
		tsoʔ ³¹ si ³¹	‘key’
		tiŋ ³¹ si ³¹	‘bell’
Bodic	Motuo Menba	se	‘fruit’	Huang (1992)
Bodic	Motuo Menba	toŋ toŋ se	‘uvula’	Huang (1992)
Bodo-Garo	Garo	bi- te	‘fruit’	Burling (2003)
Bodo-Garo	Garo	sil- te	‘hail’	Burling (2003)

The etymon *sey may develop into a sortal classifier for small and round objects or into a mensural classifier, mostly in languages of the Ngwi and Burmish sub-branches. Examples in Table 3 illustrate that the Lüchun Hani cognate si³¹ ‘fruit’ (< PTB *sey) and the Mojiang Hani cognate ɕi³¹ ‘fruit’ can serve to classify small round objects like eggs, stones, grains of rice, and bowls. The Zaiwa (Atsi) cognate ʃi²¹ ‘fruit’ is used as a standard measure classifier meaning ‘fingersbreadth’; ‘fruit’ in the Lüchun and Mojiang dialects of Hani can be used as a measurement of time, meaning ‘month’s (work)’.

Table 3:

Sortal/mensural classifier < PTB *sey ‘fruit/rose/round object’.

Subgroup	Language	Word form	Gloss	Source
Ngwi	Lüchun Hani	si ³¹	‘fruit’	Huang (1992)
	Lüchun Hani	si ³¹	‘CL:eggs/stones’ ‘CL:month’s work’	Huang (1992)
	Mojiang Hani	ɔ ³¹ ɕi ³¹	‘fruit’	Huang (1992)
	Mojiang Hani	ɕi ³¹	‘CL:eggs/grain of rice’ ‘CL:month’s work’	Huang (1992)
Burmish	Zaiwa (Atsi)	ʃi²¹	‘fruit’	Huang (1992)
Burmish	Zaiwa (Atsi)	ʃi²¹	‘CL:fingersbreadth’	Huang (1992)

The etymon *b-ras, referring to ‘fruit’ and ‘rice’, may appear in the root of both concepts, as in Old Tibetan (Tibetan) (e.g. ɦbras bu ‘fruit’ and ɦbras ‘rice’) (Sun 1991; Huang 1992). However, unlike *sey, this etymon has limited productivity in deriving nouns in Old Tibetan. Except for ‘rice’, none of the aforementioned shape classes contain the morpheme ɦbras. Lushai is the other language that has the etymon *b-ras, but the Lushai rah ‘fruit’ did not derive any nouns referring to small round objects (cf. Bhaskararao 1996; VanBik 2009).

It should be noted that there exists another etymon *si(ŋ/k) (#2658) in PTB, glossed as ‘tree/wood/firewood’ (Matisoff 2003), that is colexified with ‘fruit’ in several subgroups of TB, as presented in Table 4. The lexeme ‘fruit’ is frequently found in compound nouns referring to plants/plant parts and the related wood and wood products. For example, the Bahing and Hayu cognate si is most plausibly a root meaning ‘plant’, which is found in concepts referring to ‘fruit’, ‘tree’, and plant parts. The root sî ‘tree’ in Old Burmese also appears in the root for ‘fruit’, marking ‘fruit’ as a part of a plant. In Xide Yi and Old Tibetan, the cognates sɿ̄³³ and ɕiŋ appear in nouns meaning ‘fruit’, ‘tree’, and wood products. It is posited that *si(ŋ/k) is a general class term for ‘plant’ that covers both ‘fruit’ and ‘tree’. It was narrowed down to ‘fruit’ in some languages (e.g. Yi), but in others it remained a morpheme meaning ‘plant’ (Qumutiexi 2010).

Table 4:

Colexification of PTB *sey ‘fruit/rose/round object’ and PTB *si(ŋ/k) ‘tree/wood/firewood’

Subgroup	Language	Word form	Gloss	Source
Kiranti	Bahing	si	‘fruit’; ‘tree’	Michailovsky (1989)
		dhɛk si	‘tree’
		toː si	‘pine tree’
		prypt si	‘bud’
	Hayu	si	‘fruit’	Michailovsky (1989)
		dõː si	‘plant’
		kok si	‘fodder tree’
Burmish	Old Burmese	ə- sî	‘fruit’	Benedict (1976)
Burmish	Old Burmese	sî	‘kind of tree’	Benedict (1976)
Ngwi	Xide Yi	*sɿ̄* ³³ dzɑ ³³ lu̱ ³³ mɑ ³³	‘fruit’	Sun (1991); Huang (1992)
		*sɿ̄* ³³ bo ³³	‘tree’
		*sɿ̄* ³³ ḷ(u̱) ³³ sɿ̄ ³³ tɕe ³³	‘wood’
		*sɿ̄* ³³ phi ²¹	‘plank/board’
Tibetan	Old Tibetan	*ɕiŋ* tog	‘fruit’	Sun (1991)
		*ɕiŋ* ɦdzer	‘wedge’
		*ɕiŋ*	‘wood’
		gȵaɦ *ɕiŋ*	‘yoke’

Three etyma are reconstructed for ‘stone’ in Matisoff (2003): *r-lu(ŋ/k) (#1269), *b-rak (#2166), and *suaŋ (#4677). *r-lu(ŋ/k) prevails throughout the Tibeto-Burman family. *b-rak is a salient feature of Central and Southern Ngwi.^[3] It is also a common etymon of ‘stone/rock’ in many Tibetan languages. In addition to Ngwi and Tibetan, this proto-form is found in a few languages from the subgroups of Tani, Garo, Jingpho, Tibeto-Kanauri, Kiranti, rGyalrong, Nungic, and Tujia. *suaŋ is much rarer and is only attested in Peripheral Chin.

The morpheme ‘stone’ regularly occurs as a class term (CT) in compound nouns referring to different varieties of rocks and stones, ranging from rocks and stones in nature (e.g. boulder, cave, cliff, pebble, coral, limestone) to products made of stone/rock (e.g. millstone, whetstone, hearth-stone, wall, flight of steps). The etymon *r-lu(ŋ/k) (#1269) is the most productive form found in compounds for varieties of rocks and stones, and *b-rak (#2166) is much less frequent. Table 5 gives examples from different subgroups that contain the CT ‘stone’ in compounds for various types of stones and rocks.

Table 5:

CT ‘stone’ < PTB *r-lu(ŋ/k)/*b-rak ‘stone/rock’.

Etymon	Subgroup	Language	Word form	Gloss	Source
*r-lu(ŋ/k)	Western Tani	Bokar	ɯ *lɯŋ*	‘stone’	Huang (1992); Sun (1993)
			*lɯŋ* -duŋ	‘boulder/huge rock/cliff’
			*lɯŋ* pɯk	‘cave(mountain)’
			*lɯŋ* -reː	‘pebble’
	Nungic	Dulong	luŋ ⁵⁵	‘stone’	Huang (1992)
			ɑ ³¹ pɹɑʔ ⁵⁵ *luŋ* ⁵⁵	‘rock’
			*luŋ* ⁵⁵ pɑŋ ⁵⁵	‘cave/hole’
			tɕɑ ³¹ mɑʔ ⁵⁵ *luŋ* ⁵⁵	‘flint’
	Ngwi	Lisu	lo̱ ³³	‘stone’	Huang (1992)
			ɕo ³¹ *lo̱* ³³	‘coral’
			tɕhɛ ³⁵ dʑɯ̱ ³¹ *lo̱* ³³	‘flint’
			*lo̱* ³³ kho ³¹	‘valley’
*b-rak	Tujia	Tujia	ɣa ²¹ (pa ²¹ )	‘stone’	Sun (1991)
	Tujia	Tujia	ɣa ²¹ kho ²¹	‘rock/cliff’	Sun (1991)
	Ngwi	Lancang Lahu	xɑ ³⁵ pɯ ³³ ɕi ¹¹	‘stone’	Huang (1992)
			xɑ ³⁵ pɯ ³³	‘rock’
			xɑ ³⁵ pɯ³³ go³³	‘flight of steps’
			xɑ ³⁵ tshi ³³	‘cliff’
			ɑ ³¹ mi ¹¹ xɑ ³³ pɯ ³³	‘flint’

The etymon for ‘stone’ also frequently appears in compound nouns referring to small round animals, body parts, and inanimate objects. The etymon *r-lu(ŋ/k) (#1269) is the most productive etymon found in this type of compound, whereas *b-rak (#2166) is less productive in deriving nouns that denote small round objects (Table 6).

Table 6:

CT ‘small round object’ < PTB *r-lu(ŋ/k)/*b-rak ‘stone/rock’.

Etymon	Subgroup	Language	Word form	Gloss	Source
*r-lu(ŋ/k)	Western Tani	Bokar	ɯ *lɯŋ*	‘stone’	Sun (1993)
			jup- *lɯŋ* -ki-bo	‘caterpillar’
			*lɯŋ* guŋ	‘neck/throat’
			*lɯŋ* ɕuk	‘trivet’
	Nungic	Dulong	luŋ ⁵⁵	‘stone’	Huang (1992)
			*luŋ* ⁵⁵ dʑin ⁵³	‘ginger’
			ɑm ⁵⁵ *luŋ* ⁵⁵	‘(unhusked) rice’
			nɯ ³¹ lɛŋ ³¹ *luŋ* ⁵⁵	‘testicle’
	Kiranti	Limbu	luŋ	‘stone’	Michailovsky (1989)
			*luŋ* si	‘maggot’
			*luŋ* ma	‘heart’
	Ngwi	Lisu	lo̱ ³³	‘stone’	Sun (1991); Huang (1992)
			bo̱ ³¹ *lo̱* ³³	‘ant’
			o ⁵⁵ go̱ ³¹ *lo̱* ³³	‘pillow’
			po ⁴⁴ lo ⁴⁴	‘bullet’
*b-rak	Tibetan	Batang Tibetan	tshɑʔ ⁵³	‘pit/stone’	Huang (1992)
			tshɑʔ ⁵³	‘sieve / sifter’
			tɕa ¹³ *tshɑʔ* ⁵³ gẽ ⁵⁵ mo ⁵³	‘locust’
	Ngwi	Lancang Lahu	xɑ ³⁵ pɯ ³³ ɕi ¹¹	‘stone’	Huang (1992)
			xɑ ³⁵ pɯ ³³ qɑ ¹¹	‘turtledove’
			xɑ ³⁵ pɯ ³³ ɕi ³⁵ ɣɤ ²¹	‘coal’

Like ‘fruit’, the etymon *r-lu(ŋ/k) (#1269) ‘stone’ has derived a handful of numeral classifiers for small round objects in several Tibeto-Burman languages, as shown in Table 7. However, Burmic languages (i.e. mostly Ngwi) did not derive 3D-classifiers from ‘stone’ but rather from ‘fruit’. ‘Stone’ and ‘fruit’ are in complementary distribution in forming numeral classifiers in the Burmic subgroup. Concepts classified by a classifier that evolved from *r-lu(ŋ/k) ‘stone’ typically have a round shape, ranging from stones/rocks, eggs, and bowls to grains. No evidence in the available sources shows that the other etymon for ‘stone’, *b-rak, has derived numeral classifiers in Tibeto-Burman.

Table 7:

Sortal classifier < PTB *r-lu(ŋ/k)/PTB *b-rak ‘stone/rock’.

Subgroup	Language	Word form	Gloss	Source
Karenic	Karen	lø ³¹	‘stone’	Huang (1992)
Karenic	Karen	ph lø ³¹	‘CL:eggs/grain (of rice)/ rocks/stones’	Huang (1992)
Nungic	Dulong	luŋ ⁵⁵	‘stone’	Huang (1992)
	Dulong	luŋ ⁵⁵	‘CL:eggs/grain (of rice)/ rocks/stones’	Huang (1992)
	Nung	ḷuŋ ⁵⁵	‘stone’	Huang (1992)
	Nung	(thi ³¹ ) *ḷuŋ* ⁵⁵	‘CL:eggs/rocks/stones’	Huang (1992)
Bodo-Garo	Garo	roŋ	‘stone’	Benedict (1972); Wood (2008)
		*roŋ* -brak	‘rock’
		*roŋ-*	‘CL: round objects’

Both etyma for ‘stone’ may derive nouns denoting human beings and gods. They are found in a handful of lexical forms for adult, man, girl, and grandchild. For example, the morpheme lʊ̃ ‘stone’ (< PTB *r-lu(ŋ/k)) is seen in lʊ̃ːtso ‘man/male’ in Hayu (Kiranti) (Michailovsky 1989); both ‘stone’ and ‘man’ are lụ in Tangut (rGyalrongic, Li 1997). A similar extension has been attested in Lyuzu, Naxi, and Limbu (cf. Huang 1992; Michailovsky 1989). The occurrence of *b-rak in nouns referring to humans/gods is found in Batang Tibetan (e.g. tshɑʔ⁵³ ‘stone’, mba¹³ tshɑʔ⁵³ ‘person with pockmarked face’, cf. Huang 1992) and Lancang Lahu (e.g. xɑ³⁵ ‘stone’, xɑ³⁵ ɯ¹¹ phɑ⁵³ ‘adult’, ʑa⁵³ mi⁵³ xɑ³⁵ ‘girl’).

Occasionally, the etyma for ‘stone/rock’ are found in nouns encoding directions and predicates encoding process/property. For example, lɯm⁵⁵ ‘stone’ in Taraon Darang (Deng) is a part of the compound lɯm⁵⁵ koŋ⁵⁵ ‘inside’ (Sun 1991); in Old Burmese (Burmish), the noun kyok ‘stone’ is also used as the verb for ‘kick’ or ‘push off (boat)’ (Benedict 1976; Hansson 1989).

From the data presented above, we may conclude that although ‘fruit’ and ‘stone’ were etymologically distinct nouns, they seem to have evolved into the same shape-based numeral classifier via highly similar paths of semantic extension. Nevertheless, the above generalizations are primarily based on empirical data collected from individual languages. It is not known to what extent patterns of semantic extension from the same noun origin converge in languages with and without such a classifier, or under what conditions a noun for ‘fruit’ or ‘stone’ will develop into a classifier. This study addresses two questions in this respect. The first concerns the mechanism behind the evolution of numeral classifiers: is there any salient path of semantic extension associated with the derivation of a classifier for 3-dimensional objects from ‘fruit’ and ‘stone’? Second, if so, how effective is it in predicting the presence/absence of a classifier in a particular language? Tibeto-Burman languages diverge widely in the grammaticalization of classifiers, making them an ideal object of study for examining the similarity/divergence in the semantic extension of a concept and its outcome.

With this research objective, the semantic network approach (§2.2) is used to construct the colexification networks (Jackson et al. 2019) of the lexemes ‘fruit’ and ‘stone’ in over 60 Tibeto-Burman languages. By examining the structure of the colexification networks of ‘fruit’/‘stone’, we aim to 1) identify any semantic colexification patterns of ‘fruit’/‘stone’ that are indicative of the development of a classifier in Tibeto-Burman languages from distinct subgroups, and 2) evaluate the effectiveness of the colexification patterns in explaining the presence/absence of a classifier in a particular language.

In the sections below, we will first outline the methods in §2, including the data sampling procedure and a network-based approach that generates the colexification networks of ‘fruit’ and ‘stone’. The result of the network analysis will be presented in §3. The crosslinguistic colexification patterns of ‘fruit’ and ‘stone’ will be carefully examined in this section. Specific paths of semantic extension from nouns to 3D-classifiers in each concept network will be inspected in §4. In §5, we will discuss the common path of semantic extension from a noun to a numeral classifier, as well as distinct colexification patterns governing the derivation of a classifier from ‘fruit’ and ‘stone’. §6 concludes the paper.

2 Methods

2.1 Dataset

Samples of 58 and 68 languages are collected on the basis of Sagart et al. (2019) to study the colexification patterns of ‘fruit’ and ‘stone’, respectively (see Appendix 1). Data collection and curation proceeded in the following steps:

1. By querying the concepts ‘the fruit’ and ‘the stone’ in the database of Sagart et al. (2019)^[4] (https://dighl.github.io/sinotibetan/), two lists of languages containing the concepts ‘the fruit’ and ‘the stone’ in a sample of 117 Tibeto-Burman languages are compiled. The two lists comprise 19 cognate sets involving the lexeme ‘the fruit’ and 26 cognate sets involving the lexeme ‘the stone’ within the Tibeto-Burman family. They are used as the starting points of the analysis. In total, 104 languages are identified as containing the concept ‘the fruit’ and 116 languages as containing the concept ‘the stone’.
2. To ensure that word forms in a particular semantic network are derived from the same etymon, we use the database of STEDT^[5] (Matisoff 2015) (https://stedt.berkeley.edu/~stedt-cgi/rootcanal.pl) as the reference point.^[6] Based on the word forms for the concepts ‘the fruit’ and ‘the stone’ in the two lists of languages, we searched STEDT for words containing the morpheme(s) glossed as ‘fruit’ and ‘stone’ in each sampled language. Concepts that are colexified with ‘fruit’ and ‘stone’ in all sampled languages are gathered through this procedure. Since compounding is a productive strategy of word formation in the Tibeto-Burman family, the lexemes ‘fruit’ and ‘stone’ can be monosyllabic or disyllabic. The selection criterion is as follows: if a word shares at least one syllable with the word ‘the fruit’/‘the stone’, it is included in the dataset for further analysis. By way of illustration, to compile the colexified word forms of ‘fruit’ in Tibetan (Batang), we first search for the word(s) glossed as ‘fruit’ in STEDT. ‘Fruit’ in this language is a compound xhĩ⁵⁵thoʔ⁵³ in which the first morpheme xhĩ⁵⁵ is derived from the etymon ‘TREE/WOOD/FIREWOOD’ (#2658) in PTB. Word forms in STEDT containing either xhĩ⁵⁵ or thoʔ⁵³ are then obtained. Table 8 shows the word forms containing the form xhĩ⁵⁵ in Tibetan (Batang).

Table 8:

Word forms colexified with xhĩ⁵³ in Tibetan (Batang).

3. The proto-form and the etymon tag of the colexified lexical forms of ‘fruit’ and ‘stone’ are manually annotated for each language on the basis of the etymon tag (e.g. #2658 in Table 8) and the corresponding proto-form in STEDT.
4. Among the languages initially collected in step 1, languages in which the cognate set of ‘fruit’ and ‘stone’ is undetermined (Cog ID = 0) are excluded, except for languages with apparent cognates (e.g. Ngwi-Burmish languages all have the cognate of *sey for ‘fruit’). Consequently, the sample contains 58 languages for ‘fruit’ and 68 languages for ‘stone’.
5. Since a colexification network concerns only polysemy (i.e. senses that are related), it is necessary to filter out homonyms (i.e. senses that are unrelated). A practical solution is to filter out spurious links in a colexification network, which typically appear in only one or two languages (Di Natale and Garcia 2023; List et al. 2018; Rzymski et al. 2020). For this reason, we first sort the concepts by frequency in the language sample and remove the concepts that occur in only one language in the entire dataset.
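The shared-syllable criterion and the frequency filter described in the steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' pipeline: the function names `shares_syllable` and `filter_rare_concepts` are hypothetical, word forms are assumed to be pre-segmented into space-separated syllables with tones written as plain digits, and the three-language dataset is invented toy data.

```python
from collections import Counter


def shares_syllable(target, candidate):
    """Return True if two word forms share at least one syllable.

    Assumption: syllables are separated by single spaces.
    """
    return bool(set(target.split()) & set(candidate.split()))


def filter_rare_concepts(colex_by_language, min_languages=2):
    """Drop concepts attested in fewer than `min_languages` languages."""
    counts = Counter(
        concept
        for concepts in colex_by_language.values()
        for concept in set(concepts)
    )
    return {
        lang: sorted(c for c in set(concepts) if counts[c] >= min_languages)
        for lang, concepts in colex_by_language.items()
    }


# Simplified Lisu forms from Table 1 (diacritics and tone letters reduced).
fruit = "si35 sɯ31"
grape = "tɕhe31 le31 sɯ31"
print(shares_syllable(fruit, grape))

# Invented toy data: concepts colexified with 'fruit' in three languages.
toy = {
    "LangA": ["grape", "peach", "button"],
    "LangB": ["grape", "peach"],
    "LangC": ["grape", "hail"],
}
filtered = filter_rare_concepts(toy)
print(filtered)
```

In the toy data, 'button' and 'hail' each occur in a single language and are therefore removed, mirroring the removal of concepts attested only once.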

Table I and Table II in Appendix 1 present the sampled languages and their subgroups, the word forms of ‘fruit’ and ‘stone’ in each language, the cognate set ID and the etymon tag in STEDT of each word form, and the number of concepts that have the same form as ‘fruit’ and ‘stone’ (i.e. colexify) in a particular language.

2.2 A network-based methodology

Network-based approaches, increasingly popular due to recent advancements in computational science and graph theory, have found wide application in diverse fields such as physics, psychology, and knowledge engineering (Castro and Siew 2020; Kenett and Faust 2019; Siew 2020). In semantics, these methods have proven especially beneficial. A semantic network, as outlined by Steyvers and Tenenbaum (2005), consists of nodes representing words/concepts, and links portraying the relationships between them. Semantic networks constructed within or across languages have thus facilitated investigations into a range of phenomena including semantic changes across the lifespan and bilingualism (Borodkin et al. 2016; Wulff et al. 2022), organization of nouns and verbs in the mental lexicon (Qiu et al. 2021), and cross-linguistic colexification (List et al. 2013).

There are several benefits to using semantic networks to analyze word meaning. First, network representation provides a powerful tool for visualizing semantic relations and dynamics, allowing researchers to easily identify patterns and structures that may not be apparent in other forms. In addition, network analysis techniques can help identify important properties and behaviors of semantic networks mathematically using graph theory. For example, network analysis can reveal the properties of individual nodes, as well as clusters, sub-networks, and the overall network. One of the most important node-level properties is the degree, which refers to the number of links a node has. Words/concepts with a high degree are highly connected to other words, and thus have a greater influence on the overall structure of the semantic network. Important cluster/network-level properties include the clustering coefficient (CC; a measure of the local density of links) and average shortest path length (ASPL; the average number of steps along the shortest paths for all pairs of nodes). When semantic networks exhibit high CC and low ASPL, they are considered to have a “small-world” structure. This type of structure is characterized by highly connected clusters of nodes, or “communities”, that are themselves relatively well connected to one another. At the same time, the network as a whole maintains short path lengths between any two nodes, allowing for efficient communication and the spread of information across the network (for a review, see Siew et al. 2019).
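To make the three measures concrete, the following Python sketch computes degree, CC, and ASPL on a four-node toy network. It rests on assumptions that the text does not settle: CC is taken here as the average local clustering coefficient (nodes with fewer than two neighbours contribute 0), ASPL is averaged over all reachable pairs, and the tiny network is invented for illustration.

```python
from collections import deque
from itertools import combinations


def degrees(adj):
    """Number of links attached to each node."""
    return {node: len(nbrs) for node, nbrs in adj.items()}


def clustering_coefficient(adj):
    """Average local clustering coefficient of an undirected graph."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue  # such nodes contribute 0 by convention
        links = sum(1 for a, b in combinations(sorted(nbrs), 2) if b in adj[a])
        total += 2 * links / (k * (k - 1))
    return total / len(adj)


def average_shortest_path_length(adj):
    """Mean breadth-first-search distance over all reachable node pairs."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


# Invented toy network: a triangle plus one pendant node.
toy = {
    "fruit": {"peach", "grape"},
    "peach": {"fruit", "grape"},
    "grape": {"fruit", "peach", "hail"},
    "hail": {"grape"},
}
print(degrees(toy))
print(clustering_coefficient(toy))        # 7/12
print(average_shortest_path_length(toy))  # 4/3
```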

2.2.1 Network estimation method

To compute the colexification networks of ‘fruit’ and ‘stone’, we utilized a correlation-based network construction approach. The foundation of this approach is that meaningful semantic relationships can be quantified using measures such as correlation coefficients or cosine similarity, providing a numerical representation of the inherent organization and structure of concepts. This method has been successfully applied in the analysis of semantic relations from verbal fluency data, where each concept is represented as either produced (‘1’) or not produced (‘0’) across a set of participants (Borodkin et al. 2016; Siew and Guru 2023).

Similarly, for our study, we first created separate binary response matrices for ‘fruit’ and ‘stone’. Each column in the matrix denotes a colexified concept, each row represents a language from the Tibeto-Burman family, and each cell indicates whether the concept is present (‘1’) or absent (‘0’) in the respective language. We then calculated a symmetric association matrix by determining the cosine similarity between every pair of concepts.^[7]
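As an illustration of this step, the sketch below computes the cosine similarity between every pair of concept columns in a small binary language-by-concept matrix. The matrix is invented toy data and the function name `cosine_similarity_matrix` is a hypothetical helper; the authors' own analysis was carried out in R.

```python
import math


def cosine_similarity_matrix(matrix):
    """Cosine similarity between all pairs of columns.

    `matrix` is a list of rows (languages); each row lists 0/1 values
    for the concepts (columns). All-zero columns get similarity 0.
    """
    n_cols = len(matrix[0])
    cols = [[row[j] for row in matrix] for j in range(n_cols)]
    norms = [math.sqrt(sum(x * x for x in col)) for col in cols]
    sim = [[0.0] * n_cols for _ in range(n_cols)]
    for i in range(n_cols):
        for j in range(n_cols):
            if norms[i] and norms[j]:
                dot = sum(a * b for a, b in zip(cols[i], cols[j]))
                sim[i][j] = dot / (norms[i] * norms[j])
    return sim


# Invented toy data: 4 languages (rows) x 3 concepts (columns).
concepts = ["peach", "grape", "hail"]
rows = [
    [1, 1, 0],
    [1, 1, 0],
    [1, 0, 1],
    [0, 0, 1],
]
sim = cosine_similarity_matrix(rows)
for name, values in zip(concepts, sim):
    print(name, [round(v, 3) for v in values])
```

Here 'peach' and 'grape' co-occur in two languages and receive a high similarity, whereas 'grape' and 'hail' never co-occur and receive 0, so they would not be linked in the initial weighted network.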

Following this, we constructed a weighted and undirected network by connecting concepts that had non-zero cosine similarity values. The Triangulated Maximally Filtered Graph (TMFG; Massara et al. 2017) method was then applied to eliminate links with weaker weights (lower cosine similarity values), while preserving the network’s triangulated nature. The TMFG method effectively removes edges that do not contribute to the maximum degree sequence or to the network’s triangulated property, keeping the crucial structural properties of the original network intact.
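The greedy idea behind TMFG can be sketched as follows. This is a simplified illustration, not the NetworkToolbox implementation used in the study: the seed tetrahedron is chosen here as the four nodes with the largest total weight (an assumption made for brevity), ties are broken by iteration order, and the 5 x 5 weight matrix is invented. Applied to a complete weight matrix, the procedure retains 3(n - 2) edges for n nodes.

```python
from itertools import combinations


def tmfg(weights):
    """Greedy TMFG-style filtering of a symmetric weight matrix.

    Returns the retained edges as a set of frozensets {i, j}.
    """
    n = len(weights)
    if n < 4:
        raise ValueError("at least four nodes are required")
    # Seed tetrahedron: four nodes with the largest total weight (simplification).
    strength = [sum(row) - row[i] for i, row in enumerate(weights)]
    seed = sorted(range(n), key=lambda i: -strength[i])[:4]
    edges = {frozenset(pair) for pair in combinations(seed, 2)}
    faces = [frozenset(tri) for tri in combinations(seed, 3)]
    remaining = set(range(n)) - set(seed)
    while remaining:
        # Choose the (node, triangular face) pair with the largest summed weight.
        _, node, face = max(
            (
                (sum(weights[v][u] for u in f), v, f)
                for v in sorted(remaining)
                for f in faces
            ),
            key=lambda item: item[0],
        )
        # Insert the node into the face: three new edges, three new faces.
        faces.remove(face)
        for u in face:
            edges.add(frozenset((node, u)))
        for pair in combinations(face, 2):
            faces.append(frozenset(pair) | {node})
        remaining.remove(node)
    return edges


# Invented symmetric similarity matrix for five concepts.
W = [
    [0.0, 0.9, 0.8, 0.7, 0.1],
    [0.9, 0.0, 0.6, 0.5, 0.2],
    [0.8, 0.6, 0.0, 0.4, 0.3],
    [0.7, 0.5, 0.4, 0.0, 0.2],
    [0.1, 0.2, 0.3, 0.2, 0.0],
]
kept = tmfg(W)
print(sorted(tuple(sorted(e)) for e in kept))
```

In this toy case the weakest link (between nodes 0 and 4) is the only one of the ten possible edges that is dropped.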

Lastly, the resulting TMFG network was simplified further to an unweighted and undirected network by converting all edge weights to 1, which facilitates easier analysis and interpretation. Network estimation and analyses were conducted using the igraph (v1.3.5; Csárdi and Nepusz 2006) and NetworkToolbox (v1.4.2; Christensen 2018) packages in R (v4.2.3; R Core Team 2023). The implementation is publicly accessible on our GitHub repository at https://github.com/mengyangq/semantic_evolution.

2.2.2 Network measures

To evaluate the structural properties of the two colexification networks, we computed common node-level and network-level measures, including the degree distribution, average degree, clustering coefficient (CC), and average shortest path length (ASPL). We compared the CC and ASPL for each network against the same network measures obtained from 1,000 Erdős-Rényi random network simulations (Erdős and Rényi 1960) with the same number of nodes and edges as the colexification networks. A small-world structure is indicated by a CC that is much larger than CC_random, and an ASPL that is similar to or slightly larger than ASPL_random. We also conducted a one-sample z-test for the two measures to assess whether the colexification networks were structurally meaningful and significantly different from their corresponding random network simulations.
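The comparison against random networks can be illustrated with the sketch below. It makes several assumptions that the text leaves open: random graphs are drawn from the G(n, m) variant of the Erdős-Rényi model (fixed numbers of nodes and edges), the z-score is computed as the observed value minus the simulated mean, divided by the simulated standard deviation, and only CC is shown. The toy graph, the default of 200 simulations, and the function names are illustrative choices rather than the study's settings.

```python
import random
import statistics
from itertools import combinations


def average_clustering(adj):
    """Average local clustering coefficient (degree < 2 counts as 0)."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k >= 2:
            links = sum(1 for a, b in combinations(sorted(nbrs), 2) if b in adj[a])
            total += 2 * links / (k * (k - 1))
    return total / len(adj)


def random_graph(n, m, rng):
    """Erdős-Rényi G(n, m): m distinct edges drawn uniformly at random."""
    adj = {v: set() for v in range(n)}
    for a, b in rng.sample(list(combinations(range(n), 2)), m):
        adj[a].add(b)
        adj[b].add(a)
    return adj


def z_score_vs_random(adj, n_sim=200, seed=1):
    """Compare the observed CC with CC values from simulated random graphs."""
    rng = random.Random(seed)
    n = len(adj)
    m = sum(len(nbrs) for nbrs in adj.values()) // 2
    sims = [average_clustering(random_graph(n, m, rng)) for _ in range(n_sim)]
    mean_random = statistics.mean(sims)
    sd_random = statistics.stdev(sims)
    observed = average_clustering(adj)
    return observed, mean_random, (observed - mean_random) / sd_random


# Invented toy graph: two 4-node cliques joined by a single edge.
edge_list = (
    list(combinations(range(4), 2))
    + list(combinations(range(4, 8), 2))
    + [(3, 4)]
)
toy = {v: set() for v in range(8)}
for a, b in edge_list:
    toy[a].add(b)
    toy[b].add(a)

observed, mean_random, z = z_score_vs_random(toy)
print(observed, mean_random, z)
```

The same pattern applies to ASPL by swapping in a path-length function, with the caveat that random graphs may be disconnected and need a convention for unreachable pairs.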

To further analyze the structure of the colexification networks, we also explored their community or cluster structure. This was achieved by utilizing the cluster_optimal function from the igraph package (Csárdi and Nepusz 2006). This function computes the optimal community structure of a graph, essentially categorizing nodes into groups or communities in a way that maximizes the measure of modularity across all possible partitions. By doing so, we can identify clusters of closely interconnected concepts within the larger network, thereby revealing additional layers of organization and offering further insights into the semantic relatedness of ‘fruit’ and ‘stone’ concepts across different languages.
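What "maximizing modularity across all possible partitions" means can be shown on a very small graph by brute force. The sketch below is an illustration only: it enumerates every partition, which is feasible for a handful of nodes but not for networks of about 100 concepts, and it is not how igraph's cluster_optimal works internally. The graph (two triangles joined by one edge) and the function names are invented for the example.

```python
def modularity(adj, membership):
    """Newman modularity Q; `membership` maps node -> community label."""
    two_m = sum(len(nbrs) for nbrs in adj.values())  # twice the edge count
    internal, degree_sum = {}, {}
    for v, nbrs in adj.items():
        c = membership[v]
        degree_sum[c] = degree_sum.get(c, 0) + len(nbrs)
        # Each within-community edge is counted once from each endpoint.
        internal[c] = internal.get(c, 0) + sum(1 for w in nbrs if membership[w] == c)
    return sum(
        internal[c] / two_m - (degree_sum[c] / two_m) ** 2 for c in degree_sum
    )


def partitions(nodes):
    """Yield every partition of `nodes` as a node -> label dictionary."""
    def extend(labels):
        if len(labels) == len(nodes):
            yield dict(zip(nodes, labels))
            return
        # Restricted growth: a new label may exceed the current maximum by 1.
        for label in range(max(labels, default=-1) + 2):
            yield from extend(labels + [label])
    yield from extend([])


def best_partition(adj):
    """Exhaustively search for the partition with the highest modularity."""
    return max(partitions(sorted(adj)), key=lambda p: modularity(adj, p))


# Invented toy graph: two triangles connected by the edge 2-3.
toy = {
    0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
    3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4},
}
best = best_partition(toy)
print(best, modularity(toy, best))
```

For this graph the search recovers the two triangles as separate communities, with Q = 5/14.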

2.2.3 Network stability

In this study, we also employed a bootstrapping approach to examine and compare the stability and structural characteristics of the ‘stone’ and ‘fruit’ networks (Borodkin et al. 2016). The analysis was grounded in generating 1,000 bootstrap iterations for each network. Each bootstrap iteration involved randomly selecting rows with replacement from the original binary matrices of the networks, followed by constructing a new network for each iteration, using the same network estimation method, as stated in §2.2.1. We then compared two key network metrics for these bootstrapped networks: CC and ASPL. To statistically compare the variability in these metrics between the two networks, we conducted Levene’s tests for homogeneity of variance.

3 Result of network analysis

3.1 Semantic classes colexified with ‘fruit’ and ‘stone’

Using the sampling procedure outlined in §2.1, 104 and 99 concepts are identified in the sampled languages that share at least one morpheme (i.e. colexify) with the lexemes ‘fruit’ and ‘stone’, respectively. Those concepts constitute the crosslinguistic colexification networks of the two nouns in Tibeto-Burman. In the initial sampling of languages, 58 languages are selected for ‘fruit’ and 68 for ‘stone’. Because some languages have no concept colexified with the target lexeme, 52 and 54 languages are eventually included in the network estimation of ‘fruit’ and ‘stone’, respectively.^[8]

Concepts colexified with ‘fruit’ are subdivided into 11 semantic classes: varieties of fruits, seeds/grains, body parts, small animals, small round objects, plants/plant parts, wood/tree, human/god, process/property, classifiers, and miscellaneous. They can be further grouped into 7 macro semantic classes.^[9] Concepts that are found in at least one language in the sample are included for analysis (see Appendix 2, Table III). Table 9 presents the 10 most frequent concepts colexified with ‘fruit’ in the 58 sampled languages.

Table 9:

The frequency rank of concepts colexified with ‘fruit’ in 58 Tibeto-Burman languages.

Concept	Number of languages	Semantic class
fruit	58	variety of fruits
bear fruit	21	process/property
peach	18	variety of fruits
persimmon	15	variety of fruits
rice (unhusked/glutinous/paddy/plant/uncooked)	14	seeds/grains
grape	11	variety of fruits
nut, seed (gen.); bead	11	seeds/grains
plantain, banana	10	variety of fruits
cucumber	9	variety of fruits
pear	9	variety of fruits

Concepts colexified with ‘stone’ are also divided into 11 semantic classes: varieties of stones, tools, fabrics, body parts, small animals, small round objects, direction, process/property, human/god, numeral classifiers, and miscellaneous.^[10] Table IV in Appendix 2 presents the concepts that are found in at least one language in the sample. Table 10 presents the 10 most frequent concepts containing the morpheme ‘stone’ in the 68 sampled languages.

Table 10:

The frequency rank of concepts colexified with ‘stone’ in 68 Tibeto-Burman languages.

Concept	Number of languages	Semantic class
pit/stone/rock	68	varieties of stones
flint (to make fire)	22	varieties of stones
coal	10	varieties of stones
maggot, worm	10	small animals
pestle (small, stone)	9	varieties of stones
cliff / rocky outcrop	7	varieties of stones
heart, liver	7	body parts
flight of steps	6	varieties of stones
wall(stone)	6	varieties of stones
cave	5	varieties of stones

3.2 Colexification networks of concepts

The network approach (§2.2) is applied to concepts colexified with ‘fruit’ (§3.2.1) and ‘stone’ (§3.2.2) to uncover how the concepts are organized in terms of their colexification similarity. The colexification networks of ‘fruit’ and ‘stone’ are shown in Figure 1.

Figure 1:

Colexification Network of ‘Fruit’ (top panel) and ‘Stone’ (bottom panel). In this visualization, varying text colors denote different semantic classes (§3.1) of concepts. Node colors signify distinct clusters within the network, illustrating how concepts are grouped based on their colexification patterns. The size of each node corresponds to the degree of the concept, with larger nodes indicating concepts that have a higher degree of colexification with other concepts.
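As a rough illustration of how a network of this kind could be assembled, the sketch below assumes that each concept is represented as a binary vector over languages (1 if the concept is colexified with ‘fruit’ in that language) and that an edge is drawn whenever the Pearson correlation between two such vectors exceeds a cutoff. The toy matrix, the cutoff of 0.6, and all function names are illustrative assumptions and do not reproduce the settings described in §2.2.

```python
from itertools import combinations
from math import sqrt

# Toy presence/absence data (hypothetical): concept -> one 0/1 value per language.
colex = {
    "pear":   [1, 1, 1, 0, 0, 0],
    "peach":  [1, 1, 1, 1, 0, 0],
    "walnut": [1, 1, 0, 0, 0, 0],
    "ankle":  [0, 0, 1, 1, 1, 0],
    "hail":   [0, 0, 0, 0, 1, 1],
}

def pearson(x, y):
    """Pearson correlation of two equal-length numeric vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant vector carries no correlation information
    return cov / sqrt(vx * vy)

def build_edges(data, cutoff=0.6):
    """Link two concepts when their colexification profiles correlate above the cutoff."""
    return sorted(
        (a, b)
        for a, b in combinations(sorted(data), 2)
        if pearson(data[a], data[b]) > cutoff
    )

def degrees(data, edges):
    """Degree of a concept = number of concepts it is linked to."""
    deg = {concept: 0 for concept in data}
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return deg

edges = build_edges(colex)
degree = degrees(colex, edges)
print(edges)
print(degree)
```

In this toy example ‘pear’ ends up with the highest degree, which is the quantity that node size encodes in Figure 1.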

Table 11 presents the network measures for the colexification networks for ‘fruit’ and ‘stone’. Results of the one-sample z-tests between the CC and ASPL of the colexification networks and their corresponding random networks demonstrated that the two colexification networks were significantly different from the simulated random networks (p’s < 0.001), suggesting that the two networks were structurally meaningful and not simply the result of chance. In addition, the two networks exhibited small-world properties (i.e., CC ≫ CC_random and ASPL ≥ ASPL_random), which was further supported by the degree distribution, as shown in Figure 2, where a few nodes have a very high number of connections, while most nodes have only a few connections.
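The comparison with random networks can be sketched as follows: simulate many random graphs with the same number of nodes and edges as the observed network, compute the clustering coefficient of each, and express the observed value as a z-score against the simulated distribution. The sketch assumes G(n, m) random graphs, the average local clustering coefficient as CC, 200 simulations, and a small invented graph; the procedure and settings actually used in the study may differ, and ASPL would be handled analogously.

```python
import random
from itertools import combinations
from statistics import mean, stdev

def to_adj(n, edges):
    """Adjacency sets for an undirected graph on nodes 0..n-1."""
    adj = {i: set() for i in range(n)}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return adj

def clustering(adj):
    """Average local clustering coefficient (nodes with degree < 2 count as 0)."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        total += 2 * links / (k * (k - 1))
    return total / len(adj)

def random_graph(n, m, rng):
    """G(n, m): m distinct edges drawn uniformly at random."""
    return to_adj(n, rng.sample(list(combinations(range(n), 2)), m))

def z_score(observed, simulated):
    """One-sample z-score of an observed value against simulated values."""
    return (observed - mean(simulated)) / stdev(simulated)

# Toy observed network (hypothetical): two triangles joined by a bridge, plus a tail.
n_nodes = 8
observed_edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5),
                  (2, 3), (5, 6), (6, 7)]
rng = random.Random(0)  # fixed seed so the sketch is repeatable
sims = [clustering(random_graph(n_nodes, len(observed_edges), rng))
        for _ in range(200)]
cc_obs = clustering(to_adj(n_nodes, observed_edges))
z = z_score(cc_obs, sims)
print(cc_obs, z)
```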

Table 11:

Parameters of the ‘fruit’ and ‘stone’ colexification networks.

	Fruit	Stone
Nodes	104	99
Edges	301	267
Average degree	5.79	5.39
CC	0.46	0.43
ASPL	4.31	3.77
CC_random	0.06***	0.05***
ASPL_random	2.81***	2.88***

***p < 0.001.
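The figures in Table 11 can be cross-checked with simple arithmetic. The sketch below recomputes the average degree as 2 × edges / nodes and, as an additional illustration that the paper itself does not report, a commonly used small-world index σ = (CC / CC_random) / (ASPL / ASPL_random). Because the inputs are the rounded table values, σ is only approximate; the dictionary layout and function names are assumptions.

```python
# Values copied from Table 11 (rounded as published).
networks = {
    "fruit": {"nodes": 104, "edges": 301, "cc": 0.46, "aspl": 4.31,
              "cc_rand": 0.06, "aspl_rand": 2.81},
    "stone": {"nodes": 99, "edges": 267, "cc": 0.43, "aspl": 3.77,
              "cc_rand": 0.05, "aspl_rand": 2.88},
}

def average_degree(nodes, edges):
    """Each undirected edge contributes to the degree of two nodes."""
    return 2 * edges / nodes

def small_world_sigma(cc, aspl, cc_rand, aspl_rand):
    """Ratio of normalised clustering to normalised path length (not reported in the paper)."""
    return (cc / cc_rand) / (aspl / aspl_rand)

summary = {}
for name, m in networks.items():
    summary[name] = {
        "avg_degree": round(average_degree(m["nodes"], m["edges"]), 2),
        "sigma": round(small_world_sigma(m["cc"], m["aspl"],
                                         m["cc_rand"], m["aspl_rand"]), 2),
    }
print(summary)
```

The recomputed average degrees match the table, and σ is well above 1 for both networks, in line with the small-world interpretation.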

Figure 2:

Relative Frequencies of Node Degrees in the ‘Fruit’ (top panel) and ‘Stone’ (bottom panel) Networks. The bar charts show the distribution of node degrees in the two colexification networks, where the x-axis (i.e. degree) represents the number of connections a node has, while the y-axis indicates the relative frequency of each degree. The skewed distribution, with a majority of nodes having few connections and a few nodes having many, is typical of small-world networks.

3.2.1 ‘Fruit’

A total of 104 concepts are identified as colexified with ‘fruit’ in 58 TB languages. Table 12 presents the 10 concepts of the highest degree.

In the colexification network of ‘fruit’ (i.e. the top panel of Figure 1), ‘pear’ (d = 15) is the highest ranked concept in degree, indicating that it is colexified with more concepts than any other concept in the network.^[11] ‘Walnut’ (d = 14), ‘peach’ (d = 12), ‘pumpkin’ and ‘cucumber’ (d = 10) are additional central concepts for varieties of fruits. Small round body parts form another class of central concepts in the network of ‘fruit’. The concepts ‘ankle’ (d = 12), ‘fingernail’ and ‘bladder’ (d = 10) all exhibit relatively high degree. The two 3D-classifiers exhibit high degree as well, suggesting that this type of classifier is colexified with many concepts in the sample.

Table 12:

Ranks of concepts in the network of ‘fruit’ in terms of degree.

Concept	Degree	Cluster
pear	15	3
CL:rocks,stones	14	4
walnut	14	3
ankle	12	1
peach	12	3
CL: eggs	11	4
fig (tree)	11	5
firewood	11	2
grind (flour)	11	6
bladder; fingernail	10	1
cucumber	10	5
pumpkin	10	3
sugarcane	10	2

Overall, the semantic network of ‘fruit’ in Figure 1 comprises six distinct semantic sub-networks. It represents a “small-world” network, and cluster membership is strongly associated with the semantic category of a concept (χ² = 122.36, df = NA,^[12] p < 0.001, Cramer’s V = 0.49). Concept nodes within each sub-network are frequently colexified, while concepts from distinct clusters are rarely colexified and the semantic relationships across sub-networks are weak. The concept ‘fruit’ is most frequently colexified with concepts in Cluster 3 and is less readily colexified with concepts from other sub-networks. Because 3D-classifiers and ‘fruit’ are distributed in distinct clusters, the derivation of a 3D-classifier from ‘fruit’ must proceed through some intermediate semantic classes.

Based on the degree of all concepts in the network, Clusters 3 and 1 are the central clusters in the entire network of ‘fruit’, containing the highest ranked concepts.

Cluster 3 is in the center of the network. Concepts for varieties of fruits predominantly occupy the central area of the entire network, including ‘pear’, ‘walnut’, ‘peach’, and ‘pumpkin’, indicating that this semantic class serves as the basic meaning. A few concepts for objects with a small round shape, including small inanimate objects (i.e. ‘bullet’, ‘hail’), seeds (i.e. ‘seed’), and body parts (i.e. ‘fist’), are closely connected with the concepts for fruits, exhibiting highly similar colexification patterns.

Cluster 1 features the prominent status of body parts and is adjacent to Cluster 3. Three concepts for small round body parts, i.e. ‘ankle’, ‘fingernail’, ‘bladder’, have high degrees. Body parts are not only colexified with many concepts in this sub-network but also have strong links with important concepts in Cluster 3 in the center (i.e. both ‘ankle’ and ‘fingernail’ have multiple links with varieties of fruits in Cluster 3).

Concepts in Cluster 4 form another major cluster in the network of ‘fruit’. Two 3D-classifiers, i.e. ‘CL:rocks, stones’ (d = 14) and ‘CL:eggs’ (d = 11), have the highest degree, suggesting that they have the most links with other concepts of this sub-network. However, ‘CL:rocks, stones’ and ‘CL:eggs’ are merely local centers of this sub-network. The entire sub-network centered around these two classifiers is somewhat split from the center and consists of less important concepts in the overall network.

The remaining three clusters (Clusters 2, 5, and 6) form relatively independent sub-networks of concepts related to ‘fruit’ that are not closely related to the semantic classes in Clusters 1 and 3.

3.2.2 ‘Stone’

A total of 99 concepts are colexified with the lexeme ‘stone’ in 68 Tibeto-Burman languages. The degree values in Table 13 reflect the significance of a concept in the network.

Table 13:

Ranks of concepts in the network of ‘stone’ in terms of degree.

Concept	Degree	Cluster
flint (to make fire)	24	8
fling/toss	14	1
coal	13	4
pillow	13	1
pit/stone/rock	13	4
ant	11	3
cloth	11	7
steel	11	2
CL:grain (of rice)	10	6
key	10	2
cliff	9	7
sand	9	8

In the colexification network of the lexeme ‘stone’ (i.e. the bottom panel of Figure 1), the concept of the highest degree is ‘flint (to make fire)’ (d = 24) in the center of the network, indicating that it is colexified with more concepts than any other concept. Other core concepts in this network include concepts for varieties of stones (i.e. ‘pit/stone/rock’, ‘coal’, ‘cliff’), 3D-classifiers (i.e. ‘CL: grain (of rice)’), small (in)animate objects (i.e. ‘ant’, ‘pillow’, ‘cloth’, ‘key’, ‘sand’), and actions (i.e. ‘fling/toss’).

Eight clusters of concepts can be distinguished. The overall network exhibits the properties of a “small-world” structure. Concepts within each cluster (a “small-world”) are well connected, while the clusters are well distinguished from one another. The cluster membership and the semantic category of a concept are moderately associated (χ² = 99.43, df = NA, p = 0.01, Cramer’s V = 0.38). As with ‘fruit’, the concept ‘stone’ and the 3D-classifiers are found in distinct clusters, implying the presence of intermediate semantic classes in the semantic extension from ‘stone’ to a 3D-classifier. However, the clustering of concepts from the same semantic class is not as strong as that of ‘fruit’.
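The two effect sizes reported for the ‘fruit’ and ‘stone’ concept networks can be reproduced from the χ² values with the standard formula V = √(χ² / (n · (min(r, c) − 1))). The sketch assumes that the contingency table crosses the 11 semantic classes with the 6 (‘fruit’) or 8 (‘stone’) clusters and that n is the number of concept nodes (104 and 99); these table dimensions are inferred here rather than stated in the text.

```python
from math import sqrt

def cramers_v(chi2, n, n_rows, n_cols):
    """Cramer's V for an n_rows x n_cols contingency table with n observations."""
    return sqrt(chi2 / (n * (min(n_rows, n_cols) - 1)))

# Assumed layout: rows = 11 semantic classes, columns = network clusters.
v_fruit = cramers_v(chi2=122.36, n=104, n_rows=11, n_cols=6)
v_stone = cramers_v(chi2=99.43, n=99, n_rows=11, n_cols=8)

print(round(v_fruit, 2), round(v_stone, 2))
```

Under these assumptions the rounded results agree with the reported values of 0.49 and 0.38.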

The center of the ‘stone’ network (Cluster 8) is a fairly small cluster built around ‘flint’ (d = 24). Next to it is Cluster 4, which is dominated by two concepts for stones and rocks, i.e. ‘pit/stone/rock’ (d = 13) and ‘coal’ (d = 13). Varieties of stones are distributed across at least 4 clusters, where they serve as the centers of the sub-networks. In addition to Clusters 8 and 4, ‘cliff’ (d = 9) and ‘flight’ (d = 8) are central concepts in Cluster 7 and Cluster 3, respectively. In each of these clusters, the ‘stone’-related compounds are colexified with concepts of a variety of semantic classes. No strong colexification pattern within each concept cluster of ‘stone’ can be identified.

3D-classifiers are found in two clusters: Cluster 6 and Cluster 2. Cluster 6 exhibits a strong colexification pattern involving ‘CL:grain (of rice)’, ‘CL: eggs’, and ‘CL:stones’. Those 3D-classifiers are not significantly colexified with concepts from other clusters but rather form an independent ‘3D-classifier’ cluster that is most often colexified with objects of a round shape (e.g. ‘rice’). Cluster 2 is more isolated; in it, ‘CL:bowls’ is colexified with body parts (e.g. ‘finger’) and small round objects (e.g. hoetool).

Despite the separation of sub-networks between ‘stone’ (and ‘stone’-related compounds) and 3D-classifiers, we still found several direct links between ‘stone’-related compounds and 3D-classifiers. For example, ‘flint’ is directly colexified with all three 3D-classifiers in Cluster 6; ‘flight’ is directly colexified with ‘CL:grain (of rice)’. However, given that they belong to distinct clusters, those colexification links seem to be restricted to a limited number of languages.

3.3 Network stability

The results from Levene’s tests indicated significant differences in the variances for both the CC and ASPL between the ‘stone’ and ‘fruit’ networks. For CC, the test yielded an F-value of 420.48 (p < 0.001), suggesting significantly higher variability of clustering in the ‘stone’ network (SD = 0.03) compared to the ‘fruit’ network (SD = 0.01). Similarly, for ASPL, the test resulted in an F-value of 6.91 (p < 0.05), reinforcing the conclusion of higher variability in the ‘stone’ network (SD = 0.46) than in the ‘fruit’ network (SD = 0.40).
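For readers unfamiliar with the test, the statistic behind Levene’s test can be sketched in a few lines: each observation is replaced by its absolute deviation from its group mean, and a one-way ANOVA F-ratio is computed on those deviations. The sketch uses the mean-centred variant and invented CC values for two sets of repeated network constructions; which variant and which samples were used in the study is not specified here, so both are assumptions.

```python
from statistics import mean

def levene_w(*groups):
    """Levene's W statistic using absolute deviations from each group mean."""
    z = [[abs(x - mean(g)) for x in g] for g in groups]
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    z_bar = [mean(zi) for zi in z]                 # mean deviation per group
    z_grand = mean([v for zi in z for v in zi])    # mean deviation overall
    between = sum(len(zi) * (zb - z_grand) ** 2 for zi, zb in zip(z, z_bar))
    within = sum((v - zb) ** 2 for zi, zb in zip(z, z_bar) for v in zi)
    return (n_total - k) / (k - 1) * between / within

# Invented CC values from repeated constructions of two networks.
cc_stable = [0.47, 0.48, 0.49, 0.48, 0.47, 0.49]
cc_variable = [0.42, 0.50, 0.45, 0.49, 0.43, 0.47]

w = levene_w(cc_stable, cc_variable)
print(round(w, 2))
```

A larger W indicates a larger difference in spread between the groups; two groups with identical spread give W = 0.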

Further, Welch’s t-tests were performed to compare the mean differences of the CC and ASPL between the networks. The t-test for ASPL showed a mean of 4.18 for the ‘stone’ network and 4.66 for the ‘fruit’ network, with a t-value of -24.66, indicating a statistically significant difference in the ASPL between the two networks (p < 0.001). For the CC, the mean values were 0.46 (‘stone’) and 0.48 (‘fruit’), with a t-value of -19.77, also signifying a significant structural difference in terms of clustering (p < 0.001).
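Welch’s t statistic can be computed directly from summary statistics as t = (m₁ − m₂) / √(s₁²/n₁ + s₂²/n₂). The sketch below plugs in the rounded means and standard deviations reported above; the number of repeated network constructions per network is not given in this section, so the value of 1,000 is purely a working assumption, and the function name is illustrative.

```python
from math import sqrt

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    v1, v2 = sd1 ** 2 / n1, sd2 ** 2 / n2
    t = (mean1 - mean2) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

N_RUNS = 1000  # hypothetical number of runs per network, not taken from the text

# 'stone' first, 'fruit' second, using the rounded values reported above.
t_aspl, df_aspl = welch_t(4.18, 0.46, N_RUNS, 4.66, 0.40, N_RUNS)
t_cc, df_cc = welch_t(0.46, 0.03, N_RUNS, 0.48, 0.01, N_RUNS)

print(round(t_aspl, 2), round(t_cc, 2))
```

With this assumed sample size the statistics come out close to, though not identical with, the reported t-values; small discrepancies are expected because the inputs are rounded and the true number of runs may differ.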

These findings suggest notable differences in the stability of network construction between the ‘stone’ and ‘fruit’ networks, with the latter being significantly more stable. The significant variations in both the CC and ASPL imply that the two networks are structurally dissimilar, with the ‘stone’ network exhibiting a shorter average path length and lower clustering coefficient compared to the ‘fruit’ network.

4 Semantic extension from nouns to 3D-classifiers

According to the results of the concept network analysis in §3.2 and §3.3, the semantic evolution of ‘fruit’ follows a salient and stable path, while the semantic extension of ‘stone’ is unstable and language-specific. In this section, we turn to specific paths of semantic extension from nouns to 3D-classifiers in each concept network. A closer examination of the language-specific data shows how ‘fruit’ and ‘stone’ diverge in the mode of semantic extension towards a classifier.

4.1 Semantic extension from ‘fruit’ to 3D-classifiers

The 58 sampled languages can be divided into two classes: languages that have a 3D-classifier and languages that have not developed one. Table 14 presents the 9 languages in our sample that employ a 3D-classifier derived from ‘fruit’, most of which are from the Ngwi subgroup.

Table 14:

TB languages that derive ‘CLF: 3D’ from ‘fruit’.

subgroup	language	‘fruit’	3D-CLF (e.g. fruits, eggs, stones, grains of rice)
Southern Ngwi	Hani_Lüchun	a ⁵⁵ si ³¹	si³¹ (eggs; stones)
Southern Ngwi	Hani_Mojiang	ɔ ³¹ ɕi ³¹	ɕi³¹ (eggs; grains of rice)
Central Ngwi	Jinuo	a ⁴⁴ sɯ ⁴⁴	sɯ⁴⁴ (grains of rice)
	Lahu_Lancang	i ³⁵ ɕi ¹¹	ɕi¹¹ (eggs; stones; grains of rice)
	Yi_Sani	sz̩ ¹¹ mɒ ³³	sz̩¹¹ (grains of rice)
	Lisu	si ³⁵ sɯ ³¹	sɯ³¹ (grains of rice)
Northern Ngwi	Yi_Nanhua	sæ ²¹	sæ²¹ (grains of rice)
Western Tani	Tani_Bokar	a pɯ	pɯ rɯ (bowls)
Naxi	Naxi	dzɚ ²¹ ly ³³	ly³³ (grains of rice; eggs; bowls)

The language clusters^[13] in Figure 3, built on the basis of the similarity of colexification networks across the sampled languages, reveal that the colexification pattern of ‘fruit’ is strongly associated with the subgroup of a particular language (χ² = 131.85, df = NA, p < 0.001, Cramer’s V = 0.80). The figure clearly shows the clustering of two groups of Ngwi languages (i.e. Clusters 1 and 2), indicating that Ngwi languages are closely similar in their colexification networks of ‘fruit’. Seven of the 9 languages employing at least one 3D-classifier are from these two clusters of Ngwi.^[14] There is a split between Northern (Yi) and Southern (Hani) Ngwi, with Central Ngwi (i.e. Lahu_Lancang, Jinuo, Yi_Sani, Lisu) in between. We speculate that there exist some hidden layers accounting for the split that are not relevant to the derivation of 3D-classifiers.

Figure 3:

Language Networks of ‘Fruit’. In this figure, distinct text colors correspond to different subgroups for languages. Node colors designate various clusters, showcasing how languages group together based on shared colexification patterns for ‘fruit’. The size of each node represents the degree of the language, with larger nodes indicating languages that have a higher degree of colexification with other languages in the network.

The most salient pattern that colexifies ‘fruit’ and 3D-classifiers in the network analysis (§3.2.1), as characterized in (1), is the shared pattern of Ngwi languages in our sample. Due to the strong colexification of ‘fruit’, compounds for fruits, and small round body parts (i.e. Cluster 3 (varieties of fruits) and Cluster 1 (small round body parts) are the two central concept clusters in the network of ‘fruit’), semantic extension from ‘fruit’ to 3D-classifiers is most likely through the mediation of compounds for varieties of fruits and small body parts.

(1)

the first colexification pattern of ‘fruit’ and 3D-CLF (Ngwi type)

‘fruit’ – CT in compound nouns for varieties of fruit – CT in compound nouns for small round body parts – Shape-based classifier (3D-CLF)

Diachronically, (1) indicates that the noun ‘fruit’ does not directly derive classifiers. For both languages with and without a classifier, ‘fruit’ usually first developed into a CT in compounds marking varieties of fruits. However, not all languages that involve the CT ‘fruit’ in the fruit compounds have developed a classifier. The CT ‘fruit’ tends to be present in compounds denoting small round body parts from Cluster 1 before developing the use as a classifier.

Table 15 presents four types of languages in our sample that exhibit variation in (1), resulting in the presence/absence of a 3D-classifier.^[15]

Table 15:

Languages that colexify ‘fruit’ with varieties of fruits, small round body parts, and 3D-classifiers.

Type	Language	Subgroup	‘Fruit’	Varieties of fruits	Body parts	3D-CLF
I	Jinuo	Central Ngwi	a⁴⁴ sɯ⁴⁴	sɯ⁴⁴ jɛ⁴⁴ a⁴⁴ sɯ⁴⁴ ‘peach’; ŋa⁴² sɯ⁴⁴ ‘banana’	la⁵⁵ sɯ⁴⁴ ‘claw / talon’ (1)^a	+
	Lahu_Lancang	Central Ngwi	i ³⁵ ɕi ¹¹	ɑ³⁵ vɛ⁵³ ɕi¹¹ ‘peach’; mɑ³⁵ li³¹ ɕi¹¹ ‘pear’; mɑ³⁵ mɑ³¹ ku³³ ɕi¹¹ ‘walnut’	khu³³ mɛ⁵⁴ ɕi¹¹ ‘ankle’ (1); ni¹¹ ɕi¹¹ u³³ ‘testicle’ (1); ɔ³¹ lɑ⁵³ ɕi¹¹ ‘kidney’ (4)	+
	Yi_Nanhua	Northern Ngwi	sæ ²¹	sæ²¹ ɣɯ²¹ ‘peach’; sæ²¹ ‘pear’; sæ²¹ mi³³ ‘walnut’	tɕhi³³ me̱³³ sæ²¹ ‘ankle’ (1); sɛ²¹ ‘liver’ (1); də³³ xu³³ sæ²¹ ‘testicle’ (1)	+
	Yi_Sani	Central Ngwi	sz̩ ¹¹ mɒ ³³	z̊³³ m̩¹¹ sz̩¹¹ mɒ³³ ‘grape’; sz̩¹¹ ɣɯ¹¹ mɒ³³ ‘peach’; sz̩¹¹ tʂhʐ̩³³ mɒ³³ ‘pear’	sz̩¹¹ phɒ³³ mɒ³³ ‘bladder’ (1); sz̩¹¹ ‘liver’ (1)	+
	Yi_Weishan	Northern Ngwi	sɿ̄ ³³ sᴇ ²¹ ʔlo ³³ sᴇ ²¹	sᴇ²¹ ʔy²¹ ‘peach’; sᴇ²¹ tʂhɿ⁵⁵ ‘pear’	dᴇ³³ sᴇ²¹ ‘testicle’ (1); di⁵⁵ sᴇ²¹ ‘kidney’ (4)	−
II	Dulong	Nungic	ɕiŋ ⁵⁵ ɕi ⁵⁵	ɹɯŋ³¹ ɕi⁵³ ‘grape’	tɯ³¹ ɕi⁵⁵ ‘gall/bladder’ (1/4)	−
	Jingpho	Jingpho	nam ³¹ si ³¹	să³³ ŋum³³ si³¹; sum³³ wum³³ si³¹ ‘peach’; să⁵⁵ pjiʔ⁵⁵ si³¹ ‘grape’	pa̱u³³ si³¹ ‘uvula’ (4)	−
	Bantawa	Kiranti	si;si wa	TaTnamsi ‘grape’	chuk ku si/chuk-si-ma ‘finger’ (4)	−
	Bola_Luxi	Burmish	ʃɿ ³⁵	pu⁵⁵ ʃɿ³⁵ ‘walnut’; pui⁵⁵ ka⁵⁵ ʃɿ³⁵ ‘persimmon’	−	−
	Burmese (Rangoon)	Burmish	ɑ ⁵³ tθi ⁵⁵	tθɑ⁵³ bjiʔ⁴ tθi⁵⁵ ‘grape’; mɛʔ⁴ mũ²² tθi⁵⁵ ‘peach’; tθiʔ⁴ tɔ²² tθi⁵⁵ ‘pear’	−	−
	Chepang	Kham-Magar-Chepang	sayʔ	ʔay.sayʔ ‘cucumber’	dut.sayʔ/ ʔoh.sayʔ ‘nipple’ (6)	−
III	Hani_Mojiang	Southern Ngwi	ɔ ³¹ ɕi ³¹	ɕi³¹ mu³¹ ‘peach’; ɕi³¹ pa³¹ ‘grape’; ɕi³¹ phɛ⁵⁵ ‘pear’	−	+
	Hani_Lüchun	Southern Ngwi	a ⁵⁵ si ³¹	si³¹ bja̱³¹ ‘grape’; si³¹ ɣɔ³¹ ‘peach’;	ɣø³¹ si³¹ ‘kidney’ (4) la³³ si³¹ ‘mouth’	+
	Lisu	Central Ngwi	si ³⁵ sɯ ³¹	tɕhe̱³¹ le̱³¹ sɯ³¹ ‘grape’; sɯ³¹ lɯ̱³³ bɯ³³;sɯ³¹ lɯ³³ bɯ⁴⁴ ‘persimmon’	−	+
IV	Tani_Bokar	Western Tani	a pɯ	−	a pɯ ‘gall’ (4)	+
IV	Naxi	Naxi	dzɚ ²¹ ly ³³	−	lɑ²¹ ly³³ ‘finger’ (4); by³³ ly³³;mby³³ ly³³ ‘kidney ’(4)	+

^aNumber in the parenthesis indicates the concept cluster.

The type I languages follow the path in (1) and are primarily from the Ngwi subgroup. Four Ngwi languages that colexify ‘fruit’ with the Cluster 3 fruit compounds as well as the Cluster 1 body part compounds also possess a 3D-classifier. The only exception is Yi_Weishan, which satisfies both conditions in (1) but does not contain a colexified 3D-classifier.

The type II languages colexify ‘fruit’ with the fruit compounds in Cluster 3 but not the body parts from Cluster 1. As expected, they do not develop a 3D-classifier. Languages outside the Ngwi subgroup may colexify ‘fruit’ with fruit compounds but do not contain a classifier. This is probably because the ‘fruit’ morpheme is not colexified with body part terms, as demonstrated by Bola_Luxi and Burmese_Rangoon, or because the colexified body part term is from clusters other than Cluster 1 (i.e. Cluster 4 or Cluster 6), as in Dulong, Jingpho, Bantawa, and Chepang.

The type III languages colexify ‘fruit’, fruit compounds and 3D-classifiers but do not colexify ‘fruit’ with body parts from Cluster 1. These languages include Lisu, Hani_Mojiang, and Hani_Lüchun. However, on closer inspection, these Ngwi languages should not be excluded from the Type I languages. We found an array of Cluster 1 body part compounds that are colexified with ‘fruit’ in different sources.^[16] Therefore, the path in (1) still holds for the majority of Ngwi languages.

The type IV languages colexify ‘fruit’ with neither fruit compounds in Cluster 3 nor body parts from Cluster 1 but have developed a 3D-classifier. Naxi and Tani_Bokar in our sample exhibit this pattern.
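The four-way typology in Table 15 can be restated as a simple decision rule over three binary properties of a language: whether ‘fruit’ is colexified with Cluster 3 fruit compounds, whether it is colexified with Cluster 1 body parts, and whether a colexified 3D-classifier is attested. The sketch below is only a restatement of the table as read here; the function name and the boolean encoding are assumptions, and combinations not listed in Table 15 are left unclassified.

```python
def language_type(fruit_compounds, cluster1_body_parts, classifier):
    """Assign a Table 15 type from three yes/no properties (hypothetical encoding)."""
    if fruit_compounds and cluster1_body_parts:
        return "I"      # follows path (1); the classifier is usually present
    if fruit_compounds:
        return "III" if classifier else "II"
    if classifier and not cluster1_body_parts:
        return "IV"     # classifier without fruit compounds or Cluster 1 body parts
    return None         # combination not attested in Table 15

# Property values as read off Table 15.
examples = {
    "Jinuo": (True, True, True),
    "Yi_Weishan": (True, True, False),
    "Burmese (Rangoon)": (True, False, False),
    "Lisu": (True, False, True),
    "Naxi": (False, False, True),
}
types = {lang: language_type(*props) for lang, props in examples.items()}
print(types)
```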

There are of course other minor paths that derive a 3D-classifier in addition to (1). In Type IV languages, classifiers in Cluster 4 are not derived from ‘fruit’-related concepts but seem to be directly derived from body parts. For example, Naxi has the classifier ly³³ classifying objects like grains, eggs, and bowls. However, ly³³ does not colexify with any compound meaning ‘fruit’ but rather with body parts like ‘finger’ and ‘kidney’. Tani_Bokar uses the classifier pɯ rɯ to classify bowls, which is originally derived from *pɯ ‘egg’ in Proto-Tani (Sun 1993). Like Naxi, it is not found colexified with varieties of fruits but rather is found in ‘gall’.

Based on the above discussion, the presence of a numeral classifier in the colexification network of ‘fruit’ is largely attributable to the language’s membership in the Ngwi subgroup. Both Type I and Type III Ngwi languages derive a 3D-classifier via the path in (1). The concept ‘fruit’ may have acquired the meaning of a classifier as a result of a shared innovation of Ngwi languages. This is congruent with Bradley (1979), in which the proto-form *si² is reconstructed for both ‘fruit’ and the classifier ‘CLF: fruit’ in Proto-Ngwi.

4.2 Semantic extension from ‘stone’ to 3D-classifiers

The 68 sampled languages can be divided into two groups depending on the presence/absence of a 3D-classifier. Table 16 presents the 7 languages in our sample that employ a 3D-classifier derived from ‘stone’. In contrast to ‘fruit’, which is highly biased to Ngwi languages, languages deriving a 3D classifier from ‘stone’ are distributed over 6 subgroups. The subgroup (χ² = 108.36, df = NA, p = 0.04) of a particular language is marginally associated with the colexification pattern of ‘stone’.

Table 16:

TB languages that derive ‘3D-CLF’ from ‘stone’.

Subgroup	Language	‘Stone’	3D-CLF (e.g. stone, egg, bowl)
Nungic	Dulong	luŋ ⁵⁵	luŋ⁵⁵ (classify stones, eggs, grains of rice)
Nungic	Nung	ḷuŋ ⁵⁵	(thi³¹) ḷuŋ⁵⁵ (classify stones and eggs)
rGyalrongic	Daofu	rgə ma	(a) rgə (classify bowls)
Bodo-Garo	Garo	roŋ	roŋ- (classify round objects)^a
Tibetan	Written Tibetan	rdo	rdog (gtɕig) (classify grains of rice)
Bai	Bai_Jianchuan	tso̱ ²¹ khue ⁵⁵ ; tso ⁴² khui ⁵⁵	(ke⁵⁵ sẽ̱²¹ ɑ²¹) khue⁵⁵ (classify eggs); kʰou³³ (classify grains of rice)^b
Karenic	Karen	lø ³¹	phlø³¹ (classify stones, eggs, grains of rice)

^aThe specific nouns that can be classified by the 3D-classifier roŋ- in Garo are not explicit. However, based on descriptions in Burling (2003) and Wood (2008), it is provisionally assumed that all types of round/globular objects, including grain of rice, egg, stone/rocks, and bowls are classified by roŋ- in Garo. ^bIn Huang (1992), the entry for the classifier for grain of rice is recorded as (me⁴⁴ɑ²¹)o⁴⁴, while in other sources this classifier is recorded as kho³³ (Sun 1991) or kʰou³³ in Jianchuan Bai (Allen 2007), and reconstructed as *qʰɔ² in Proto-Bai (Wang 2012). None of the Bai dialects in Wang (2012) has the o⁴⁴ form and only one dialect in Allen (2007) (i.e. Qiliqiao Bai) has the form of ɔ³³. In this paper, we adopt the form kʰou³³ in Jianchuan Bai from Allen (2007) for the classifier ‘CLF:grain of rice’.

According to Figure 1, no salient path from ‘stone’ to 3D-classifiers can be identified in the concept network of ‘stone’, since the sub-networks of classifiers are diverse and somewhat isolated from the stone-related compounds. Compounding involving ‘stone’ as an intermediate stage in the derivation of classifier is not as productive as (1) in TB languages.

Figure 4 shows that at least two groups of languages (i.e. Cluster 1: Garo, Dulong, Karen, Bai_Jianchuan, Nung, Daofu; Cluster 4: Tibetan_Written) that exhibit distinct colexification patterns of ‘stone’ have independently developed 3D-classifiers. Further splits can be made within Cluster 1 on closer scrutiny. This is due to the high variability of the network of ‘stone’, which indicates that the path deriving a 3D-classifier from ‘stone’ is very unstable (§3.3). It follows that TB languages exhibit dissimilar colexification networks of ‘stone’, which are ineffective in predicting the presence of classifiers in a particular language. Below we take a closer look at the 7 languages that possess a 3D-classifier and demonstrate that the colexification pattern of ‘stone’ cannot be associated with a 3D-classifier.

Figure 4:

Language Networks of ‘Stone’

Table 17 displays the 7 languages that colexify ‘stone’ and 3D-classifiers. ‘Stone’/‘stone’-related compounds and 3D-classifiers exhibit relatively direct links and a shorter path of derivation, as the colexified ‘stone’ compounds are confined to a small number of concepts (most prominently ‘flint’). Nung, Dulong, Bai_Jianchuan, Karen, and Written Tibetan all colexify ‘stone’ with compounds for varieties of stones, while the root ‘stone’ is not found in ‘stone’-related compounds in Garo and Daofu. The path ‘Noun – CT in compound nouns – 3D-classifier’ holds for most of those languages. Nevertheless, the concept network result in §3.2.2 shows that ‘stone’ concepts of the same semantic class are not strongly colexified. Notably, concepts pertaining to the semantic class of ‘small round objects’ in Table 17 are quite diverse across languages, and hence no intermediate stage concerning compounds for ‘small round object’ should be posited between ‘stone’-related compounds and 3D-classifiers. This implies that the overall colexification pattern of ‘stone’ is not strongly related to the presence of a 3D-classifier, as the classifier is more directly derived from ‘stone’ and a limited number of ‘stone’-related compounds.

Table 17:

Languages that colexify ‘stone’ with varieties of stones, small round objects, and 3D-classifiers.

Language	Language cluster	‘Stone/rock’	Varieties of stones	Small round objects	3D-CLF
Nung	1	luŋ ⁵⁵	xo³¹ bi³¹ luŋ⁵⁵ ‘flint’ (8)	thi³¹ vɛn³¹ luŋ⁵⁵ ‘hail’ (8) luŋ⁵⁵ ‘bark’ (6) ȵi⁵⁵ luŋ⁵⁵ ‘eye’ (6)	(thi³¹) ḷuŋ⁵⁵ ‘CL:eggs’ (6) (thi³¹) ḷuŋ⁵⁵ ‘CL:stones’ (6)
Dulong	1	luŋ ⁵⁵	tɕɑ³¹ mɑʔ⁵⁵ luŋ⁵⁵ ‘flint’ (8) luŋ⁵⁵ pɑŋ⁵⁵ ‘cave’ (6)	ɑŋ³¹ luŋ⁵⁵ kɑn⁵⁵ ‘radish’ (6) ɑm⁵⁵ luŋ⁵⁵ ‘rice’ (6) nam³¹ luŋ⁵⁵ ‘sun’ (6)	luŋ⁵⁵ ‘CL:grain’ (6)
Bai_Jianchuan	1	tso̱ ²¹ khue ⁵⁵	pa̱²¹ tsa⁵⁵ tso̱²¹ ‘flint’ (8)	mi⁵⁵ tso̱²¹ ‘ladle’(2) tso̱³³ ŋue³³ khue⁵⁵‘steel’ (2) tso̱²¹ kə⁵⁵ ‘key’ (2)	(ke⁵⁵ sẽ̱²¹ ɑ²¹) khue⁵⁵ ‘CL:eggs’ (6) kʰou³³ ‘CL:grains of rice’(6)
Karen	1	lø ³¹	lø³¹ me̱³¹ ‘flint’ (8) lø³¹ tθui³¹ la̱⁵⁵ ‘coal’ (4) lø³¹ bi̱³¹ ba̱³¹ ‘flight’ (3)	me³³ khua⁵⁵ phlø³¹ ‘bowl’ (5) dă³¹ lø³¹ tha⁵⁵ ‘trivet’ (7)	phlø³¹ ‘CL:stones’(6) phlø³¹ ‘CL:eggs’ (6)
Written Tibetan	4	rdo	rdo sol ‘coal’ (4) rdo skas ‘flight’ (3) me rdo ‘flint’ (8) rdo.? ‘pebble’ (7)	rdo lo ‘pestle’ (4) rgja rdo ‘weight’ (1) ɦbru rdog ‘rice’ (6)	rdog (gtɕig) ‘CL:grain’ (6)
Garo	1	roŋ	−	−	roŋ- ‘CL:eggs’ (6) roŋ- ‘CL:grain’ (6) roŋ- ‘CL:stone’ (6) roŋ- ‘CL:bowl’ (2)
Daofu	1	rgə ma	−	lo ma ‘finger’ (2) ma phjo ‘ladle’ (2) bji ma ‘sand’ (8) ɬtɕɑ mar ‘steel’ (2)	(a) rgə ‘CL:bowls’ (2)

Taken together, the network instability (§3.3), the language clustering in Figure 4, and the language-specific data in Table 17 suggest that 3D-classifiers are derived from ‘stone’ independently in individual languages and cannot be uniformly accounted for by semantics.

5 Discussion

In this discussion, we attempt to address the two research questions posed at the beginning of the article on the basis of the findings drawn from the concept networks: 1) Is there any salient path of semantic extension associated with the derivation of classifiers for 3-dimensional objects from ‘fruit’ and ‘stone’? 2) How effectively can the colexification patterns predict the presence/absence of a classifier in a particular language? Regarding the first question, the path of “Noun – Class term (CT) in compound nouns – Shape-based classifier” has been identified as a common pattern in the semantic extension towards a 3D-classifier in Tibeto-Burman, in which generic-specific compound nouns play a critical role (§5.1). With respect to the second question, only the colexification pattern of ‘fruit’ but not ‘stone’ can effectively predict the presence/absence of a 3D-classifier (§5.2). An implicational universal is proposed on that basis. The important role played by body part concepts in noun categorization is briefly reviewed in §5.3.

5.1 Generic-specific compound nouns in the semantic extension towards 3D-classifiers

There is a strong tendency for numeral classifiers to grammaticalize from nouns in Tibeto-Burman and worldwide (Aikhenvald 2000; Corbett 1991; Evans 2022; Huang 2022; Post 2022; Grinevald 2000, 2002; Seifart 2010). Results from the colexification networks of concepts in this study (§3.2) indicate that compound nouns serve as a critical semantic class in this process. Among the TB languages that have developed 3D-classifiers classifying small round objects such as ‘egg’, ‘grain of rice’ and ‘stones, rocks’, 3D-classifiers are indirectly colexified with the nouns for ‘fruit’ and ‘stone’ because 3D-classifiers and ‘fruit’/‘stone’ are distributed in distinct clusters in both networks. The centrality and high degrees of compound nouns containing the morpheme ‘fruit’/‘stone’ in both concept networks strongly indicate that compounds are colexified with more concepts than 3D-classifiers. It follows that compounding is an intermediate stage in the derivation of 3D-classifiers from nouns in Tibeto-Burman languages. This finding can be empirically supported by crosslinguistic data.

Classifier systems typically emerge in two distinct contexts (Little et al. 2022): the context of counting individual items which are of particular cultural importance (Bisang 1999:158), as in Chinese and Japanese; and the context of taxonomic or meronomic compounding, which is prominent in Tai, Hmong-Mien, and Tibeto-Burman languages (Bisang 1999; DeLancey 1986; Enfield 2004; Vittrant and Allassonnière-Tang 2021). ‘Class noun’ (or class term) as a part of the noun root is conceived as an important stage in the process of deriving a “classifier-for-noun” (Little et al. 2022). Examples (2) through (4) illustrate distinct types of classified nouns in Classical Chinese, Hmong, and nDrapa.

(2)

Classical Chinese (Sinitic)

竹竿万个。(《史记·货殖列传》) (汉)
zhúgān	wàn	gè
bamboo_rod	ten_thousand	CLF:general
‘Ten thousand bamboo rods’ (Shiji-huozhi liezhuan, Han Dynasty) (Zhang 2012: 310)

(3)

Hmong (Hmong-Mien)
ib-tug	tub-sab
one-CLF:ANIM	CN:PERSON-thief
‘A thief’ (White 2019:231)

(4)

nDrapa (Tibeto-Burman)
láɕheʂtsʊji	tɛ-jî
apprentice	one-CLF:GEN
‘An apprentice’ (Huang 2022: 231)

The general classifier gè in Classical Chinese did not evolve from any generic term in compounds but was initially used as a general classifier, whose occurrences as a numeral classifier can be dated back to as early as the Han Dynasty (Zhang 2012). In Hmong, by contrast, there are class terms like ntoo ‘tree’, txiv ‘fruit’, tub ‘son’, etc. (Bisang 1999: 167), which serve as generic nouns specifying a superordinate level concept. They are the lexical origin of numeral classifiers, as in (3). The small set of numeral classifiers in the Tibeto-Burman language nDrapa is also grammaticalized from compound nouns that hold a generic-specific relationship between the classifying morpheme and the host noun. The ji coda of the noun ‘apprentice’ in (4) means ‘man/human’ and serves as a class term specifying the superordinate class of láɕheʂtsʊ ‘pupil’. It further developed into a general classifier jî (Huang 2022: 231).

Findings of the present quantitative study support the claim that compound nouns play a critical role in the grammaticalization of TB classifiers. We propose the path in (5) to characterize the semantic extension of 3D-classifiers from the nouns for ‘fruit’ and ‘stone’, in which the lexeme ‘fruit’ or ‘stone’ serves as a class term (CT) in compound nouns before turning into a pure classifier. The CT and the other part of the compound hold a generic-specific semantic relationship.

(5)

Path of semantic extensions of ‘fruit’ and ‘stone’

Noun – CT in compound nouns – Shape-based classifier

This grammaticalization cline is not only widespread in TB languages that have rich classifier systems, such as Ngwi-Burmish and Tani languages (Bradley 2012; Post 2022), but also common in languages with a limited number of classifiers, as in nDrapa (Qiangic, see Shirai 2022; Huang 2022).

‘Fruit’ first developed into a bound morpheme (CT) in compounds for varieties of fruits, such as ‘pear’, ‘peach’ and ‘walnut’. The CT ‘fruit’ was extended to compounds for objects of the same shape as fruits and finally became a shape-based classifier. As illustrated by Figure 1, ‘CLF: eggs’, ‘CLF: stones, rocks’, ‘CLF: bowls’ and ‘CLF: grain (of rice)’ all have direct links with compound nouns for types of fruits or small round objects, but none of them is directly colexified with the noun ‘fruit’. Take Lisu (Central Ngwi) as an example: Table 18 presents words colexified with the lexeme ‘fruit’, including the classifier ‘CLF: grain (of rice)’ (Sun 1991; Huang 1992). sɯ³¹ initially appeared in the disyllabic generic noun for ‘fruit’ (Stage 1); it then developed as a class term in compound nouns for types of fruits (Stage 2); in Stage 3, sɯ³¹ occurred in compounds for grains, seeds, body parts, and other round-shaped objects; it finally became a sortal classifier for grains (Stage 4).

Table 18:

Colexified concepts of ‘fruit’ in Lisu.

Word form	Gloss
Stage 1:
si ³⁵ sɯ ³¹	‘fruit’
Stage 2:
tɕhe̱ ³¹ le̱ ³¹ sɯ ³¹	‘grape’
sɯ ³¹ lɯ ³³ bɯ ⁴⁴	‘persimmon’
Stage 3:
tɕhɯ̱ ³³ sɯ ³¹	‘rice’
dze̱ ³¹ sɯ ³¹	‘seed’
miɛ ⁴⁴ sɯ ³¹	‘eye’
tsɯ ⁵⁵ sɯ ³¹	‘charcoal’
Stage 4:
sɯ ³¹	‘CLF: grain (of rice)’
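
The staging in Table 18 rests on a single morpheme recurring across all of the listed forms. As a minimal sketch (not the authors' procedure), the Python snippet below checks that recurrence. It assumes the transcription convention of Table 18, in which each syllable is followed by a space and its tone digits; the helper names `syllables` and `contains_morpheme` are hypothetical and introduced only for illustration.

```python
# Word forms and glosses copied from Table 18 (Lisu).
LISU_FORMS = {
    "fruit": "si ³⁵ sɯ ³¹",
    "grape": "tɕhe̱ ³¹ le̱ ³¹ sɯ ³¹",
    "persimmon": "sɯ ³¹ lɯ ³³ bɯ ⁴⁴",
    "rice": "tɕhɯ̱ ³³ sɯ ³¹",
    "seed": "dze̱ ³¹ sɯ ³¹",
    "eye": "miɛ ⁴⁴ sɯ ³¹",
    "charcoal": "tsɯ ⁵⁵ sɯ ³¹",
    "CLF: grain (of rice)": "sɯ ³¹",
}


def syllables(form):
    """Pair each syllable with the tone token that follows it."""
    tokens = form.split()
    return list(zip(tokens[0::2], tokens[1::2]))


def contains_morpheme(form, morpheme=("sɯ", "³¹")):
    """True if the syllable-tone pair occurs anywhere in the word form."""
    return morpheme in syllables(form)


# Map each gloss to whether its form contains the morpheme sɯ³¹.
shared = {gloss: contains_morpheme(form) for gloss, form in LISU_FORMS.items()}
```

Matching on the syllable together with its tone is a deliberate choice here: it keeps sɯ³¹ apart from segmentally similar syllables that carry a different tone or vowel quality.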

In the case of ‘stone’, as shown in Figure 1, it is not directly linked with the classifiers ‘CLF: stones, rocks’, ‘CLF: grain (of rice)’, and ‘CLF: eggs’ but rather derived those classifiers via compound nouns for varieties of stones and other round things. The compounds for ‘flint (to make fire)’, ‘flight (of steps)’, and ‘coal’ are the most important nouns from which classifiers were derived in an array of languages (§4.2.2). The morpheme ‘stone’ appears as a class term in those compounds before turning into a classifier. Table 19 presents the colexified concepts of ‘stone’ in Dulong (Nungic, Huang 1992). The noun root luŋ⁵⁵ ‘stone’ (Stage 1) first appears in the compounds for different types of stones and rocks (Stage 2), holding a specific-generic relationship between the basic and the superordinate level concept; luŋ⁵⁵ then developed into a class term accompanying inanimate round objects that are small in size (Stage 3). Finally, it became a 3D-classifier (Stage 4).

Table 19:

Colexified concepts of ‘stone’ in Dulong.

Word form	Gloss
Stage 1:
*luŋ* ⁵⁵	‘stone’
Stage 2:
lɯ ³¹ ka ⁵⁵ luŋ ⁵⁵ paŋ ⁵⁵	‘cave’
tɕɑ ³¹ mɑʔ ⁵⁵ *luŋ* ⁵⁵	‘flint (to make fire)’
Stage 3:
nɯ ³¹ lɛŋ ³¹ *luŋ* ⁵⁵	‘egg/testicle’
ɑŋ ³¹ *luŋ* ⁵⁵ kɑn ⁵⁵	‘radish’
ɑm ⁵⁵ *luŋ* ⁵⁵	‘rice’
nam ³¹ luŋ ⁵⁵	‘sun’
Stage 4:
*luŋ* ⁵⁵	‘CLF: stones, rocks’
*luŋ* ⁵⁵	‘CLF: eggs’
*luŋ* ⁵⁵	‘CLF: grain (of rice)’

Shape is considered the most important semantic parameter in noun categorization, including numeral classifiers (Adams and Conklin 1973; Aikhenvald 2000; Senft 1996). The semantic colexification pattern of ‘fruit’ and ‘stone’ of the sampled Tibeto-Burman languages demonstrates a crosslinguistic tendency that before this parameter comes into play, the “shape” lexeme is a taxonomical classifying morpheme that classifies nouns according to their superordinate level category.

5.2 The effectiveness of colexification patterns in predicting 3D-classifiers

‘Fruit’ and ‘stone’ diverge in the mode of deriving 3D-classifiers from compounds. The significant difference in their network structure and stability (§3) and the language-specific data (§4) strongly suggest that only ‘fruit’ but not ‘stone’ exhibits a stable and salient path in semantic extension from nouns to 3D-classifiers. Only the colexification pattern of ‘fruit’ but not ‘stone’ can effectively predict the presence/absence of a 3D-classifier.

The most salient path of semantic extension from ‘fruit’ to 3D-classifiers in Tibeto-Burman proceeds through the four stages characterized in (1) in §4.1 (repeated below), as illustrated by the Lisu example in Table 18.

(1)

the first colexification pattern of ‘fruit’ and 3D-CLF (Ngwi type)

‘fruit’ – CT in compound nouns for varieties of fruit – CT in compound nouns for small round body parts – Shape-based classifier (3D-CLF)

(1) is a path attested in most Ngwi languages in our sample. Ngwi languages that developed 3D-classifiers from ‘fruit’ tend to colexify ‘fruit’ with the body part compounds ‘testicle’, ‘fingernail’, ‘ankle’, ‘claw’, ‘bladder’, ‘liver’, and ‘throat’ (Cluster 1) in Stage 3. A 3D-classifier is commonly absent in languages in which ‘fruit’ does not colexify with those body part concepts (i.e. Type II languages in Table 15). An implicational universal can be formulated to predict the presence/absence of a 3D-classifier based on their distributions in TB (i.e. Table 15):

(6) If a TB language colexifies ‘fruit’ with the ‘fruit’ CT in fruit compounds from Cluster 3 and the shape morpheme in body part concepts from Cluster 1, the language tends to have a 3D-classifier.^[17]

The four types of languages investigated in Table 15 (§4.1) empirically support the prediction in (6). The path in (1) turns out to be fairly common in Tibeto-Burman: it accounts for half of the languages that possess a 3D-classifier and is highly skewed toward Ngwi.
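
The antecedent of (6) can be stated as a simple set test. The sketch below is illustrative only and is not the authors' implementation. The Cluster 1 concepts are those listed above; the three fruit concepts standing in for Cluster 3 are an assumption, since the full membership of that cluster is given elsewhere in the paper; the function name `antecedent_of_6` and the three toy languages are hypothetical.

```python
# Cluster 1 body-part concepts, as listed in the text.
CLUSTER_1_BODY_PARTS = {
    "testicle", "fingernail", "ankle", "claw", "bladder", "liver", "throat",
}

# Assumption: a few fruit-variety concepts standing in for Cluster 3.
CLUSTER_3_FRUITS_ASSUMED = {"pear", "peach", "walnut"}


def antecedent_of_6(colexified):
    """True if 'fruit' colexifies with both a fruit compound and a Cluster 1 body part."""
    concepts = set(colexified)
    has_fruit_compound = bool(concepts & CLUSTER_3_FRUITS_ASSUMED)
    has_body_part = bool(concepts & CLUSTER_1_BODY_PARTS)
    return has_fruit_compound and has_body_part


# Hypothetical colexification sets for 'fruit' (not taken from the sample).
toy_languages = {
    "language_A": {"pear", "walnut", "testicle", "liver"},
    "language_B": {"peach", "eye", "heart"},
    "language_C": {"tree", "firewood"},
}

predictions = {name: antecedent_of_6(c) for name, c in toy_languages.items()}
```

A `True` value means that (6) predicts a tendency toward having a 3D-classifier. Because (6) is a one-way implication, a `False` value is not by itself a prediction of absence, although the text notes that 3D-classifiers are commonly absent when the Cluster 1 colexifications are missing.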

‘Stone’, by contrast, does not exhibit any cross-linguistically salient path of semantic extension comparable to that of ‘fruit’. The high variability of the network of ‘stone’ (§3.3) and the weak colexification patterns of the ‘stone’-related concepts (§3.2.2), together with the language clustering in Figure 4 and the language-specific data in Table 17 (§4.2), all indicate that the path deriving a 3D-classifier from ‘stone’ is very unstable. The shorter average path length and the direct links between individual ‘stone’ concepts (e.g. ‘flint’) and 3D-classifiers in the network of ‘stone’ (§3.3) suggest that the semantic extension ‘stone’ – ‘stone’-related compounds – 3D-classifiers is incidental. Such extension is not semantically motivated in the way that the extension of ‘fruit’ is, but is rather the result of the semantic extension of individual nouns.
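
For readers unfamiliar with the measure, average path length is the mean shortest-path distance between pairs of connected nodes. The following is a minimal sketch on two hypothetical toy networks, not a reproduction of the networks or the computation reported in §3.3. It shows how a single direct link between a compound and a classifier shortens the average path; the function name and node labels are illustrative assumptions.

```python
from collections import deque


def average_path_length(graph):
    """Mean shortest-path length over all connected ordered node pairs."""
    total, pairs = 0, 0
    for source in graph:
        # Breadth-first search gives shortest distances in an unweighted graph.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbour in graph[node]:
                if neighbour not in dist:
                    dist[neighbour] = dist[node] + 1
                    queue.append(neighbour)
        total += sum(dist.values())
        pairs += len(dist) - 1  # exclude the source itself
    return total / pairs if pairs else 0.0


# Hypothetical chain: the classifier is reached only via intermediate stages.
chain = {
    "noun": ["compound"],
    "compound": ["noun", "body part"],
    "body part": ["compound", "CLF"],
    "CLF": ["body part"],
}

# Same nodes, plus one direct link between a compound and the classifier.
shortcut = {
    "noun": ["compound"],
    "compound": ["noun", "body part", "CLF"],
    "body part": ["compound", "CLF"],
    "CLF": ["body part", "compound"],
}

apl_chain = average_path_length(chain)
apl_shortcut = average_path_length(shortcut)
```

In this toy setting the network with the direct link has the smaller value, which mirrors the qualitative contrast drawn above between stepwise and direct connections to the classifier nodes.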

Indeed, the evolution from ‘stone’ to classifier is incidental and language-specific. Among the 7 languages that developed a 3D-classifier from ‘stone’, only Garo (Bodo-Garo) and Karen (Karenic) have reconstructed classifiers in the proto languages. In Karen (Karenic), lø³¹ ‘stone’ is colexified with the classifier phlø³¹, which classifies stones, eggs, and grains of rice (Huang 1992). This classifier already existed in Proto-Karen (i.e. *phloŋ^B ‘CLF: stones/rocks’), where it was colexified with the proto form for stones (i.e. *loŋ^B ‘stone/rock’) (Luangthongkum 2013). In other languages, the colexification of ‘stone’ and classifier is absent in their corresponding proto languages. Classifiers are a part of Proto-Bodo-Garo (Wood 2008), Proto-Tani (Post and Sun 2017), and Proto-Ngwi (Bradley 1979), but none of the classifier roots in those proto languages are related to ‘stone’. Some languages have undergone replacement of the 3D-classifier in their history, resulting in the colexification of ‘stone’ and 3D-classifier in the modern languages. Proto-Boro-Garo (PBG) has a handful of numeral classifiers (Wood 2008), among which the PBG classifier for round objects has the same proto form as the lexeme for ‘fruit’ (i.e. *thái),^[18] but in some languages this classifier was later replaced by the lexeme for ‘stone’ (*roŋ-). This is what has been observed in Garo (Wood 2008: 75).

Given the above facts, we may conclude that though ‘fruit’ and ‘stone’ both derive classifiers via an intermediate stage of compound nouns, their colexification networks strongly imply two distinct modes of deriving classifiers from nouns. The colexification network of ‘fruit’ but not that of ‘stone’ can effectively predict the occurrence of a 3D-classifier in a particular language, which is highly correlated with the subgroup of the language.

5.3 Body part concepts in noun categorization

The essential status of body part compounds in the derivation of 3D-CLF in TB highlights the importance of body part concepts in noun categorization. The close relationship between body parts and the small round shape has been characterized in many noun categorization systems. For example, gender assignment in New Guinea (Aikhenvald 2012) is associated with the shape of the body part (i.e. round or long). In Nilo-Saharan languages, body parts typically categorize nouns in terms of their shape (Blench 2015). Body parts appear to be a common lexical source for shape-based numeral classifiers in Asian classifier languages (Aikhenvald 2000; Bisang 1996) as well as for noun classes/gender in other parts of the world, as in Bahnar (Adams 1989), in Kana (Ikoro 1996) and Gumuz (Blench 2015) in Africa, and in Palikur in the Amazon (Aikhenvald and Green 1998). Findings of this paper suggest that body part compounds serve as a critical stage in the development of 3D-classifiers in Tibeto-Burman languages. However, it is noteworthy that only a specific subset of body part concepts, including ‘testicle’, ‘fingernail’, ‘ankle’, ‘claw’, ‘bladder’, ‘liver’, and ‘throat’ (Cluster 1), is relevant to the derivation of a 3D-classifier. There is another small set of body part concepts in Cluster 6, such as ‘eye’, ‘heart’, and ‘navel’, that do not play a role in the derivation of 3D-classifiers. Colexification with those body parts is not a sufficient condition for the presence of a classifier.

6 Conclusions

In this study, we adopt a network-based approach to explore the semantic evolution of 3D-classifiers from ‘fruit’ and ‘stone’ in 58 + 68 Tibeto-Burman languages by examining their semantic colexification networks and evaluating the effectiveness of the colexification patterns of the two nouns in predicting the presence/absence of a classifier in a particular language.

Findings of the present study confirm that ‘fruit’ and ‘stone’ are frequent sources of the 3D-classifiers in Tibeto-Burman. The colexification networks of ‘fruit’ and ‘stone’ support the claim that compound nouns play a critical role in the grammaticalization of TB classifiers (Aikhenvald 2022; Bisang 1999; DeLancey 1986; Vittrant and Allassonnière-Tang 2021). We postulate that numeral classifiers for small round objects in a substantial number of TB languages originated from noun roots such as ‘fruit’ and ‘stone’. These roots then developed into class terms in compound nouns denoting varieties of fruits/stones and the shape class of small round objects. Finally, they lost their concrete meanings and developed into shape-based classifiers. The recurrent cline of semantic change ‘fruit > round > generic’ (Aikhenvald 2000) is attested in Tibeto-Burman languages, in which shape serves as a critical semantic basis in the semantic evolution of a classifier.

Nevertheless, ‘fruit’ and ‘stone’ differ significantly in their specific mode of semantic extension in Tibeto-Burman. The colexification pattern of ‘fruit’ but not that of ‘stone’ can effectively predict the occurrence of a 3D-classifier in a particular language, as the latter is more unstable. An implicational universal is proposed to predict the occurrence of a ‘fruit’-related classifier in Tibeto-Burman. The colexification network of ‘fruit’ represents a well-established cross-linguistic pattern that derives 3D-classifiers following the path ‘fruit – compounds for fruits – compounds for round body parts – 3D-CLF’. This pattern is strongly associated with the subgroup of languages and is most prominent in Ngwi. By contrast, the colexification network of ‘stone’ is largely language-specific. No salient cross-linguistic pattern of semantic extension can be generalized to account for the derivation of classifiers from ‘stone’ in Tibeto-Burman languages.

One remaining issue that is not fully addressed in the paper is the role of language contact in the semantic evolution from nouns to 3D-classifiers. It is generally agreed that classifiers are readily borrowed across languages (Allassonnière-Tang et al. 2021; Greenhill et al. 2017; Her and Li 2023). Indeed, contact-induced classifier borrowings are common among TB languages. Classifiers in Bodo-Garo, Newar, Tani, and Burmic languages all developed under the influence of the classifier systems in the nearby Tai languages and Mandarin Chinese (Bradley 2012; Evans 2022; Hyslop 2008; Post 2022; Weidert 1984). The results of the network analysis show that, given the common grammaticalization path shared by ‘fruit’ and ‘stone’, the effect of language contact is quantitatively evident in the distinct modes of semantic evolution of the two nouns. The instability of the network of ‘stone’ strongly points to a contact effect that may result in language-specific borrowings of a ‘stone’-related 3D-classifier.

Corresponding author: Yu Li, School of Chinese Language and Literature, Wuhan University, Wuhan, China, E-mail: chloelibuffalo@gmail.com

Funding source: National Social Science Fund of China (NSSFC)

Award Identifier / Grant number: 22BYY063

Research funding: This study is supported by the research grant of The National Social Science Fund of China (NSSFC) entitled “A variationist study of the endangered multilingualism in Yunnan Zauzou”「云南若柔人濒危多语状态下的语言变异研究」(Grant No. 22BYY063).

Abbreviations

ANIM: animate
CLF: classifier
CN: class noun
GEN: genitive
PERSON: person

Appendix 1:

See Tables I-IV

Table I:

The concept ‘fruit’ in 58 Tibeto-Burman languages.

ID	Subgroup	STEDT proto-form (etyma tag)	Cog ID	Language	Word form of ‘fruit’	Number of colexified concepts
Hani_Lüchun	Southern Loloish	1019	0	Hani_Lüchun	a⁵⁵si³¹	21
Hani_Mojiang	Southern Loloish	1019	0	Hani_Mojiang	ɔ³¹ɕi³¹	18
Jinuo	Central Loloish		0	Jinuo	a⁴⁴ sɯ⁴⁴	14
Lahu_Lancang	Central Loloish	1019	0	Lahu_Lancang	i³⁵ɕi¹¹	35
Namuyi	Qiangic		0	Namuyi	sɿ⁵⁵ pu³¹	2
Nusu_Central	Northern Loloish	1019	0	Nusu_Central	ʂi⁵⁵	3
Pumi_Jiulong	Qiangic	1019	0	Pumi_Jiulong	sẽ¹¹sy⁵⁵	2
Queyu_Xinlong	Qiangic		0	Queyu_Xinlong	ɕe⁵⁵ tye⁵⁵ ri¹³ ro³³	5
Yi_Nanhua	Northern Loloish	1019	0	Yi_Nanhua	sæ²¹	24
Yi_Sani	Central Loloish	1019	0	Yi_Sani	sz̩¹¹mɒ³³	17
Yi_Weishan	Northern Loloish		0	Yi_Weishan	sɿ̄³³sᴇ²¹ ʔlo³³sᴇ²¹	19
Yi_Xide	Northern Loloish	2658	0	Yi_Xide	sɿ̄³³dzɑ³³lu̱³³mɑ³³	12
Darang_Taraon	Deng		320	Darang_Taraon	pɯ³¹rɯm⁵⁵;pɯ³¹ɹɯm⁵⁵	4
Tani_Bokar	Western Tani	1654	320	Tani_Bokar	a pɯ	4
Byangsi	Tibeto-Kanauri		383	Byangsi	le	1
Old_Tibetan	Tibetan	1019;2658;2071	817	Old_Tibetan	se-	16
rGyalrong_Daofu	rGyalrong	2658	817	rGyalrong_Daofu	ɕhõ tho	1
rGyalrong_Maerkang	rGyalrong		817	rGyalrong_Maerkang	ʃəŋ tok	3
Tibetan_Batang	Tibetan	2658; N/A	817	Tibetan_Batang	xhĩ⁵⁵thoʔ⁵³	12
Tibetan_Lhasa	Tibetan	2658; N/A	817	Tibetan_Lhasa	ɕiŋ⁵⁵to⁵²	12
Tibetan_Xiahe	Tibetan	2658; N/A	817	Tibetan_Xiahe	shi toχ	4
Bai_Jianchuan	Bai		1716	Bai_Jianchuan	ɕy⁵⁵li³³tɑ⁴²xə³³	2
Lyuzu	Qiangic	2658; 1019	1720	Lyuzu	se³³sɿ⁵³	6
Qiang_Mawo	Qiangic	1019	1720	Qiang_Mawo	sɪj miɛ; səʴ mi	5
Naxi	Naxi		1721	Naxi	dzɚ²¹ly³³;ndzəɹ³¹ kv³³ ndzəɹ³¹ ly³³	14
Tujia	Tujia		1724	Tujia	pu³⁵li⁵⁵	9
Pumi_Lanping	Qiangic		1725	Pumi_Lanping	ku⁵⁵tʂu⁵⁵	4
Khaling	Kiranti		3504	Khaling	sasrus	1
Achang_Longchuan	Burmish		3525	Achang_Longchuan	ʂə³¹	5
Achang_Xiandao	Burmish		3525	Achang(Xiandao)	ʂɿ³¹	4
Bahing	Kiranti	1019	3525	Bahing	siːtsi	22
Bantawa	Kiranti	1019	3525	Bantawa	si; si wa	13
Bola_Luxi	Burmish		3525	Bola(Luxi)	ʃɿ³⁵	6
Burmese_Rangoon	Burmish	2658; 1019	3525	Burmese (Rangoon)	ɑ⁵³tθi⁵⁵	18
Dulong	Nungic	2658; 1019	3525	Dulong	ɕiŋ⁵⁵ɕi⁵⁵; ɕi⁵³	4
Hakha_Chin	Kuki-Chin		3525	Hakha_Chin	thei; thingthei; tlai	6
Hayu	Kiranti	1019	3525	Hayu	si	6
Jingpho	Jingpho	1019	3525	Jingpho	si³¹;nam³¹ si³¹	11
Kulung	Kiranti	1019	3525	Kulung	se	3
Lashi	Burmish		3525	Lashi	ʃɿ⁵⁵	1
Limbu	Kiranti	1019	3525	Limbu	seʔ;iːseba;iːseqba	6
Lisu	Central Loloish	2658; 1019	3525	Lisu	si³⁵sɯ³¹	8
Lushai	unknown	1019; 2071	3525	Lushai	thei◦rah;thei; rah; ràh	5
Maru	Burmish	1019	3525	Maru	ʃi³⁵	5
Mikir_Karbi	Mikir	1019	3525	Mikir (Karbi)	athe	2
Motuo_Menba	Bodic	1019	3525	Motuo_Menba	se	8
Old_Burmese	Burmish	1019	3525	Old_Burmese	ə-sî ; tθi³	20
Rabha	Bodo-Garo	1019	3525	Rabha	tʰé	1
Thulung	Kiranti		3525	Thulung	bopsesi	3
Tibetan_Alike	Tibetan	1019	3525	Tibetan_Alike	si	1
Ukhrul	Tangkhul	1019	3525	Ukhrul	tʰej	7
Yidu	Deng		3525	Yidu	ɹuŋ⁵⁵ɕi⁵⁵	6
Zaiwa_Atsi	Burmish	1019	3525	Zaiwa (Atsi)	ʃi²¹	12
Zhaba_Daofu_County	Qiangic		3525	Zhaba_Daofu_County	shɛ³³ɕʌ⁵⁵	2
Chepang	Kham-Magar-Chepang		3818	Chepang	sayʔ;chyak-	14
Garo	Bodo-Garo	1019	3944	Garo	bi-te	6
Japhug	rGyalrongic		4805	Japhug	sɯmat; mat; ɯ-mat	2
Tangut	rGyalrongic		4805	Tangut	mja̠;mjaa; rjɨr; kiọ; ne̱w	4

Table II:

The concept ‘stone’ in 68 Tibeto-Burman languages.

ID	Subgroup	STEDT proto-form	Cog ID	Language	Word form of ‘stone’	Number of colexified concepts
Bahing	Kiranti	1269	81	Bahing	luŋ	4
Bantawa	Kiranti	1269	81	Bantawa	luN	7
Bokar	Western Tani	1269	81	Bokar	ɯ lɯŋ	10
Byangsi	Tibeto-Kanauri		81	Byangsi	uŋ	1
Dulong	Nungic	1269	81	Dulong	luŋ⁵⁵	10
Garo	Bodo-Garo	1269	81	Garo	roŋ; roŋʔ te	1
Hakha_Chin	Kuki-Chin		81	Hakha_Chin	lung	4
Hayu	Kiranti	1269	81	Hayu	lʊ̃ːphʊ	5
Jingpho	Jingpho	1269	81	Jingpho	n³¹luŋ³¹	4
Khaling	Kiranti	1269	81	Khaling	lung ; lūŋ	1
Kulung	Kiranti	1269	81	Kulung	luŋˍ	5
Limbu	Kiranti	1269	81	Limbu	luŋ	8
Lushai	unknown	1269	81	Lushai	lǔŋ ; lung	7
Mikir	Mikir	1269	81	Mikir	ar lōŋ(ʔ)	7
Motuo_Menba	Bodic	1269	81	Motuo_Menba	luŋ	5
Rabha	Bodo-Garo	1269	81	Rabha	róŋ-ka	5
Thulung	Kiranti	1269	81	Thulung	luŋ	9
Ukhrul	Tangkhul	1269	81	Ukhrul	ŋə-luŋ ; ŋə-luŋ-kuj	3
Bunan	Tibeto-Kanauri		856	Bunan	graŋ	1
Atsi	Burmish	1269	874	Atsi	luʔ²¹kok²¹	2
Bola_Luxi	Burmish	1269	874	Bola (Luxi)	lauʔ³¹taŋ³¹	2
Lashi	Burmish		874	Lashi	luk³¹tsəŋ³¹	1
Maru	Burmish	1269	874	Maru	lauk³¹tsaŋ³¹	3
Xiandao	Burmish	1269	874	Xiandao	luʔ⁵⁵koʔ⁵⁵	1
Old_Burmese	Burmish	1269	1078	Old_Burmese/written Burmese	kjɔk⁴; kyok; kjɔk	10
Rangoon	Burmish	1269	1078	Rangoon	tɕɑuʔ⁴	7
Tibetan_Written	Tibetan		2631	Tibetan_Written	rdo	10
Japhug	rGyalrongic		3156	Japhug	rdɤstaʁ	1
Tibetan_Alike	Tibetan		3156	Tibetan_Alike	rdo	3
Tibetan_Batang	Tibetan	N/A; 2166	3156	Tibetan_Batang	dᴜ⁵³	6
Tibetan_Lhasa	Tibetan		3156	Tibetan_Lhasa	to¹³	4
Tibetan_Xiahe	Tibetan		3156	Tibetan_Xiahe	do	6
Lisu	Central Loloish	1269	3582	Lisu	lo̱³³tshi³⁵	17
rGyalrong_Maerkang	rGyalrongic	1269	3594	rGyalrong_Maerkang	ɟjə lək	3
Zhaba_Daofu_County	Qiangic		3594	Zhaba_Daofu_County	je⁵⁵po⁵⁵	9
Daofu	rGyalrongic		3598	Daofu	rgə ma	11
Hakha_Lai	Chin	1269	3739	Hakha (Lai)	luŋ; lûŋ	4
Pumi_Lanping	Qiangic		3740	Pumi_Lanping	zgø¹³	1
Tujia	Tujia	1269; 2166	3741	Tujia	ɣa²¹pa²¹	4
Bai_Jianchuan	Bai		3742	Bai_Jianchuan	tso̱²¹khue⁵⁵; tso⁴² khui⁵⁵	10
Qiang_Mawo	Qiangic	1269	3743	Qiang_Mawo	ɹa ʁuɑ; ʁlu pi	5
Lyuzu	Qiangic	1269	3745	Lyuzu	luo³³bo⁵³; luo³³mæ⁵³	7
Naxi	Naxi	1269	3745	Naxi	lv̩³³	7
Chepang	Kham-Magar-Chepang	1269; 1381	3890	Chepang	hluŋ.baŋ	1
Xumi	Qiangic	1269	4739	Xumi	jũ³³guɛ⁵³; jũ³³ kuɐ⁵⁵	3
Achang_Longchuan	Burmish	1269	10317	Achang_Longchuan	laŋ³¹kɔʔ⁵⁵	4
Darang_Taraon	Deng	1269	10580	Darang_Taraon	lɯm⁵⁵	3
Yidu	Deng	1269	10580	Yidu	ɑ³¹lɑŋ⁵⁵	1
Rongpo	Tibeto-Kanauri	1269	10581	Rongpo	uŋ	1
Tangut	rGyalrongic	1269	10582	Tangut	lụ	3
Yi_Weishan	Northern Loloish		0	Yi_Weishan	ka⁵⁵lo³³	6
Gazhuo	Northern Loloish	1269; 297	0	Gazhuo	no⁵³ma³³	7
Yi_Xide	Northern Loloish	1269	0	Yi_Xide	lu̱³³mɑ⁵⁵	6
Yi_Sani	Central Loloish	1269	0	Yi_Sani	lu⁴⁴ mɒ³³	6
Hani_Luchun	Southern Loloish	1269; 2166	0	Hani_Lüchun	xa³¹lu̱³³	7
Lahu_Lancang	Central Loloish	2166	0	Lahu_Lancang	xɑ³⁵pɯ³³ɕi¹¹	8
Hani_Mojiang	Southern Loloish	1269	0	Hani_Mojiang	l̥u³³mɔ³³	1
Karen	Karenic	1269	0	Karen	lø³¹	9
Cuona_Menba	Bodic	2166	0	Cuona_Menba	tʂə³⁵ pu⁵³	1
Yi_Wuding	Northern Loloish	1269	0	Yi_Wuding	lɤ¹¹ bɤ¹¹	1
Jinuo	Central Loloish	1269	0	Jinuo	lo⁴²mɔ³³	9
Pumi_Jiulong	Qiangic	1269	0	Pumi_Jiulong	guo¹¹lũ⁵⁵	2
Kaman_Miju	Deng	1269	0	Kaman_Miju	lɑ̆uŋ³⁵	4
Namuyi	Qiangic	1269	0	Namuyi	lu⁵⁵ qua³¹; lo⁵⁵ quɑ³³	4
Nusu_Central	Northern Loloish	1269	0	Nusu_Central	lu̱⁵³	1
Yi_Nanhua	Northern Loloish	1269	0	Yi_Nanhua	lu̪³³ ; lu̱³³	9
Queyu_Xinlong	Qiangic		0	Queyu_Xinlong	rdə¹³tɕe⁵⁵	1
Nung	Nungic	1269	0	Nung	ḷuŋ⁵⁵	7

Table III:

The semantic classes containing the lexeme ‘fruit’ in 58 Tibeto-Burman languages.

Fruit	Small round shape				Plants/plant parts	Wood/wood product	Human/god	Process/property	Classifier	Miscellaneous
Fruit	Seed/grain	Body part	Small animal	Small round object	Plants/plant parts	Wood/wood product	Human/god	Process/property	Classifier	Miscellaneous
Adam’s apple	buckwheat (tartary, hulless, duck wheat)	ankle	locust	bullet	branch/twig	firewood	brothers	bear fruit	CL: eggs	building
chestnut	foodstuff / grain	arm	nit	button	bud	wedge	carpenter	good	CL: grain (of rice)
chilli, red pepper	gruel	bladder	porcupine	charcoal	crops	wood / log	god/deity	grind (flour)	CL: month’s (work)
cucumber	maize / corn	claw / talon		coal / smouldering log	flower	yoke	host	hear, listen, obey; perceive, test, feel (within oneself) (with Reflexive), understand to be case	CL: rocks, stones
fig/ (tree)	nut, seed (gen.); bead	ear		comb	grass	plane (tool)		know; see	CL:bowls
fruit	rice (unhusked/glutinous/paddy/plant/uncooked)	eye		crupper-strap	pine/(tree)	plank / board		poor	CL:fingersbreadth
grape	sesame	finger		gold	plant	boat		recognize
lime	sorghum	fingernail		fabric (satin)	sugarcane			ripe, be (fruit)
mango		fist		hail	tree			whet (a knife)
melon / gourd		heart		key	vegetable
olive/ (tree/wood)		gall		pit / stone	forest
peach		kidney		sand
pear		liver
persimmon		navel
plantain, banana		nipple
pomegranate		pelvis, hip
pumpkin		mole
tangerine		testicle
walnut		throat
peanut		uvula
pea / bean
soybean

Table IV:

The semantic classes containing the lexeme ‘stone’ in 68 Tibeto-Burman languages.

rock/stone	Small round shape					Human/god	Direction	Process/property	CLF	Miscellaneous
rock/stone	Tools	Fabrics	Body parts	Small animal	Small round object	Human/god	Direction	Process/property	CLF	Miscellaneous
boulder, huge rock	axe	pillow	back (of body)	ant	charcoal	adult	backward	bark (V)	CL: rocks, stones	cypress
cave	bowl	wool	back of an animal	butterfly	hail	person w / pockmarked face	inside/in	be strong	CL:bowls	kitchen
cliff / rocky outcrop	hammer	multicolored / patterned (cloth)	neck	frog	sand	girl		bear (fruit)	CL:eggs	day(time)
coal	hoetool		baldhead	grasshopper	fruit	grandson		climb	CL:grain (of rice)	noon
coral	ladle (gourd) / dipper (wooden)		eye	lizard	radish	hearth-god		fling / toss	CL:pile (of excrement)	thunder/thunderbolt
valley	jar (earthen)		eyeball	locust	rice (unhusked)	man		fold up (a quilt)	CL:measure of weight (=1 / 2 kilogram)/load on an animal’s back	cast of mind, line of thinking, implication
limestone	key		finger	maggot, worm	smallpox, cowpox			get / fetch		fortune / luck
millstones	trivet		throat	owl	money			hoe up (weeds)
pebble	metal weight on steelyard		egg, eggshell, testicle	sparrow (hawk)	treasured object / treasure			kick; push off ( boat )
pit/stone/rock	pestle (small, stone)		marrow		Adam’s apple			receive
wall(stone)	sieve / sifter		head		sun			throw / toss/hurl
whetstone	steel (for flint)		heart, liver					wrap
hearth-stones	fish-net		kidney					yellow
flight of steps			knee
flint (to make fire)

Appendix 2:

References

Adams, Karen L. 1989. Systems of numeral classification in the Mon-Khmer, Nicobarese and Aslian Subfamilies of Austroasiatic. Canberra: Pacific Linguistics.Search in Google Scholar

Adams, Karen L. & Nancy F. Conklin. 1973. Towards a theory of natural classification, Papers from the Ninth Regional Meeting of the Chicago Linguistic Society 9. 1–10.Search in Google Scholar

Aikhenvald, Alexandra Y. 2000. Classifiers: A typology of noun categorization devices. Oxford & New York: Oxford University Press.10.1093/oso/9780198238867.001.0001Search in Google Scholar

Aikhenvald, Alexandra Y. 2012. Round women and long men: Shape, size, and the meanings of gender in new Guinea and beyond. Anthropological Linguistics 54(1). 33–86. https://doi.org/10.1353/anl.2012.0005.Search in Google Scholar

Aikhenvald, Alexandra Y. 2021. One of a kind: On the utility of specific classifiers. Cognitive Semantics 7(2). 232–257. https://doi.org/10.1163/23526416-07020001.Search in Google Scholar

Aikhenvald, Alexandra Y. 2022. Classifiers: Setting the scene, an introduction to the special issue on classifiers in the languages of Asia. Asian Languages and Linguistics 3(2). 141–152. https://doi.org/10.1075/alal.22022.aik.Search in Google Scholar

Aikhenvald, Alexandra Y. & Diana Green. 1998. Palikur and the typology of classifiers. Anthropological Linguistics 40(3). 429–480.Search in Google Scholar

Allassonnière-Tang, Marc, Olof Lundgren, Maja Robbers, Sandra Cronhamn, Filip Larsson, One-Soon Her, Harald Hammarström & Gerd Carling. 2021. Expansion by migration and diffusion by contact is a source to the global diversity of linguistic nominal categorization systems. Humanities & Social Sciences Communications 8. 331. https://doi.org/10.1057/s41599-021-01003-5.Search in Google Scholar

Allen, Bryan. 2007. Bai dialect survey. Dallas: SIL International.Search in Google Scholar

Basumatary, Guddu. 2015. Numeral classifiers in Bodo. Nepalese Linguistics 30. 19–24.Search in Google Scholar

Benedict, Paul K. 1972. Sino-Tibetan: A conspectus. (Princeton-Cambridge Series in Chinese Linguistics, #2). New York: Cambridge University Press.10.1017/CBO9780511753541Search in Google Scholar

Benedict, Paul K. 1976. Rhyming dictionary of Written Burmese. Linguistics of Tibeto-Burman Area 3(1). 1–93. https://doi.org/10.32655/ltba.3.1.02.Search in Google Scholar

Bhaskararao, Peri. 1996. A computerized lexical database of Tiddim Chin and Lushai. In Tsuyoshi Nara & Kazuhiko Machida (eds.), A computer-assisted study of Asian and African Languages, 27–143. Tokyo: ILCAA.Search in Google Scholar

Bisang, Walter. 1993. Classifiers, quantifiers and class nouns in Hmong. Studies in Language 17(1). 1–51. https://doi.org/10.1075/sl.17.1.02bis.Search in Google Scholar

Bisang, Walter. 1996. Areal typology and grammaticalization: Processes of grammaticalization based on nouns and verbs in East and mainland South East Asian languages. Studies in Language 20(3). 519–597. https://doi.org/10.1075/sl.20.3.03bis.Search in Google Scholar

Bisang, Walter. 1999. Classifiers in East and Southeast Asian languages: Counting and beyond. In Jadranka Gvozdanović (ed.), Numeral Types and changes worldwide, 113–186. Berlin and New York: Mouton de Gruyter.10.1515/9783110811193.113Search in Google Scholar

Blench, Roger. 2015. The origins of nominal classification markers in MSEA languages: Convergence, contact and some African parallels. In Nick Enfield & Bernard Comrie (eds.), Languages of Mainland Southeast Asia, 558–585. Berlin/Boston: De Gruyter Mouton.10.1515/9781501501685-013Search in Google Scholar

Borodkin, Katy, Yoed N. Kenett, Miriam Faust & Nira Mashal. 2016. When pumpkin is closer to onion than to squash: The structure of the second language lexicon. Cognition 156. 60–70. https://doi.org/10.1016/j.cognition.2016.07.014.Search in Google Scholar

Bradley, David. 1979. Proto-loloish. London & Malmö: Curzon Press.Search in Google Scholar

Bradley, David. 2012. The characteristics of the Burmic family of Tibeto-Burman. Language and Linguistics 13(1). 171–192.Search in Google Scholar

Burling, Robbins. 2003. The language of the Modhupur Mandi, Garo : Vol. I : Grammar. Ann Arbor, Michigan: The Scholarly Publishing Office.10.3998/spobooks.bbv9808.0001.001Search in Google Scholar

Castro, Nichol & Cynthia S. Q. Siew. 2020. Contributions of modern network science to the cognitive sciences: Revisiting research spirals of representation and process. Proceedings of the Royal Society A 476(2238). 20190825. https://doi.org/10.1098/rspa.2019.0825.Search in Google Scholar

Corbett, Greville. 1991. Gender. Cambridge: Cambridge University Press.Search in Google Scholar

Christensen, Alexander P. 2018. NetworkToolbox: Methods and measures for brain, cognitive, and psychometric network analysis in R. The R Journal 10. 422–439. https://doi.org/10.32614/RJ-2018-065.Search in Google Scholar

Cśardi, Gabor & Tamás Nepusz. 2006. The igraph software package for complex network research. InterJournal: Complex Systems 1695. Available at: https://igraph.org.Search in Google Scholar

Dai, Qingxia 1994. Zangmian yu geti liangci yanjiu [a study on numeral classifiers in Tibeto-Burman]. In Xueliang Ma (ed.), Zangmian yu xin lun [Recent Contributions to Tibeto-Burman Studies], 166–181. Beijing: Zhongyang Minzu Xueyuan Chubanshe.Search in Google Scholar

Dai, Qingxia. 1997a. A study on count-noun classifiers in Tibeto-Burman languages, In Editorial Committee of the International Yi-Burmese Conference (ed.), Studies on Yi-Burmese languages, 355–373. Chengdu: Sichuan Nationalities Publishing House.Search in Google Scholar

Dai, Qingxia. 1997b. Jingpoyu ci de shuang yinjiehua dui yufa de yingxiang [The influence of bisyllabification of lexical items in Jinghpaw on the grammar]. Minzu Yuwen 5. 25–30.Search in Google Scholar

DeLancey, Scott. 1986. Toward a history of Tai classifier systems. In Craig, Colette (ed.), Noun classes and categorization, 437–452. Amsterdam: John Benjamins.10.1075/tsl.7.26delSearch in Google Scholar

Di Natale, Anna & David Garcia. 2023. LEXpander: Applying colexification networks to automated lexicon expansion. Behav Res 56. 952–967. https://doi.org/10.3758/s13428-023-02063-y.Search in Google Scholar

Ebert, Karen H. 1994. The structure of Kiranti languages: Comparative grammar and texts. Zurich: ASAS, Universität Zurich.Search in Google Scholar

Enfield, Nick J. 2004. Nominal classification in Lao: A Sketch. Sprachtypologie Und Universalienforschung: STUF 57(2/3). 117–143. https://doi.org/10.1524/stuf.2004.57.23.117.Search in Google Scholar

Erdős, Paul & Alfréd Rényi. 1960. On the evolution of random graphs. Publications of the Mathematical Institute of the Hungarian Academy of Sciences 5(1). 17–60.Search in Google Scholar

Evans, Jonathan P. 2022. Classifiers in Dimasa and (in-)definite marking. Asian Languages and Linguistics 3(2). 181–201. https://doi.org/10.1075/alal.22007.eva.Search in Google Scholar

Genetti, Carol. 2007. A grammar of Dolakha newar. Berlin: Mouton de Gruyter.10.1515/9783110198812Search in Google Scholar

Genetti, Carol. 2017. Dolakha newar. In Graham Thurgood & Randy LaPolla (eds.), The Sino- Tibetan languages, 436–452. London & New York: Routledge.Search in Google Scholar

Greenhill, Simon J., Chieh-Hsi Wu, Xia Hua, Michael Dunn, Stephen C. Levinson & Russell D. Gray. 2017. Evolutionary dynamics of language systems. PNAS 114(42). E8822–E8829. https://doi.org/10.1073/pnas.1700388114.Search in Google Scholar

Grinevald, Colette. 2000. A morphosyntactic typology of classifiers. In J. Senft (ed.), Nominal classification, 50–92. Cambridge: Cambridge University Press.Search in Google Scholar

Grinevald, Colette. 2002. Making sense of nominal classification systems: Noun classifiers and the grammaticalization variable. In I. Wischer & G. Diewald (eds.), New Reflections on grammaticalization, 259–275. Amsterdam: John Benjamins.10.1075/tsl.49.17griSearch in Google Scholar

Hansson, Inga-Lill. 1989. A comparison of Akha, Hani, Khatu, and Pijo. Linguistics of the Tibeto-Burman Area 12(1). 1–91. https://doi.org/10.32655/ltba.12.1.02.Search in Google Scholar

Hansson, Inga-Lill. 2017. Akha. In Graham Thurgood & Randy J. LaPolla (eds.), The Sino-Tibetan languages, 885–901. London & New York: Routledge.Search in Google Scholar

He, Jiren & Zhuyi Jiang. 1985. Naxi yu jianzhi [A Grammar of Naxi]. Beijing: Minzu Press.Search in Google Scholar

Her, One-Soon & Bing-Tsiong Li. 2023. A single origin of numeral classifiers in Asia and the Pacific: A hypothesis. In Marc Allassonnière-Tang & Marcin Kilarski (eds.), Nominal classification in Asia and Oceania: Functional and diachronic perspectives, 113–160. Amsterdam & Philadelphia: John Benjamins.10.1075/cilt.362.05herSearch in Google Scholar

Hill, Nathan W. & Johann-Mattis List. 2017. Challenges of annotation and analysis in computer-assisted language comparison: A case study on Burmish languages. Yearbook of the Poznań Linguistic Meeting 3. 47–76. https://doi.org/10.1515/yplm-2017-0003Search in Google Scholar

Huang, Bufan. 1992. Zangmianyuzu yuyan cihui [A Tibeto-Burman Lexicon]. Beijing: Central Institute of Minorities.Search in Google Scholar

Huang, Yang. 2022. Classifiers in nDrapa: A Tibeto-Burman language in Western Sichuan. Asian Languages and Linguistics 3(2). 202–238. https://doi.org/10.1075/alal.22009.hua.Search in Google Scholar

Hyslop, Gwendolyn. 2008. Newar classifiers: A summary of the literature. Newah Vijaanan(Journal of Newar Studies) 6. 28–41.Search in Google Scholar

Ikoro, Suanu. 1996. The Kana language. Leiden: University of Leiden.Search in Google Scholar

Jackson, Joshua C., Joseph Watts, Henry R. Teague, Johann-Mattis List, Robert Forkel, Peter J. Mucha, Simon J. Greenhill, Russell D. Gray & Kristen A. Lindquist. 2019. Emotion semantics show both cultural variation and universal structure. Science 366. 1517–1522. https://doi.org/10.1126/science.aaw8160.Search in Google Scholar

Jiang, Ying. 2009. Hanzang yuxi yuyan mingliangci bijiao yanjiu [A Comparative study of classifiers in Sino-Tibetan languages]. Beijing: The ethnic publishing house.Search in Google Scholar

Jing, Dian. 2015. Mojiang Biyue Haniyu cankao yufa [Reference Grammar of Mojiang Biyo Hani]. Beijing: China Social Sciences Press.Search in Google Scholar

Kazuyuki, Kiryu. 2009. On the rise of the classifier system in Newar. Senri Ethnological Studies 75. 51–69.

Kenett, Yoed N. & Miriam Faust. 2019. A semantic network cartography of the creative mind. Trends in Cognitive Sciences 23(4). 271–274. https://doi.org/10.1016/j.tics.2019.01.007.

LaPolla, Randy J. 2017. Overview of Sino-Tibetan morphology. In Graham Thurgood & Randy J. LaPolla (eds.), The Sino-Tibetan languages, 40–69. London & New York: Routledge. https://doi.org/10.4324/9781315399508.

Li, Fanwen. 1997. Xià-Hàn Zìdiǎn [Tangut-Chinese Dictionary]. Beijing: China Social Sciences Press.

Li, Yongsui & Ersong Wang. 1986. Haniyu jianzhi [Brief description of the Hani language]. Beijing: The Ethnic Publishing House.

List, Johann-Mattis, Anselm Terhalle & Matthias Urban. 2013. Using network approaches to enhance the analysis of cross-linguistic polysemies. Proceedings of the 10th international conference on computational semantics (IWCS 2013)–Short Papers, 347–353. Potsdam, Germany: Association for Computational Linguistics.

List, Johann-Mattis, Simon J. Greenhill, Cormac Anderson, Thomas Mayer, Tiago Tresoldi & Robert Forkel. 2018. CLICS2: An improved database of cross-linguistic colexifications assembling lexical data with the help of cross-linguistic data formats. Linguistic Typology 22(2). 277–306. https://doi.org/10.1515/lingty-2018-0010.

Little, Carol R., Mary Moroney & Justin Royer. 2022. Classifiers can be for numerals or nouns: Two strategies for numeral modification. Glossa: A Journal of General Linguistics 7(1). 1–35. https://doi.org/10.16995/glossa.8437.

Luangthongkum, Theraphan. 2013. A view on Proto-Karen phonology and lexicon. (unpublished ms. contributed to STEDT). Accessed via STEDT database https://stedt.berkeley.edu/search/ on 2023-03-15.

Malla, Kamal P. 1990. The earliest dated document in Newari: The palmleaf from Ukū Bāhāh NS 235/AD 1114. Kailash 16. 15–25.

Massara, Guido P., Tiziana Di Matteo & Tomaso Aste. 2017. Network filtering for big data: Triangulated maximally filtered graph. Journal of Complex Networks 5(2). 161–178. https://doi.org/10.48550/arXiv.1505.02445.

Matisoff, James A. 2003. The Handbook of proto-Tibeto-Burman: System and Philosophy of Sino-Tibetan reconstruction. Berkeley: University of California Press.

Matisoff, James A. 2015. The Sino-Tibetan etymological dictionary and Thesaurus project. Berkeley: Univ California.

Michailovsky, Boyd. 1989. Bahing. (unpublished ms. contributed to STEDT). Accessed via STEDT database https://stedt.berkeley.edu/search/ on 2023-03-15.

Mu, Yuzhang & Hongkai Sun. 2012. Lisuyu Fangyan yanjiu [Lisu dialect research]. Beijing: The Ethnic Publishing House.

Post, Mark W. & Jackson T.-S. Sun. 2017. Tani languages. In Graham Thurgood & Randy J. LaPolla (eds.), The Sino-Tibetan languages, 322–337. London & New York: Routledge.

Post, Mark W. 2022. Classifiers in a language with articles: Recent evolution of a typologically unusual Asian classifier system in the Tani languages of northeast India. Asian Languages and Linguistics 3(2). 239–267. https://doi.org/10.1075/alal.22012.pos.

Qiu, Mengyang, Nichol Castro & Brendan T. Johns. 2021. Structural comparisons of noun and verb networks in the mental lexicon. In Proceedings of the 43rd annual meeting of the Cognitive Science Society, 1649–1655. Cognitivesciencesociety.org.

Qumutiexi. 2010. Yiyu Yinuohua yanjiu [A Study on the Yinuo dialect of Yi]. Beijing: The Ethnic Publishing House.

R Core Team. 2023. R: A language and environment for statistical computing [Computer software manual]. Vienna, Austria. Available at: https://www.R-project.org/.

Rzymski, Christoph, Tiago Tresoldi, Simon J. Greenhill, Mei-Shin Wu, Nathanael E. Schweikhard, Maria Koptjevskaja-Tamm, Volker Gast, Timotheus A. Bodt, Abbie Hantgan, Gereon A. Kaiping, Sophie Chang, Yunfan Lai, Natalia Morozova, Heini Arjava, Nataliia Hübler, Ezequiel Koile, Steve Pepper, Mariann Proos, Briana Van Epps, Ingrid Blanco, Carolin Hundt, Sergei Monakhov, Kristina Pianykh, Sallona Ramesh, Russell D. Gray, Robert Forkel & Johann-Mattis List. 2020. The Database of Cross-Linguistic Colexifications, reproducible analysis of cross-linguistic polysemies. Scientific Data 7. 13. https://doi.org/10.1038/s41597-019-0341-x.

Sagart, Laurent, Guillaume Jacques, Yunfan Lai, Robin J. Ryder, Valentin Thouzeau, Simon J. Greenhill & Johann-Mattis List. 2019. Dated language phylogenies shed light on the ancestry of Sino-Tibetan. PNAS 116(21). 10317–10322. https://doi.org/10.1073/pnas.1817972116.

Seifart, Frank. 2010. Nominal classification. Language and Linguistics Compass 4(8). 719–736. https://doi.org/10.1111/j.1749-818x.2010.00194.x.

Senft, Gunter. 1996. Classificatory Particles in Kilivila. New York: Oxford University Press. https://doi.org/10.1093/oso/9780195092110.001.0001.

Shirai, Satoko. 2022. Classifiers in nDrapa: Definition and categorization. Gengo Kenkyu 166.

Siew, Cynthia S. Q., Dirk U. Wulff, Nicole M. Beckage & Yoed N. Kenett. 2019. Cognitive network science: A review of research on cognition through the lens of network representations, processes, and dynamics. Complexity 2019. https://doi.org/10.1155/2019/2108423.

Siew, Cynthia S. Q. 2020. Applications of network science to education research: Quantifying knowledge and the development of expertise through network analysis. Education Sciences 10(4). 101. https://doi.org/10.3390/educsci10040101.

Siew, Cynthia S. Q. & Anutra Guru. 2023. Investigating the network structure of domain-specific knowledge using the semantic fluency task. Memory & Cognition 51(3). 623–646. https://doi.org/10.3758/s13421-022-01314-1.

Steyvers, Mark & Joshua B. Tenenbaum. 2005. The large-scale structure of semantic networks: Statistical analyses and a model of semantic growth. Cognitive Science 29(1). 41–78. https://doi.org/10.1207/s15516709cog2901_3.

Sun, Hongkai. 1991. Zangmianyu yuyin he cihui [Tibeto-Burman phonology and lexicon]. Beijing: Chinese Social Sciences Press.

Sun, Jackson T.-S. 1993. A historical-comparative study of the Tani (Mirish) branch in Tibeto-Burman. Berkeley: University of California Ph.D. Dissertation.

VanBik, Kenneth. 2009. Proto-Kuki-Chin: A reconstructed ancestor of the Kuki-Chin languages. (STEDT Monograph Series #8). Berkeley, CA: STEDT.

Vittrant, Alice & Marc Allassonnière-Tang. 2021. Classifiers in Southeast Asian languages. In Paul Sidwell & Mathias Jenny (eds.), The languages and linguistics of Mainland Southeast Asia: A comprehensive guide, 733–772. Berlin: De Gruyter Mouton. https://doi.org/10.1515/9783110558142-031.

Wang, Feng. 2012. Language Contact and Language Comparison: The Case of Bai. Beijing: Commercial Press.

Weidert, Alfons K. 1984. The classifier construction of Newari and its Southeast Asian background. Kailash 11(3–4). 185–210.

Wood, Daniel C. 2008. An initial reconstruction of Proto-Boro-Garo. Eugene, USA: University of Oregon Master's thesis.

Wulff, Dirk U., Simon De Deyne, Samuel Aeschbach & Rui Mata. 2022. Using network science to understand the aging lexicon: Linking individuals’ experience, semantic networks, and cognitive performance. Topics in Cognitive Science 14(1). 93–110. https://doi.org/10.1111/tops.12586.

Xu, Xijian. 1987. Classifiers in Jingpo. Minzu Yuwen 5. 27–35.

Xu, Xijian. 1989. On the origin and development of classifiers in Jingpo, translated by Randy J. LaPolla. Linguistics of the Tibeto-Burman Area 12(2). 15–23. https://doi.org/10.32655/ltba.12.2.02.

Zhang, Cheng. 2012. The relation between the development of general classifiers and the establishment of the category of numeral-classifiers in Chinese. Journal of Chinese Linguistics 40(2). 307–321.

Zhang, Jun. 2016. Lisuyu mɑ̠33 de duogongnengxing yu yufahua [The polyfunctionality and grammaticalization of mɑ̠33 in Lisu]. Minzu Yuwen 4. 26–37.

Received: 2024-02-25

Accepted: 2025-02-03

Published Online: 2025-05-23

Published in Print: 2025-06-26

This work is licensed under the Creative Commons Attribution 4.0 International License.


https://doi.org/10.1515/psicl-2024-0024
