Word formation patterns in the perception domain: a typological study of cross-modal semantic associations

Elisabeth Norcliffe; Asifa Majid

doi:10.1515/lingty-2023-0038

Article Open Access


Published/Copyright: August 5, 2024


From the journal Linguistic Typology Volume 28 Issue 3

Abstract

The lexicalization of perception verbs has been of widespread interest as a route into understanding the relationship between language and cognition. A recent study finds global biases in colexification patterns, suggesting recurrent conceptual associations between sensory meanings across languages. In this paper, drawing on a balanced sample of 100 languages, we examine cross-modal semantic associations in word formation. Confirming earlier proposals, we find derived verbs are lower on a proposed Sense Modality Hierarchy (sight > hearing > touch > taste, smell) than the source perception verbs on which they are based. We propose these findings can be explained by verb frequency asymmetries and the general tendency for sources of derivations to be more frequent than their targets. Moreover, it appears certain pairings (e.g., hear–smell) are recurrently associated via word formation, but others are typologically rare. Intriguingly, the typological patterning partially diverges from the patterning reported for colexification in the same domain. We suggest that while colexification is driven by conceptual resemblance between sensory meanings, cross-modal word formations tend to arise from grammaticalization processes of lexical specification, where additional material (e.g., a sensory noun) is collocated to a polysemous verb in order to disambiguate it in context. Together, these processes can account for the typological similarities and divergences between the two phenomena. More generally, this study highlights the need to consider conceptual, communicative and diachronic factors together in the mapping between words and meanings.

Keywords: perception verbs; colexification; lexical typology; sensory language; word formation

1 Introduction

Meanings that are grouped together under a single word (‘colexified’, following François 2008) or linked via word formation (‘partially colexified’, following List 2023) tend to be conceptually related (excluding homonymy: Brugman 1988; Georgakopoulos and Polis 2022; Györi 2006; Traugott and Dasher 2002; Urban 2012, among others). These conceptual associations may be culture-bound, drawing on foundations unique to a particular language community or area (Evans and Wilkins 2000; Koptjevskaja-Tamm et al. 2022; Wilkins 1997). Alternatively, they may be observed across unrelated languages and therefore suggestive of pan-human cognitive underpinnings (Xu et al. 2020; Youn et al. 2016). A major goal in lexical typology is to delimit the variability in how languages distribute meanings across words and identify the motivations underlying recurrent patterns.

Recent years have witnessed an increase in typological studies of (strict) colexification^[1] (see, e.g., Di Natale et al. 2021; Georgakopoulos et al. 2022; Jackson et al. 2019; Norcliffe and Majid 2024; Tjuka 2024; Xu et al. 2020; Youn et al. 2016), driven in part by the compilation of large-scale databases such as the Database of Cross-Linguistic Colexifications (CLICS; Rzymski et al. 2020) and Lexibank (List et al. 2022). By contrast, the semantically oriented study of word formation is still in its infancy, probably in part because equivalent typological databases of morphologically glossed vocabulary are difficult to create. Notable exceptions include Urban (2012) and Schapper (2021), who study semantic associations in word formation and in multi-word expressions in the nominal domain from typological and areal perspectives.

In this study, we examine word formation in the verbal domain, specifically in verbs of perception (i.e., verbs denoting the concepts of seeing, hearing, smelling, etc.). Building on earlier seminal studies of perception verb typology (Evans and Wilkins 2000; Viberg 1984), Norcliffe and Majid (2024) compiled a geographically and genetically balanced sample of perception verb lexicons in 100 languages and examined patterns of colexification between sensory meanings. Certain pairs of sense modalities – in particular, hearing–touch, hearing–smell, and touch–taste – were found to be recurrently co-expressed in the same perception verb form across languages and geographical areas, while vision showed a robust tendency to remain lexically differentiated from all other senses. Here, we draw on the same language sample to examine patterns of word formation. As in Norcliffe and Majid’s (2024) study, our focus is on semantic associations between sense modality meanings; henceforth cross-modal associations. The phenomenon is exemplified in the two languages below, in which a source perception verb expressing one (or more) sense modality meaning combines with another element to express a different sense modality meaning:

(1)

look > taste
Cofán (Isolate) [cofa1242]^[2]
an + cañe (‘to eat’ + ‘to look’)
→ ancañe ‘to taste’	(Borman 1976: 17)

(2)

hear, listen > smell
Kaluli (Bosavi) [kalu1248]
mun + dabuma (‘odor’ + ‘to hear, listen’)
→ mun dabuma ‘smell an odor, sniff a smell’	(Schieffelin and Feld 1988: 25, 102)

We ask whether there are asymmetries in how sensory meanings are linked in word formation and to what extent the typological patterning mirrors that observed for colexification. To do this, we focus on two complementary aspects of cross-modal word formation: (i) the directionality of the semantic shifts that accompany word formation and (ii) the cross-modal associations themselves.

The directionality question can be stated as follows: Given a source (i.e., originating) sense modality meaning and target (i.e., extended or derived) sense modality meaning, is there a typological bias for some sense modalities to be sources and others to be targets? Early research on this topic suggests there is. In a classic typological study based on 53 languages, Viberg (1984) proposed that the perception verb domain is shaped by a hierarchy of senses (3), grounded in the dominance of vision, and to some extent audition, in human perception and neuroanatomy.

(3)

sight > hearing > touch > {taste, smell}

Viberg argued that the hierarchy is reflected in a number of asymmetries in sensory language, including asymmetries in the relative token frequency of different perception verbs, their morphological complexity and diachronic stability. Most relevant, Viberg also proposed the hierarchy constrained the directionality of semantic extensions within the perception domain. His evidence came from patterns of word formation as well as polysemy (colexification). With respect to word formation, Viberg found that a perception verb denoting a ‘higher’ sense modality (e.g., hearing) might extend its meaning – e.g., by combining with a noun or verb – to cover a ‘lower’ sensory meaning (e.g., touch or smell), but not the reverse (i.e., from smell to hearing or hearing to sight). For polysemy proper, Viberg (1984: 136) argued that it was generally possible to identify a prototypical as well as a secondary or extended meaning of the polysemous verb, based on various diagnostics, for example, how it is translated out of context. In such cases, Viberg claimed the prototypical meaning was always higher on the hierarchy than the secondary or extended meaning. Detailed research on Australian languages by Evans and Wilkins (2000) supported Viberg’s unidirectionality proposal. Evans and Wilkins examined patterns of word formation (“indirect semantic extensions” in their terminology) as well as diachronic extensions that result in synchronic polysemy (“direct semantic extensions”) and found meaning extensions in both conformed to Viberg’s hierarchical proposal.

While unidirectionality in the perception domain is sometimes held up as a semantic universal (Evans and Wilkins 2000), the typological robustness of the claim is difficult to assess based on past work: as Viberg was careful to acknowledge, his language sample was highly skewed towards European languages. Moreover, neither Viberg nor Evans and Wilkins’ study included information about the cross-linguistic frequencies of different extension types. In subsequent years, a scattering of counterexamples to Viberg’s unidirectionality proposal have been identified (Brenzinger and Fehn 2013; Ibarretxe-Antuñano 1999; Maslova 2004; Nakagawa 2012), casting some doubt on the generalizability of the original finding. To address this, we quantitatively test the unidirectionality claim for word formation, using a set of complex perception verbs drawn from Norcliffe and Majid’s (2024) balanced sample of languages.

In the second part of the paper, we examine the cross-modal associations themselves and ask whether there are cross-linguistic regularities with respect to which sensory concepts are linked via processes of word formation. Previous work suggests that not all combinations of sense modality meanings are equally likely to be associated in complex words. Viberg (1984) found that although semantic extensions proceeded unidirectionally from higher to lower sense modalities (3), they could skip modalities, with some cross-modal combinations unattested in extensions. He captured the attested patterns of semantic extensions in a refined ‘hierarchy’ (Figure 1).

Figure 1:

Patterns of semantic extensions in Viberg (1984).

Viberg suggested the observed extensions reflected ‘natural’ semantic relations between certain senses. The bifurcation between the contact senses (touch, taste) and the distal senses (hearing, smell) is noteworthy. Viberg also suggested there was a close relationship between smell and taste, reflecting the fact that smell is an integral part of flavor perception. Evans and Wilkins’ (2000) study of Australian languages revealed a similar pattern of associations, including the same bifurcation between touch–taste and hearing–smell and a direct link between taste and smell. Given that sensory meanings belong to the same ontological domain and are therefore all semantically close, it is not altogether surprising that languages should frequently link them in processes of word formation. More striking is the possibility raised by Viberg’s study that some pairs of sensory meanings are regularly conceptualized across cultures as more similar than others.

Because neither Viberg nor Evans and Wilkins presented frequency information, it is difficult to assess both the typological robustness of the generalizations captured in Figure 1, and whether there might be substantial differences in the cross-linguistic frequency of the different attested pairings. So, although Figure 1 shows vision linked to all modalities except smell, in later work, Viberg (2001: 1297) also claimed that vision has a strong tendency across languages not to colexify with non-visual modalities in perception verbs (see Norcliffe and Majid 2024 for discussion).^[3] Moreover, neither Viberg’s nor Evans and Wilkins’ study was based exclusively on evidence from word formation, rendering it difficult to isolate the patterns of semantic associations arising from this type of lexicalization process alone; both relied interchangeably on cases of word formation as well as on polysemy proper (strict colexification). Evans and Wilkins (2000: 553) in fact argue explicitly that both should be considered together because the same semantic association found in one language based on full identity between forms may be encountered in another based on partial identity (see also Vanhove 2008).

Although colexification and word formation can reflect the same kinds of semantic associations (Urban 2012), it is, however, not necessarily the case that they always do. At the most extreme, word formation processes can express semantic relationships that are, for reasons of interpretability, unlikely to arise via colexification. François (2008), for instance, offers as an example the derivational relationship between Latin spīro ‘breathe’ and ex-spīro ‘die’. Beyond these kinds of antonymic examples, it remains an empirical question to what extent colexification and word formation pattern in the same way with respect to the concepts they link.

For the perception verb domain, recent typological work has helped to clarify one half of this question, by focusing exclusively on typological patterns of colexification. Georgakopoulos et al. (2022) present a quantitative study of colexification in the perception and cognitive domains, based on data from CLICS², the second installment of the Database of Cross-Linguistic Colexifications. Georgakopoulos and colleagues found a tendency for the non-visual senses to colexify with each other, while vision, by contrast, was only linked to the non-visual senses via the cognitive domain. This latter finding is consistent with Viberg’s (2001) claim that vision tends to be kept apart from the other sense modalities in words. As Georgakopoulos and colleagues discuss, however, sampling imbalances in CLICS² made it difficult to reach firm typological conclusions: for colexifications in the perception domain, CLICS² was “massively overrepresented” for the Eurasian macro-area, “while Australia and North America together amount to less than 5 % of the colexification patterns” (2022: 461).

Norcliffe and Majid (2024) addressed these empirical limitations in a study of colexification based on a genetically and geographically balanced sample of 100 languages and replicated with an independent sample of 271 languages drawn from CLICS³. Confirming earlier reports, the study found that vision exhibited a strong bias to be lexically differentiated from the other senses across languages – that is, to not colexify with other sensory meanings. We refer to this as vision’s unimodal bias. For the remaining senses, not all cross-modal pairings were equally represented. Some combinations, in particular hearing–touch, as well as hearing–smell and touch–taste, were found recurrently across genetically and geographically unrelated languages. The frequent grouping of hearing–smell and touch–taste recapitulates Viberg’s original observation of a split between contact and non-contact senses. Critically, however, there was no evidence for a “close relationship” between taste and smell, contrary to Viberg (2001: 1300); these two senses were rarely co-expressed in perception verbs in a balanced sample of languages.^[4]

The negative bias against the colexification of smell and taste in perception verbs is unexpected: these senses are known to be closely associated in other areas of sensory vocabulary, a fact which has been ascribed to their close alignment at perceptual and neural levels (e.g., Winter 2019). Sensory norming studies, in which speakers are asked to rate how much a given property or object is experienced by each of the senses, also reveal the conceptual interdependency of smell and taste across languages (e.g., Chen et al. 2019; Lynott et al. 2020; Morucci et al. 2019; Speed and Majid 2017). As discussed in Norcliffe and Majid (2024), it is possible that their lack of association in perception verbs has a communicative, rather than conceptual basis. Experimental and corpus research shows that colexification is less common in contexts where confusability is particularly at stake (Brochhagen and Boleda 2022; Karjus et al. 2021). Due to their close alignment in other lexical categories, it is plausible that linguistic context cannot reliably disambiguate between an intended smell or taste interpretation of a perception verb that colexifies both meanings. So smell–taste colexifications in verbs may be inhibited for this reason. Similarly, vision’s unimodal bias may also have a basis in communicative need (Norcliffe and Majid 2024). Sensory norming studies show that words that strongly evoke one or more non-visual modalities also tend to be associated with vision (Chen et al. 2019; Lynott et al. 2020; Morucci et al. 2019; Speed and Majid 2017). This suggests that linguistic context cannot reliably disambiguate between visual and non-visual readings of ambiguous perception verbs.

If the colexification of certain sense modalities is inhibited for communicative reasons, this raises the interesting possibility that colexification and word formation may, despite previous assumptions to the contrary, show divergent typological patterning in the perception domain. This is because in word formation, the meanings of the source and target words are formally differentiated. Word formation – unlike colexification – does not give rise to ambiguity. Given their assumed close conceptual resemblance, we might expect smell and taste to associate via word formation more frequently than colexification. We might also expect that cross-modal word formations involving vision will not be inhibited to the extent observed for colexification. We address these possibilities here.

The remainder of this paper is structured as follows. In Section 2 we briefly describe the Perception Verb Database and present some initial typological generalizations regarding cross-modal word formation. Next, in Section 3 we address our first research question (directionality asymmetries), and in Section 4 the second research question (cross-modal associations). Finally, in Section 5, we consider the diachronic origins of complex perception expressions, which may provide some insight into the pattern of results observed.

2 The Perception Verb Database (PVDB) and some initial typological generalizations

In order to make robust typological generalizations, we must consider how language sampling methods influence outcomes. Previous studies of perception verbs have either relied on convenience samples (e.g., Georgakopoulos et al. 2022; Viberg 1984) or focused on one language area in depth (e.g., Evans and Wilkins 2000). Such studies, although informative in their own right, have limitations. In this study we use the Perception Verb Database (PVDB; Norcliffe and Majid 2024), which was manually created from high quality dictionaries and grammars. The language sampling followed Miestamo et al.’s (2016) method, in which each of six macro-areas – based on Hammarström and Donohue’s (2014) definitions – was sampled in proportion to its genealogical diversity. The procedure is described in detail in Norcliffe and Majid (2024) and the database is available online.^[5]

The PVDB contains perception verbs with non-controlled experiencer meanings (translational equivalents of ‘see’, ‘hear’ etc.) as well as controlled activity meanings (translational equivalents of ‘look’, ‘listen’, etc.). Only perceiver-oriented verbs are represented (i.e., where the experiencer is linked to the subject role in an active clause). Stimulus-oriented meanings (in which the perceptual stimulus is linked to the subject role, as in smell in the pancakes smell delicious) are not included (see Norcliffe and Majid 2024 for discussion). Lexical entries in the PVDB are coded for whether they are formally simple or complex.

A form is considered ‘complex’ if, semantically, it was listed in the source as the standard expression for designating perception via one of the five sense modalities and, formally, if it went beyond a simple, non-decomposable verb stem. Note that according to these criteria, our set of complex forms includes both complex lexical constituents (e.g., morphologically derived verb forms and compounds) as well as multi-word expressions (see Koptjevskaja-Tamm and Veselinova 2020 for a detailed discussion of simple and complex word structures in the lexicon). While multi-word expressions go beyond the level of the “word” (e.g., Haspelmath 2023), we consider them together with word formation proper for two reasons.

The first is a functional one: both morphologically complex words and complex expressions are used to name perceptual concepts (see Masini et al. 2022 for a similar argument with respect to “binominal compounds”). For example, in Ulwa (Keram) [yaul1241], the concept ‘to smell’ is designated with a verb + noun compound nambït-wana (‘odor’ + ‘sense’). The compound form functions as a single, polymorphemic lexical item according to language specific criteria, for example permitting object markers (preceding the entire complex word) and TAM suffixes (following the entire word), just like simple verbs in the language (Barlow 2018: 219). In Italian, by contrast, the concept ‘to smell’ is conventionally named with a multi-word expression in which a general perception verb sentire combines with an entire noun phrase l’odore di X ‘the odor of X’ (Collins 2024).

The second reason for considering complex words and multi-word expressions together is a practical one: in many cases we are unable to determine the correct structural analysis of a given complex form (according to the appropriate language-specific definitions of wordhood), due to lack of information in the source dictionary or grammar.

In the remainder of this paper, we use the term ‘complex verb’ as a cover term for both complex lexical items and multi-word expressions, and the term ‘word formation’ to refer to the process of creating such forms. We took the set of cross-modal complex verbs in the PVDB, so defined, and annotated them for their constituent parts (see Section 2.2).

2.1 Distribution of colexification and word formation strategies across languages

To obtain an overview, we classified the 100 languages in the PVDB into four types (see Figure 2):

Languages that lexicalize each sense modality meaning with a unique unimodal verb
Languages with at least one instance of cross-modal colexification, but no cross-modal word formation
Languages with at least one instance of cross-modal word formation, but no cross-modal colexification
Languages with at least one instance of both cross-modal colexification and cross-modal word formation.

Figure 2:

Perception verb paradigm types. (A) Examples of paradigm types according to presence versus absence of cross-modal colexification and word formation. (B) Frequencies of paradigm types in the PVDB.

For 34 languages there is a one-to-one mapping between sense modality meanings and verb forms. While this is evidently not a typologically rare strategy, it is more common for languages to have cross-modal colexification or word formation of some kind. Thirty-eight languages have at least one instance of cross-modal colexification, 13 have cross-modal word formation, and 16 have both strategies. Cross-modal word formations are found in all six macro-areas (see Table 1).
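The four-way classification above depends on only two yes/no properties per language, which the following minimal sketch makes explicit. The function name, the type labels and the toy records are illustrative assumptions and are not taken from the PVDB.

```python
from collections import Counter

def paradigm_type(has_colexification: bool, has_word_formation: bool) -> str:
    """Assign one of the four paradigm types from two binary properties."""
    if has_colexification and has_word_formation:
        return "both"
    if has_colexification:
        return "colexification only"
    if has_word_formation:
        return "word formation only"
    return "unimodal verbs only"

# Hypothetical toy records (language -> (colexification?, word formation?)).
# These are placeholders for illustration, not PVDB entries.
toy_languages = {
    "lang_a": (False, False),
    "lang_b": (True, False),
    "lang_c": (False, True),
    "lang_d": (True, True),
    "lang_e": (True, False),
}

# Tally how many toy languages fall into each paradigm type.
counts = Counter(paradigm_type(c, w) for c, w in toy_languages.values())
```

Because the two properties are independent, the four types are mutually exclusive and exhaustive, so the per-type counts should sum to the number of languages classified.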

Table 1:

Languages with cross-modal word formation in the PVDB.

Languages (families)	Macro-area	Number of languages
Bokobaru (Mande); East Taa (Tuu); Luo (Nilotic)	Africa	3
Kayardild (Tangkic); Wardaman (Yangmanic)	Australia	2
Abkhazian (Abkhaz-Adyge); Italian (Indo-European); Korean (Koreanic)	Eurasia	4
Francisco León Zoque (Mixe-Zoque); Koasati (Muskogean); Mayangna (Misumalpan); Nisga’a (Tsimshian)	North America	4
Ambulas (Ndu); Duna (Isolate); Grass Koiari (Koiarian); Kalam (Nuclear Trans New Guinea); Kaluli (Bosavi); Konai (East Strickland); Mende (Sepik); Momu-Fas (Baibai-Fas); Rumu (Turama-Kikori); Ulwa (Keram); Yareba (Yareban)	Papunesia	11
Cofán (Isolate); Guambiano (Barbacoan); Kotiria (Tucanoan); Qawasqar (Kawesqar); Warao (Isolate)	South America	5

Together there are 44 unique cross-modal complex perception verbs in the PVDB. This is greater than the total number of languages with cross-modal complex perception verbs (n = 28), because languages sometimes have multiple complex forms (e.g., Luo in Figure 2A). There are 62 unique multi-sense verbs (i.e., verbs that colexify two or more modalities), differing in the number (range = 2–4) and type of sensory meanings encoded (see Norcliffe and Majid 2024).

2.2 Formal types of cross-modal complex verbs in the PVDB

Complex perception verbs are known to constitute a formally heterogenous class, both within and across languages (Evans and Wilkins 2000; Viberg 1984). In the PVDB, we identify three major formal patterns of realization. First, verb + verb compounds, which we define broadly as any multi-verb construction composed of two (or occasionally more) verbal constituents, one of which is a perception verb. This includes complex expressions that, depending on the source and the language, fall under the labels of serial verb constructions, verb + verb compounds, light verb + verb constructions and coverb + verb constructions; see example (4).

(4)

Kotiria (Tucanoan) [guan1269]
chʉ + ñʉna (‘to eat’ + ‘to see, look’)
→ chʉ ñʉna ‘to taste’	(Waltz 2007: 74)

Second, complex perception verbs appear as verb + noun expressions, in which a complex form is composed of a source perception verb, together with an incorporated or collocated noun or noun phrase. In such cases the noun tends to denote either the canonical body part associated with the denoted perceptual modality (e.g., ‘ear’; example (5)) or a stimulus noun (e.g., ‘flavor’; example (6)).

(5)

Ulwa (Keram)
kïkal + wana (‘ear’ + ‘to feel, taste, sense, think’)
→ kïkal wana ‘to hear’	(Barlow 2018: 252, 254)

(6)

Warao (Isolate) [wara1303]
a riaba + mí (‘sweetness, flavor’ + ‘to see/look’)
→ a riaba mí ‘to try/taste the flavor’	(Barral 1979: 30)

Finally, cross-modal complex verbs may be morphologically derived. Morphological derivations occur less frequently in the PVDB and the affixes involved are functionally diverse, and not always identifiable (examples (7) and (8)).

(7)

Nisga’a (Tsimshian) [nisg1240]
s + báḵ (non-productive stem augment + ‘to feel, try’)
→ s-báḵ ‘to taste’	(Tarpent 1989: 157)

(8)

Ambulas (Ndu) [ambu1247]
vé + knwu (‘to see’ + direction suffix meaning ‘do something tentatively, as a trial’)
→ vé-knwu ‘to hear, listen, feel (pain), smell, heed, obey, think, obey, worry about’	(Kundama et al. 2006: 81)

Overall V + V expressions are most frequent (n = 22), followed by V + N expressions (n = 15) and then morphologically derived forms (n = 7).

2.3 Semantic shift types

Aside from their formal realization, cross-modal complex verbs also vary with respect to the kind of semantic extension or shift that accompanies the derivation. Previous studies (Evans and Wilkins 2000; Viberg 1984) have largely characterized cross-modal semantic shifts as proceeding from a single source sense modality A to a single target sense modality B (e.g., hear > smell), but the PVDB shows that this is only one of a larger set of possible cross-modal shift types. The fact that there are multiple shift types is important, as it bears on our directionality analyses (Section 3) as well as on the possible diachronic development of cross-modal derived perception verbs (Section 5).

The simplest extension type from A → B does frequently occur (see Table 2, simple), but we also encounter three other shift types. In the second type (multi-sense source), the source verb is polysemous (i.e., it is a multi-sense perception verb) and the meaning of the target verb is not conventionally included among the sense modality meanings of the source verb. In the third type (narrowing), the source verb is also polysemous, but the sensory meaning of the target verb is included within the source verb. Strictly speaking, this is not semantic extension, but semantic narrowing of the source verb’s meaning. Note that in such cases both the source and target verb can be used to express the target meaning (e.g., in East Taa, both tá̰ã and the complex form |núm tá̰ã designate the meaning ‘smell’). Finally, in the fourth shift type (multi-sense target), the target verb colexifies multiple sensory meanings.

Table 2:

Cross-modal semantic shift types.^a

Shift type	Language	Semantic shift	Example
simple A → B	Momu-Fas (Baibai-Fas) [fass1245]	sight → taste	kiy + on ‘consume’ + ‘see’ → kiyon ‘taste’
multi-sense source A + B(+) → C	Ulwa (Keram) [yaul1241]	touch + taste → hearing	kïkal + wana ‘ear’ + ‘feel, taste, sense, think’ → kïkal wana ‘hear’
narrowing A + B(+) → B	East Taa (Tuu) [huaa1248]	hearing + touch + taste + smell → smell	|núm + tá̰ã ‘smell (itr.)’ + ‘hear, feel, taste, smell’ → |núm tá̰ã ‘smell’
multi-sense target A(+) → B + C(+)	Ambulas (Ndu) [ambu1247]	sight → hearing + touch + smell	vé + knwu ‘see’ + ‘direction suffix’ → véknwu ‘hear, listen, feel (pain), smell, perceive’

^aTable 2 language data come from: Momu-Fas: Honeyman (2017: 544); Ulwa: Barlow (2018: 252, 254); East Taa: Traill (1994: 154, 272); Ambulas: Kundama et al. (2006: 81).
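The four shift types in Table 2 can be told apart from the sets of sense modalities encoded by the source and target verbs. The sketch below is one possible operationalization, not the annotation procedure used for the PVDB; in particular, checking for a multi-sense target before the other types is our assumption, as are the function name and labels.

```python
def shift_type(source: set, target: set) -> str:
    """Classify a cross-modal shift from source/target modality sets."""
    if len(target) > 1:
        # A(+) -> B + C(+): the target verb colexifies several modalities.
        return "multi-sense target"
    if len(source) == 1:
        # A -> B: single source modality, single target modality.
        return "simple"
    if target <= source:
        # A + B(+) -> B: the target meaning is already part of the source.
        return "narrowing"
    # A + B(+) -> C: polysemous source, target outside the source meanings.
    return "multi-sense source"

# The four examples from Table 2, encoded as modality sets.
examples = {
    "Momu-Fas": ({"sight"}, {"taste"}),
    "Ulwa": ({"touch", "taste"}, {"hearing"}),
    "East Taa": ({"hearing", "touch", "taste", "smell"}, {"smell"}),
    "Ambulas": ({"sight"}, {"hearing", "touch", "smell"}),
}
labels = {lang: shift_type(src, tgt) for lang, (src, tgt) in examples.items()}
```

Applied to the Table 2 examples, the function returns the shift type given in the first column of each row.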

3 Directionality biases: are cross-modal semantic shifts unidirectional?

Viberg claimed cross-modal semantic extensions are constrained by the proposed sense modality hierarchy, with the shift always proceeding from a higher to lower sense modality, for example sight > touch or hearing > smell. In this section we revisit this claim and test it quantitatively against the cross-modal complex verbs in the PVDB. We use two complementary approaches. We first determine whether, in accordance with the unidirectionality hypothesis, sources are more frequently associated with ‘higher’ than ‘lower’ sense modalities, and whether targets are more frequently associated with lower than higher sense modalities. To do this, we aggregate over sources and over targets of cross-modal semantic shifts, and use Bayesian logistic regression models to establish whether all five sense modalities exhibit the predicted asymmetry across languages. Second, we inspect the within-language source-target pairings and calculate how often the directionality of the shifts is consistent with the proposed hierarchy.
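The second approach, checking each source-target pairing against the hierarchy in (3), can be sketched as follows for simple A → B shifts. The rank values, function names and the third (counterexample) pair are illustrative assumptions; how multi-sense sources and targets should be scored is deliberately left open here.

```python
# Ranks follow the hierarchy in (3); taste and smell share the lowest rank.
RANK = {"sight": 0, "hearing": 1, "touch": 2, "taste": 3, "smell": 3}

def direction(source: str, target: str) -> str:
    """Return the direction of a simple shift relative to the hierarchy."""
    if RANK[source] < RANK[target]:
        return "downward"   # higher -> lower: consistent with the hierarchy
    if RANK[source] > RANK[target]:
        return "upward"     # lower -> higher: a counterexample
    return "same rank"      # taste <-> smell: hierarchy makes no prediction

def proportion_consistent(pairs) -> float:
    """Share of (source, target) pairs that run down the hierarchy."""
    directions = [direction(s, t) for s, t in pairs]
    return directions.count("downward") / len(directions)

# Examples (1) and (2) from the Introduction, plus one hypothetical
# counterexample added purely for illustration.
pairs = [("sight", "taste"), ("hearing", "smell"), ("smell", "hearing")]
share = proportion_consistent(pairs)
```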

3.1 Source modalities versus target modalities

We extracted the set of 28 languages from the PVDB whose perception verb lexicons included one or more instances of cross-modal complex verbs and created a predictor variable Sense Modality (with five levels: sight, hearing, touch, taste and smell), which collapsed over agentive versus experiencer meanings of perception verbs. For each sense modality meaning in each language, we coded whether that modality was associated with a source or target perception verb. Specifically, if in a language a sense modality was encoded in at least one source verb, Source was coded as 1 for that modality, and 0 otherwise. The same procedure was followed for Target coding. Thus, we analyze separately whether there are cross-linguistic frequency differences between sense modalities with respect to whether they are encoded in source or target verbs.
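The coding scheme just described yields one row per language and sense modality, with binary Source and Target values. A minimal sketch is given below; the data layout and function name are hypothetical, and counting every modality of a polysemous source verb as a source is our assumption rather than a documented coding decision.

```python
MODALITIES = ["sight", "hearing", "touch", "taste", "smell"]

def code_language(shifts):
    """Build Source/Target rows for one language.

    shifts: list of (source_modalities, target_modalities) pairs of sets,
    one pair per cross-modal complex verb in the language.
    """
    # Pool the modalities encoded in any source verb and in any target verb.
    sources = set().union(*(src for src, _ in shifts))
    targets = set().union(*(tgt for _, tgt in shifts))
    return [
        {"modality": m, "source": int(m in sources), "target": int(m in targets)}
        for m in MODALITIES
    ]

# The Ulwa example from Table 2: 'feel, taste, ...' + 'ear' -> 'hear'.
ulwa_rows = code_language([({"touch", "taste"}, {"hearing"})])
```

For this single shift, touch and taste are coded 1 for Source, hearing is coded 1 for Target, and all remaining cells are 0.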

To do this, we used Bayesian Logistic Mixed Effects Regression. The analysis was conducted with R 4.2.1 (R Core Team 2022). We used the package ‘tidyverse’ (version 1.3.2; Wickham et al. 2019) for data processing and visualization and ‘brms’ (version 2.17.0; Bürkner 2017, 2018, 2021) for Bayesian mixed effects regression models.^[6]

We fit separate models for sources (Model 1) and targets (Model 2). In both models the sole predictor was Sense Modality (sight, hearing, touch, taste, smell) which was Helmert-coded, so each level of the categorical predictor variable is compared to the mean of subsequent levels of that variable. This allowed us to compare (1) smell versus taste, (2) touch versus smell and taste, (3) hearing versus touch, smell and taste, and (4) sight versus hearing, touch, smell and taste. With Helmert coding, we can determine whether sense modality meanings that are increasingly higher on the proposed modality hierarchy are more likely to be encoded in source verbs, and those increasingly lower on the hierarchy encoded in target verbs.
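The four comparisons can be written as rows of weights over the five modalities, where each row sets one level against the mean of the levels below it on the hierarchy. The sketch below builds these comparison weights; it illustrates the logic of the contrasts only, and the sign convention and scaling are assumptions that may differ from the coding matrix actually passed to the model.

```python
from fractions import Fraction

LEVELS = ["sight", "hearing", "touch", "taste", "smell"]

def helmert_comparisons(levels):
    """One row per comparison: level i versus the mean of later levels."""
    n = len(levels)
    rows = []
    for i in range(n - 1):
        rest = n - i - 1  # number of levels below level i
        # Weight 1 on level i, -1/rest on each later level, 0 on earlier ones.
        row = [Fraction(0)] * i + [Fraction(1)] + [Fraction(-1, rest)] * rest
        rows.append(row)
    return rows

comparisons = helmert_comparisons(LEVELS)
# comparisons[0]: sight   vs mean(hearing, touch, taste, smell)
# comparisons[1]: hearing vs mean(touch, taste, smell)
# comparisons[2]: touch   vs mean(taste, smell)
# comparisons[3]: taste   vs smell
```

Each row sums to zero, so every comparison is a difference between one modality and an average of the others rather than a statement about the overall baseline.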

In both models we included a random intercept for Macro-Area, which allowed macro-areas to have different baseline preferences. Fixed slopes had weakly informative priors (normal distribution centered at 0 with a standard deviation of 1: Lemoine 2019; McElreath 2020). We used the default brms priors for the standard deviation of random effects and residual errors, which take only positive values (Vasishth et al. 2018: 150). We report whether the 95 % credible intervals of the posterior distributions for each predictor included zero. Credible intervals not containing zero are interpreted as providing the strongest evidence for the effect of the predictor on the dependent variable. We also report the posterior probability of the effect being above or below zero. Markov chain Monte Carlo (MCMC) sampling was performed with four chains of 4,000 iterations each (including 2,000 warm-up iterations). We set the drift parameter delta to 0.99 to ensure convergence. Visual inspection of the chains and R̂ values of 1 indicated both models converged. Posterior predictive checks showed a good fit to the data for both models.
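For orientation, the model structure described in this section can be summarized in the following form. The notation is ours: y is the binary Source (Model 1) or Target (Model 2) code for language i and modality j, x holds the four Helmert contrast values for modality j, and a(i) is the macro-area of language i. The priors on the intercept and on the standard deviation of the macro-area intercepts are the software defaults and are therefore not written out.

```latex
\begin{aligned}
y_{ij} &\sim \mathrm{Bernoulli}(p_{ij}) \\
\operatorname{logit}(p_{ij}) &= \alpha + \sum_{k=1}^{4} \beta_k \, x_{jk} + u_{a(i)} \\
\beta_k &\sim \mathrm{Normal}(0,\, 1), \qquad k = 1, \dots, 4 \\
u_{a} &\sim \mathrm{Normal}(0,\, \sigma_{\text{area}})
\end{aligned}
```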

3.1.1 Model 1: Sense modalities associated with source verbs

The model revealed a reliable hierarchical effect (Figure 3), consistent with Viberg’s proposal. Source verbs were more likely to be associated with the visual modality than non-visual modalities (log odds = 0.24, SE = 0.09, 95 % credible interval [0.07, 0.42]), with a very high posterior probability of the effect being positive (β₁ > 0 = 99.7). Hearing meanings were also more likely to be encoded in source verbs compared to the lower modalities (log odds = 0.25, SE = 0.11; 95 % credible interval [0.02, 0.48]), again with a very high posterior probability of the effect being positive (β₁ > 0 = 98.7). Touch meanings likewise were encoded more often in source verbs compared to smell and taste (log odds = 0.42, SE = 0.17, 95 % credible interval [0.08, 0.77], β₁ > 0 = 99.1). However, there was no reliable difference between smell and taste (log odds = 0.23, SE = 0.34), with the posterior distributions firmly overlapping with zero (95 % credible interval [−0.41, 0.91]). Model stacking weights (leave-one-out cross validation with Pareto importance sampling) showed that the model including the Modality predictor provided considerably better predictive performance than a corresponding null model (null model weight = 0.268, predictor model weight = 0.732).
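For readers more used to odds ratios than log odds, the reported coefficients can be exponentiated. The sketch below is a convenience conversion only; the estimates and interval bounds are copied from the paragraph above, the labels are shorthand, and the resulting ratios should be read on the scale of the contrast coding used in the model.

```python
import math

# (estimate, lower bound, upper bound) in log odds, as reported for Model 1.
MODEL_1 = {
    "sight vs. lower": (0.24, 0.07, 0.42),
    "hearing vs. lower": (0.25, 0.02, 0.48),
    "touch vs. smell/taste": (0.42, 0.08, 0.77),
    "smell vs. taste": (0.23, -0.41, 0.91),
}


def to_odds_ratio(log_odds):
    """Convert a log-odds value to an odds ratio."""
    return math.exp(log_odds)


for label, (estimate, low, high) in MODEL_1.items():
    print(
        f"{label:22s} OR = {to_odds_ratio(estimate):.2f} "
        f"[{to_odds_ratio(low):.2f}, {to_odds_ratio(high):.2f}]"
    )
```

An interval whose lower bound converts to a value below 1 and whose upper bound converts to a value above 1 corresponds to a log-odds interval that contains zero, as in the smell versus taste comparison.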

Figure 3:

Source verb coding results. (A) The proportion of languages encoding any of the five sense modalities in the source verb of a cross-modal shift. (B) Posterior distribution of logistic regression coefficients predicting whether languages have a source verb or not, for each of the five Helmert-coded Modality levels.

3.1.2 Model 2: Sense modalities associated with target verbs

Once again, consistent with Viberg’s proposal, the model revealed a reliable hierarchical effect for target verb coding (Figure 4). The visual modality was reliably less likely to be coded as a target verb compared to non-visual modalities (log odds = −0.42, SE = 0.16, 95 % credible interval [−0.79, −0.14]), with a very high posterior probability of the effect being negative (β₁ < 0 = 99.9). Similarly, hearing meanings were less likely to be encoded in target verbs compared to touch, taste and smell (log odds = −0.53, SE = 0.17; 95 % credible interval [−0.91, −0.23], β₁ < 0 = 100), and touch meanings were encoded in target verbs reliably less often than smell and taste (log odds = −0.42, SE = 0.17; 95 % credible interval [−0.77, −0.10], β₁ < 0 = 99.5). Again, there was no reliable difference between smell and taste (log odds = −0.29, SE = 0.27; 95 % credible interval [−0.83, 0.25]). Model stacking weights (leave-one-out cross validation with Pareto importance sampling) showed the model including Modality as a predictor provided considerably better predictive performance than a corresponding null model (null model weight = 0.147, predictor model weight = 0.853).

Figure 4:

Target verb coding results. (A) The proportion of languages encoding any of the five sense modalities in the target verb of a cross-modal shift. (B) Posterior distribution of logistic regression coefficients predicting whether languages have a target verb or not, for each of the five Helmert-coded Modality levels.

3.1.3 Summary

The logistic regression models showed robust hierarchical effects consistent with Viberg’s original proposal. Source verbs of cross-modal semantic shifts were reliably more likely to encode modalities increasingly higher on the proposed hierarchy, while target verbs of cross-modal semantic shifts were reliably less likely to encode modalities increasingly higher on the hierarchy.

3.2 Directionality of the different shift types

Having established that sources and targets are, in the aggregate, associated with different ends of the sense modality hierarchy, we turn to specific within-language source-target pairings, to establish whether individual shifts are consistent with the proposed directionality. To do this, we considered separately each of the semantic shift types in Table 2, excluding shifts where the target verb is polysemous. This shift type is infrequent; only 3 out of 48 shifts (6 %) fall into this category. In all three cases, the target verb forms are morphologically derived and show lack of semantic transparency between constituent parts of the complex form. It is plausible that the derived verb form extended its meaning to cover additional sense modalities, facilitated by the loss of semantic transparency between the constituent parts of the complex form. Given the difficulty of establishing the original target meaning in these cases, we do not consider this shift type further.

For the remaining shift types, we consider each separately because categorizing the directionality requires a partially different approach in each case. While the directionality of simple shifts (A → B) can be easily identified, shifts involving multi-sense sources are potentially harder to classify, because the source verb could encode sensory meanings that fall hierarchically on either side of the target sensory meaning (e.g., if a complex verb denoting ‘feel’ is based on a source verb denoting both ‘hear’ and ‘smell’). We therefore deemed it appropriate to inspect this set separately. Type 3, which involves semantic narrowing, falls outside the scope of the unidirectionality proposal altogether, in that the modality of the target verb is included in the set of meanings of the source verb, so the target modality cannot properly be defined as either ‘lower’ or ‘higher’ than the source modalities. We can, nevertheless, inspect the semantic shifts in this set to determine whether the sense modality of the target verb tends to be the lowest among the set of meanings associated with the source.

The frequencies of the different shift types are presented in Figure 5. Below, we discuss each in turn.

Figure 5:

Frequencies of semantic shift types.

3.2.1 Directionality of simple shifts (A → B)

Inspection of simple A → B shifts reveals a robust directional asymmetry across languages, consistent with Viberg’s proposal (Figure 5A). Twenty-one of the total 22 shifts in this category involve a source verb that is higher on the proposed hierarchy than the target (see examples (9) and (10)).

(9)

see > taste
Yareba (Yareban) [yare1248]
ie + erásu (‘eat’ + ‘see, look’)
→ ie erásu ‘to taste’	(Weimer and Weimer 1964: 151)

(10)

hear > smell
Grass Koiari (Koiarian) [gras1249]
vusika + uhuia (‘stink, bad odor’ + ‘hear/listen’)
→ vusika uhuia ‘to smell’	(Dutton 2003: 176)

The one case that stands apart is an instance of a shift from smell to taste, but since Viberg (1984) treated smell and taste as unordered on the hierarchy this is not strictly speaking an exception. Evans and Wilkins (2000) write that in Kayardild, the verb banyji has a basic meaning of ‘to smell’, but in combination with a verb of eating in a coverb construction, expresses the meaning ‘to taste’:

(11)

smell > taste
Kayardild (Tangkic) [kaya1319]
banyji-ja + diya-ja (‘smell’ + ‘eat’)
→ banyji-ja diya-ja ‘to taste’	(Evans and Wilkins 2000: 559)

3.2.2 Directionality of multi-sense source shifts (A + B(+) → C)

The second shift type, in which the source verb is a multi-sense verb, also conforms to the unidirectionality proposal (Figure 5B). In eight out of nine instances, the target modality does not outrank any of the modalities associated with the source verb (examples (12) and (13)). The one exception is Ulwa (Keram), in which the target verb kïkal wana ‘to hear, listen’ is based on the source verb wana ‘to feel, taste’, in combination with the body part noun kïkal ‘ear’. In this case, the target verb denotes a sense modality (‘hearing’) higher than the source verb (‘touch’, ‘taste’).

(12)

hear, listen, feel, sense > taste
Mende (Sepik) [mend1268]
a + misi (‘eat’ + ‘hear/listen/feel/sense’)
→ amisi ‘to taste’	(Nozawa 2006: 6)

(13)

hear, feel > smell
Rumu (Turama-Kikori) [rumu1243]
há + wara (‘smell (n.)’ + ‘hear/feel’)
→ há wara ‘to smell’	(Petterson 1999: 25)

3.2.3 Directionality of semantic narrowing (A + B(+) → B)

For semantic narrowing, there is a tendency (eight out of 10 instances) for target modalities to be the lowest of the set of source modalities (Figure 5C; see (14) and (15)). In one case, Kalam, the sense modality of the target verb is not the lowest, but neither is it the highest: the very general perception verb nŋ combines with the contact verb d ‘touch’ to express the agentive perceptual meaning ‘touch’ (16). Pawley (2020) has written extensively about the semantics of nŋ; in addition to denoting basic perception meanings, it also expresses a broad range of cognitive meanings. It is worth highlighting that this multi-word expression does not represent a strict semantic narrowing of the source verb nŋ, as the complex form carries an explicitly agentive meaning not present in the source verb meaning ‘feel’ (Kalam shows the same dynamicity shift in the complex form pug nŋ (sniff + nŋ), which expresses agentive ‘smell’).

(14)

hear, feel, taste, smell, sense > taste
Guambiano (Barbacoan) [guam1248]
ma + mɨrɨp (‘eat’ + ‘hear/feel/taste/smell/sense’)
→ ma mɨrɨp ‘to taste’	(Norcliffe, field notes)

(15)

hear, feel, taste, smell, perceive > smell
East Taa (Tuu)
|núm + tá̰ã (‘smell (itr.)’ + ‘hear/feel/taste/smell/perceive/understand’)
→ |núm tá̰ã ‘smell’	(Traill 1994: 154, 272)

(16)

see, look, hear, feel, smell, perceive > touch
Kalam (Nuclear Trans New Guinea) [kala1397]
d + nŋ (‘touch’ + ‘see/hear/feel/smell/perceive’)
→ d nŋ ‘to touch’	(Pawley 2020: 390)

Finally, Kalam is exceptional as it uses a complex verb form to express vision, based on the same highly polysemous source verb nŋ. In this case the target meaning is the highest of the modalities co-expressed by the source verb, according to Viberg’s hierarchy. This is the only case in the PVDB of a cross-modally derived vision verb (17).

(17)

see, look, hear, feel, smell, perceive > see
Kalam (Nuclear Trans New Guinea)
wdn + nŋ (‘eye’ + ‘see/hear/feel/smell/perceive’)
→ wdn nŋ ‘to see’	(Pawley 2020: 391)

3.2.4 Summary

Collapsing over simple and multi-sense source shift types, we find that 30 of the 31 shifts – spanning 23 languages from 23 different language families and all six macro-areas – conform to the unidirectionality proposal. Consistent with the general trend, semantic narrowing shows an overall tendency – although not without exception – for the target meaning to be the lowest of the set of sense modalities co-expressed by the source verb.
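The conformity criterion applied in Sections 3.2.1 and 3.2.2 can be stated compactly: a shift conforms if the target modality does not outrank any modality expressed by the source verb, with taste and smell treated as unordered. The sketch below is illustrative only. It covers just the shifts cited as examples above, and it simplifies the glosses (‘feel’ is mapped to touch, ‘listen’ to hearing, and the general gloss ‘sense’ is left out); the numeric ranks are an arbitrary encoding of the hierarchy.

```python
# Higher number = higher on the hierarchy; taste and smell share a rank.
RANK = {"sight": 3, "hearing": 2, "touch": 1, "taste": 0, "smell": 0}


def conforms(sources, target):
    """True if the target modality does not outrank any source modality."""
    return all(RANK[target] <= RANK[source] for source in sources)


# (language, source modalities, target modality), simplified from the examples.
SHIFTS = [
    ("Yareba", ["sight"], "taste"),
    ("Grass Koiari", ["hearing"], "smell"),
    ("Kayardild", ["smell"], "taste"),
    ("Mende", ["hearing", "touch"], "taste"),
    ("Rumu", ["hearing", "touch"], "smell"),
    ("Ulwa", ["touch", "taste"], "hearing"),
]

for language, sources, target in SHIFTS:
    verdict = "conforms" if conforms(sources, target) else "exception"
    print(f"{language:13s} {'/'.join(sources)} > {target}: {verdict}")
```

Under this encoding the Kayardild smell > taste shift counts as conforming because the two modalities are unordered, and Ulwa is the only listed shift flagged as an exception.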

3.3 Discussion

This study provides quantitative confirmation of Viberg’s unidirectionality proposal: cross-modal semantic shifts based on word formation proceed almost without exception from higher to lower modalities on the proposed sense hierarchy. Viberg (2008) argues, based on the patterns of extensions he observed in his original study, that the asymmetrical linguistic treatment of the senses reflects their relative dominance in human perception. Vision, above all, appears to have biological primacy over the other senses in humans, with perhaps up to 50 % of the cortex involved in visual functions (Palmer 1999). Accordingly, Viberg proposes the sense hierarchy is “firmly grounded in human biology and general cognition” (Viberg 2008: 127). Evans and Wilkins similarly attribute directionality effects in this domain to “perceptual givens” (2000: 562, quoting Berlin 1992).

While the present findings are compatible with such a biologically-oriented account, it is not the only possibility. Research on semantic extensions in vocabulary has shown that directionality is reliably predicted by word frequency. In a series of experiments, Harmon and Kapatsinski (2017) found participants were more likely to semantically extend a more frequent word form than a less frequent word form. Winter and Srinivasan (2022) examined word formation in the nominal domain (building on earlier work by Urban 2012) and found that for conceptual pairings that showed strong source/target asymmetries across languages, English words denoting the source meanings were more frequent than English words denoting target meanings. There is no current consensus regarding the mechanism underlying the close relationship between word frequency and source/target asymmetries (see Winter and Srinivasan 2022 for discussion). Possibly, because more frequent words are more intersubjectively accessible (familiar to interlocutors), when a speaker communicates a new meaning built on a high frequency word, there is a greater chance of communicative success (Dancygier and Sweetser 2014). Additionally, when communicating new meanings based on novel word formation or metaphorical extension, more frequent words will be more cognitively accessible to the speaker and hence more likely to provide sources for extensions than less frequent words (Harmon and Kapatsinski 2017).

The finding that frequency strongly correlates with source/target asymmetries speaks directly to the present case, because there is cross-linguistic evidence that perception verbs denoting sensory meanings at the top of Viberg’s proposed sense hierarchy tend to have a higher token frequency than those denoting ‘lower’ modalities. Across all languages studied so far, vision verbs are more frequent than non-visual perception verbs (see San Roque et al. 2015 for a cross-linguistic study of 13 languages; also Floyd et al. 2018 for Cha’palaa (Barbacoan) [chac1249], Imbabura Quechua (Quechuan) [imba1240] and English; Holmer 2021 for Aslian languages; Tchantourian and Vamling 2005 for Mingrelian (Kartvelian) [ming1252]; Viberg 1993 for Swedish [swed1254]; Krishna et al. 2022 for Telugu (Dravidian) [telu1268]; Winter et al. 2018 for English), suggesting, perhaps, that languages are “optimised for the communication of visual concepts” (Winter et al. 2018: 214).

The relative frequency of non-visual perception verbs appears to be more cross-linguistically variable, though here the data are sparser. San Roque et al.’s (2015) study of perception verb usage in conversational language found cross-linguistic variability in the relative rank ordering of non-visual perception verbs. Still, hearing verbs were second most frequent after vision in 12 out of the 13 languages sampled, suggestive of a strong cross-linguistic tendency. San Roque et al. (2015) found in Semai, an Aslian language, references to olfaction were more frequent than those to audition, a finding in keeping with the observed importance of olfaction in Aslian languages and cultures (e.g., Burenhult and Majid 2011; Majid and Kruspe 2018; Wnuk and Majid 2014). With respect to the relative frequencies of touch, taste, and smell, there is tentative evidence there may be language-particular inversions of the predicted rank ordering. In two separate studies, smell verbs were found to be more frequent than touch and taste verbs in Cha’palaa (Barbacoan) (Floyd et al. 2018; San Roque et al. 2015). Nevertheless, taken together, the majority of languages studied to date suggest an overall tendency for the relative frequencies of perception verbs to follow the order sight > hearing > touch > taste/smell.

These two sets of facts – the observed relationship between token frequency and source/target asymmetries in complex expressions, and the finding that perception verb frequencies tend to follow a consistent rank ordering across languages – suggest the immediate cause of unidirectionality of semantic extensions in the perception domain could be lexical frequency. This does not rule out the possibility that the cross-linguistic frequency patterns themselves are shaped by the relative dominance of the senses in human perception, and thus, following Viberg, the ultimate cause for the unidirectionality of cross-modal semantic extensions could lie in human biology. It is certainly plausible that talk about vision could partially reflect a “hard-wired reliance on the visual modality” (San Roque et al. 2015: 20). But there are other reasons why speakers might refer most to vision too. There might simply be more occasions to talk about visually-based experiences because we can visually apprehend more objects in the world than we put in our mouths and taste, for example (San Roque et al. 2015: 20; Sweetser 1990: 39). Vision is also a primary foundation for joint attention and is often recruited for directing attention in interactional contexts (see San Roque et al. 2015: 20 and references therein). With respect to audition, the most typical percept of hearing verbs is the content of speech, rather than sounds more generally (San Roque et al. 2015; Sweetser 1990), suggesting the high frequency of hearing verbs across languages is driven by the pan-cultural social relevance of referring to speech situations.

Any overarching typological tendency for perception verb frequencies to follow a particular rank ordering could, therefore, be the outcome of multiple converging factors, rather than the direct outcome of a monolithic, biologically-grounded sensory hierarchy. This perspective also allows the possibility of cultural differences in the relative prominence of the senses that might run counter to broader global tendencies, such as the special place of olfaction in Semai and Cha’palaa. These language differences are not easily accommodated by a universalist biological account, but can be handled by taking into consideration communicative and cultural factors.

4 Associative biases: are some cross-modal associations more typologically prevalent than others?

Having established that cross-modal semantic shifts based on word formation exhibit directionality, we turn to the possible combinations of sense modality meanings linked by these shifts. To examine whether some sense modalities are more likely to be associated via word formation, we visualize the attested cross-modal associations using a weighted directed graph, which we plotted using the R package ‘ggraph’ (version 2.0.5; Pedersen 2021). Because we are interested in recurrent cross-linguistic patterns, we pruned the edges of the graph so only cross-modal complex verb types attested more than once in the PVDB were included (see Croft 2022; Georgakopoulos et al. 2022 for a similar approach with respect to classical semantic maps). The final graph is based on seven complex verb types from 20 languages and 18 language families (Figure 6).^[7]

Figure 6:

A weighted directed graph of cross-modal associations based on word formation in the PVDB. The graph is directed (edges have arrow heads conveying the direction of shifts); cross-linguistic frequencies of associations are represented as edge weights (the thicker the edge, the greater the number of attested links between sources and targets); the size of the nodes represents their ‘out-degree’ score (calculated as the sum of the weights of the outgoing edges of a node).

The lower senses of touch, taste and smell have the lowest out-degree scores (0, calculated as the sum of the weights of the outgoing edges of a node) – there are no outgoing edges from these meanings, indicating they are never the source of a semantic shift. Sight has the highest out-degree score (15); this is driven mainly by the high frequency of shifts to taste. The multi-sense source hearing–touch–taste–smell has the next highest out-degree score (6), followed by hearing–touch (5) and hearing (4). Sight > taste is the most frequent shift overall (n = 12), followed by hearing > smell (n = 4). Notably, there are no shifts from hearing to touch (or touch to hearing). Smell and taste are also not linked.
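The out-degree scores can be reproduced from the edge counts with a few lines of code. The sketch below is illustrative and deliberately partial: it includes only the single-modality edges whose counts are stated in the text (sight > taste, sight > touch and hearing > smell) and omits the multi-sense sources, because their edge-by-edge counts are not given here. The pruning threshold follows the "attested more than once" criterion; the function names are illustrative.

```python
MODALITIES = ["sight", "hearing", "touch", "taste", "smell"]

# (source, target, number of attested shifts); multi-sense sources omitted.
EDGES = [
    ("sight", "taste", 12),
    ("sight", "touch", 3),
    ("hearing", "smell", 4),
]


def prune(edges, min_count=2):
    """Keep only edges attested at least min_count times."""
    return [edge for edge in edges if edge[2] >= min_count]


def out_degree(edges, nodes):
    """Sum the weights of the outgoing edges of each node."""
    scores = {node: 0 for node in nodes}
    for source, _target, weight in edges:
        scores[source] = scores.get(source, 0) + weight
    return scores


scores = out_degree(prune(EDGES), MODALITIES)
for node in sorted(scores, key=scores.get, reverse=True):
    print(f"{node:8s} out-degree = {scores[node]}")
```

With these edges, sight and hearing are the only single-modality nodes with outgoing weight, while touch, taste and smell remain at zero.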

Touch is linked to smell only when touch also colexifies hearing, and hearing is linked to taste only when hearing also colexifies touch. This suggests the association between touch and smell is a second order one, mediated via hearing and, similarly, the association between hearing and taste is a second order one, mediated via touch. These first and second order associations are schematized in Figure 7.

Figure 7:

A schematic semantic map of first and second order associations involving smell, hearing, touch and taste (the undirected dotted line between hearing and touch represents colexification; solid directed lines indicate word formation).

The association of hearing–smell and touch–taste possibly reflects the conceptual alignment of contact versus non-contact sensory perception (cf. Viberg 1984). While these groupings would appear to have a natural conceptual basis, the high frequency of sight > taste shifts is surprising, as it does not have an obvious conceptual basis. We therefore return to the original data for further qualitative study.

4.1 A closer look at the sight–taste pairings

The sight–taste pairings are found predominantly in verb + verb expressions (nine of 11 cases; Table 3); the remaining two involve verb + noun expressions. There is a striking consistency in the composition of the verb + verb forms: in all cases the non-perception verbal constituent is a verb of consumption, such as ‘eat’, or an activity verb referring to manner of consumption, such as ‘lick’. Together, the combination of the consumption plus vision verb yields the meaning ‘to taste’ or ‘to try food’.^[8] This specific strategy of taste predicate formation is found in three separate macro-areas: North America, South America and Papunesia (possibly reflecting a Pacific Rim areal trait, cf. Nichols 1992).

Table 3:

Complex V + V taste verbs composed of ‘eat’ + ‘see’.

Macro-area	Language (family)	Form	Reference source
North America	Francisco León Zoque (Mixe-Zoquean) [fran1266]	cyu’tu + isu (ate + saw) → cyu’tisu ‘tasted/tried (food)’	Engel et al. (1987, 32, 63)
North America	Koasati (Muskogean) [koas1236]	í:pat + hí:can (eat + see) → í:pat hí:can ‘taste by eating’	Kimball (1994, 94, 123)
North America	Koasati (Muskogean) [koas1236]	lásaplit + hí:can (lick once, touch with tongue, nibble + see) → lásaplit hí:can ‘taste by taking a small amount’	Kimball (1994, 94, 123)
Papunesia	Duna (Isolate) [duna1248]	neya + ke (eat + see) → neya ke ‘taste’	San Roque (2010)
Papunesia	Kaluli (Bosavi) [kalu1248]	dabe + bo:ba (lick + see) → dabe bo:ba ‘try to taste’	Schieffelin and Feld (1988, 25)
Papunesia	Mende (Sepik) [mend1268]	a + heye (eat + see, look, watch, know, try and find out how it is) → aheye ‘eat to find out how it tastes, smell to find out’	Nozawa (2006, 5)
Papunesia	Momu-Fas (Baibai-Fas) [fass1245]	kiy-on (consume + see, watch, visit) → kiy-on ‘taste’	Honeyman (2017, 544)
South America	Cofán (Isolate) [cofa1242]	an + cañe (eat + look) → ancañe ‘try the taste’	Borman (1976, 17)
South America	Kotiria (Tucanoan) [guan1269]	chʉ + ñʉna (eat + see, look) → chʉ ñʉna ‘try the taste’	Waltz (2007, 74)

The vision verb, in most of these examples, contributes an attemptive (conative) modal meaning to the main predicate of consumption. In Kotiria (Tucanoan), Waltz (2007) notes that ñʉna, with the basic meaning of ‘to see/look’, functions as an auxiliary with the meaning ‘to try’ in combination with other verbs, including those from other semantic domains. In Francisco León Zoque, the vision verb isu ‘saw’ occurs as the final element in numerous complex verbs with a ‘try’ meaning, e.g., va’nisu ‘tried to sing’ (< v’anu ‘sang’), myesisu ‘tried on (clothes)’ (< myesu ‘put on clothes’), indicating a generalized attemptive function in such expressions (Engel et al. 1987).^[9] In Kaluli (Bosavi), the vision verb bo:ba also has the meaning ‘to try’ and occurs in a number of compound constructions in this function (Schieffelin and Feld 1988). Kaluli is consistent with Foley’s (1986: 152) general observation for Papuan languages: “the conative modality (the actor tries to perform the action) is almost universally signalled … with a serial verb construction involving the verb stem ‘see’”. We cannot verify whether the two remaining non-Papuan languages (Cofán, Koasati) also utilize their vision verb as a conative or semantically similar modal expression, but it is plausible. The lexical semantic relationship between seeing and trying has, moreover, been documented elsewhere for a number of languages, including Ewe (Ameka 2008), Korean (Lee 1993), Mongsen Ao (Coupe 2007), Sayan Turkic (Anderson 2004), Tariana (Aikhenvald 2003), and Chintang (San Roque et al. 2018); see Voinov (2013) for a cross-linguistic survey.

So, although vision is frequently linked to taste in V + V constructions, the association appears to be a second order one, mediated by an initial semantic extension from see > try (Figure 8).

Figure 8:

A schematic semantic map showing the associations between see, try, and taste.

The two instances of V + N taste predicates based on a vision verb may also involve the vision verb having a ‘try’ meaning. In these cases, the vision verb combines with a noun meaning ‘taste’ or ‘flavor’ to express the meaning ‘to taste’ (i.e., ‘to see the taste’).

(18)

see > taste
Korean (Koreanic) [kore1280]
mas + pota (taste (n.), flavor + see/look at, perceive, set eyes on)
→ mas pota ‘taste, try the flavor of, experience, learn’	(Martin et al. 1967: 588, 788)

(19)

see > taste
Warao (Isolate)
a riaba + mī (sweetness, flavor + see, look, find, achieve)
→ a riaba mī ‘to try/taste the flavor of’	(Barral 1979: 30, 293)

In Korean, Lee (1993) observes that the verb po- ‘to see’ can be used as an auxiliary with an attemptive function, so it follows that its complex predicate for ‘taste’ is also built on this extended ‘try’ meaning (i.e., ‘try the taste’). For Warao, a language isolate of Guyana, the picture is unclear, though we note the existence of other South American languages that use the semantically similar complex expression ‘test its sweetness’ to refer to ‘taste’, for example the Guaicuruan languages Mocoví [moco1246] (Buckwalter and Ruiz 2023) and Toba [toba1269] (Buckwalter and Sánchez 2023).

Turning briefly to the three instances of see > touch extensions, two are found in languages where ‘see’ is used with an extended ‘try’ meaning (Kotiria and Duna), showing that in these cases too the association between vision and touch is a second order one. The third, Qawasqar (Kaweskar) [qawa1238], is unique in using a reflexive pronoun in combination with a vision verb to express ‘to feel’.

4.2 Discussion

This study shows that not all logical pairings of sense modality meanings are equally likely to be associated in word formation. The two cross-linguistically most frequent pairings are (i) sight > taste and (ii) hearing (or hearing/touch) > smell. While the hearing–smell associations suggest a conceptual grouping perhaps based on the absence of bodily contact between perceiver and stimulus (following Evans and Wilkins 2000; Viberg 1984), the most frequent pairing in the sample, sight > taste, does not have an obvious conceptual underpinning. Qualitative exploration of the data revealed the association between sight and taste was a second order one, mediated by a meaning extension from see > try. As a result, we conclude that complex perception verbs based on sight verbs with a visual meaning – as opposed to sight verbs expressing an extended non-visual meaning – are, in fact, typologically rare. This, interestingly, undercuts Viberg’s proposed hierarchy of senses; vision, the purported dominant sense, is not in fact the primary semantic source of cross-modal semantic extensions (neither is it a target of semantic extensions). Word formation mirrors colexification in this regard: visual meanings rarely colexify with non-visual sensory meanings; when such colexifications do occur, they are typically second order, mediated by extensions into other semantic domains (Georgakopoulos et al. 2022; Norcliffe and Majid 2024; San Roque et al. 2018). In short, it appears that visual meanings seldom participate in semantic associations within the perception domain.

The semantic associations found in complex perception expressions mirror colexification in other ways: colexification shows the same recurrent grouping of hearing–smell and the same bias against the association of taste and smell (Norcliffe and Majid 2024). There is, however, one especially striking point of divergence between the two: previous studies find hear–feel to be the sensory pairing most frequently colexified across languages (Georgakopoulos et al. 2022; Norcliffe and Majid 2024). This is not mirrored in word formation patterns: in the present study, we found that feel verbs are never derived or composed from hearing verbs, or vice versa.

The recurrent association of hearing and touch in colexifications possibly reflects their close perceptual connections; although these senses cross-cut the dimension of bodily contact, they are known to be closely perceptually integrated (e.g., Schürmann et al. 2004; Suzuki et al. 2008; see also Winter et al. 2017). Studies of other lexical domains have also found a close association between the two. For example, touch adjectives are frequently used to describe sounds in several languages (see Winter 2019). The fact we do not find the same cross-linguistic tendency for hearing and touch to be associated in word formation is puzzling, on the assumption that word formation, like colexification, is motivated by conceptual associativity between meanings (Evans and Wilkins 2000; Urban 2012; Viberg 1984).

Equally puzzling is the lack of association between smell and taste, in both colexification and word formation. This is unexpected, given prior claims that the two senses are closely connected (e.g., Winter 2016, 2019). A potential contributing factor to the typological rarity of smell–taste linkages in word formation could be the overall lower token frequency of smell and taste verbs. This is unlikely to be the full story, however, given that these meanings are frequently associated in other lexical domains. For example, cross-linguistic studies of synesthetic adjectives have shown taste adjectives are commonly used to describe smells (Winter 2019), a fact which has been ascribed to their close conceptual and perceptual alignment.

As discussed in Section 1, it is possible that the rarity of smell–taste colexification may have a communicative basis (Norcliffe and Majid 2024). Experimental and corpus research has shown that colexification is less common in contexts where confusability is particularly at stake (Brochhagen and Boleda 2022; Karjus et al. 2021). Because smell and taste tend to be closely associated in sensory vocabulary (e.g., O’Meara et al. 2019; Winter 2016), the linguistic context may not offer reliable cues to differentiate between an intended smell or taste interpretation of a basic perception verb that colexifies the two. Given this, we predicted that word formation processes would not be similarly constrained across languages, because the associated meanings are formally distinguished. This prediction was not borne out. On both conceptual and communicative grounds, it therefore remains a puzzle why smell and taste should so rarely be linked in word formation.

The same issue arises with respect to vision. The robust typological bias against the colexification of vision with non-visual sensory meanings may similarly be driven by communicative need (Norcliffe and Majid 2024). Just as for taste and smell, the overall dominance of vision in sensory language (Chen et al. 2019; Lynott et al. 2020; Miklashevsky 2018; Morucci et al. 2019; Speed and Majid 2017; Vergallito et al. 2020) may mean the linguistic context does not reliably disambiguate between visual and non-visual interpretations of ambiguous perception verbs. Given this, we predicted a different outcome in patterns of word formation. Instead, although complex perception verbs (mainly for taste) are frequently built on vision verbs, the present study showed these are almost always used with an extended, non-perceptual meaning (‘try’). Considering the overall high frequency of vision verbs, it is striking they are rarely the source of cross-modal derivations otherwise (cf. Winter and Srinivasan 2022).

We are therefore left with the following explananda: if colexification and word formation are similarly driven by conceptual associativity between meanings, why are hear–feel associations so frequent in the former yet unattested in the latter? If the colexification of smell and taste is inhibited for communicative reasons, and if the colexification of vision with the non-visual modalities is similarly constrained, what inhibits the association of these meanings in word formation, given the distinction between them is, in this case, formally marked?

We cannot rule out the possibility that some of these apparent rara in word formation are simply due to our sample size. While our original balanced sample contained 100 languages, only 28 of these had cross-modal complex perception verbs. Given the multivariate nature of the research question (associative biases across languages among a set of five meaning categories), absent or infrequent cross-modal pairings in our sample may not be an accurate reflection of the population statistics. This underscores a general methodological challenge in conducting lexical typological research at the domain level, where the analyst is typically interested in relationships between multiple variables and where large samples of languages are therefore required to make inferences about typologically recurrent patterns. While large-scale lexical databases such as CLICS have made this more tractable for the study of colexification, for semantic associations in word formation, the lack of equivalent databases of morphologically glossed vocabulary is a bottleneck to obtaining larger language samples. In this regard, recent efforts to automatically infer partial colexification patterns from multilingual word lists (List 2023) are a promising development in the field.
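
The sample-size concern can be made concrete with a minimal sketch. It assumes, purely for illustration, that each of the 28 languages with cross-modal complex perception verbs independently exhibits a given pairing with some probability p. The function name, the independence assumption and the values of p are hypothetical and are not estimates from the present study.

```python
def prob_zero_observed(p: float, n: int = 28) -> float:
    """Probability that none of n independent languages shows a pairing
    that each language exhibits with probability p (illustrative assumption)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie between 0 and 1")
    return (1.0 - p) ** n


# Hypothetical per-language rates for a cross-modal pairing.
for p in (0.01, 0.05, 0.10):
    print(f"p = {p:.2f}: P(no instances among 28 languages) = {prob_zero_observed(p):.3f}")
```

Under these assumptions, a pairing found in 5% of such languages would go unobserved in a sample of 28 roughly a quarter of the time, so an absence in the sample is only weak evidence of absence in the population.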

With these caveats in mind, it is nevertheless worth considering what could, in principle, give rise to the set of divergences and parallels between colexification and word formation that are suggested by this study. Consideration of the diachronic dimension may shed some light on these puzzles. We close our paper with a discussion of how cross-modal complex predicates may be historically related to polysemous perception verbs and how one particular diachronic scenario, in which complex predicates develop out of polysemous source verbs, may provide insight into the typological patterning in this domain. Our discussion necessarily remains speculative; detailed empirical work is required to verify the scenario we outline. Our goal is rather to illustrate how diachrony has the potential to elucidate synchronic patterns in this domain and to point out future directions for systematic study.

5 Diachronic considerations

Semantic extensions based on word formation and those based on colexification are often implicitly treated as independent kinds of lexical change (Evans and Wilkins 2000; Viberg 1984). Yet it is possible that one process is historically dependent on the other. It has sometimes been suggested that complex expressions may develop out of an earlier stage of polysemy (Brown and Witkowski 1983; Maslova 2004). Maslova (2004) discusses this possibility for perception verbs specifically, a process she refers to as “lexical specification”.^[10] Building on Maslova’s proposal, we can sketch the following diachronic scenario. A perception verb with a single sensory meaning (e.g., ‘hear’; Stage 1) first extends to an additional sensory meaning (e.g., ‘smell’), resulting in polysemy (Stage 2). Due to the resulting ambiguity, speakers may opt to disambiguate the intended sensory meaning by collocating additional material (e.g., a sensory noun) (Stage 3). For example, speakers might use an odor noun to specify that the intended meaning of a hear–smell verb is ‘smell’. Over time, through frequency of use, the collocated element may become obligatory when referring to smell, giving rise to a lexicalized complex expression designating smell. As a result, the polysemous hear–smell verb may drop ‘smell’ as a referent, narrowing its meaning to ‘hear’ (Stage 4). The end state is a complex expression carrying the meaning ‘smell’, which is based on a source verb that now only expresses ‘hear’.

When viewed synchronically at Stage 4, the word formation process appears to be based on a direct extension from a source verb expressing A to a derived verb expressing B (e.g., ‘hear’ > ‘smell’). Diachronically, however, there is no semantic extension at all, but instead semantic narrowing from a polysemous source verb co-expressing {A, B}, to a derived target verb expressing only {B} (e.g., ‘hear/smell’ > ‘smell’). Crucially, on this scenario, the functional motivation for the development of complexity is disambiguation of the intended meaning of a polysemous word.
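
The four stages can be written out as a small data structure, which makes the contrast between the synchronic and the diachronic view explicit. This is only an illustrative sketch: the class and function names are hypothetical, and Stage 1 (a verb with a single sensory meaning) is assumed here as the starting point of the scenario.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    number: int
    simple_verb: frozenset     # meanings expressed by the simple verb
    complex_expr: frozenset    # meanings of verb + collocated element (empty if none)
    complex_obligatory: bool   # must the complex form be used for the target meaning?


def lexical_specification(source: str = "hear", target: str = "smell") -> list:
    """Return the four hypothetical stages for a source/target meaning pair."""
    both = frozenset({source, target})
    return [
        Stage(1, frozenset({source}), frozenset(), False),         # single meaning (assumed)
        Stage(2, both, frozenset(), False),                        # polysemy
        Stage(3, both, frozenset({target}), False),                # optional collocation
        Stage(4, frozenset({source}), frozenset({target}), True),  # lexicalized complex form
    ]


def diachronic_change(stages: list) -> tuple:
    """Meanings of the simple verb at Stage 3 versus the complex form at Stage 4."""
    return stages[2].simple_verb, stages[3].complex_expr


stages = lexical_specification()
before, after = diachronic_change(stages)
# Narrowing: the complex form's meaning is a proper subset of the polysemous source.
print(sorted(before), ">", sorted(after), "| narrowing:", after < before)
```

Comparing only the Stage 4 entries (simple verb ‘hear’, complex form ‘smell’) reproduces the apparent synchronic extension, whereas the comparison across Stages 3 and 4 shows the narrowing.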

There are languages in the complexity database that provide initial support for this scenario. As we saw in Section 3.2.3, the database contains cases of semantic narrowing, in which the modality meaning of the complex expression also belongs to the set of meanings co-expressed by a polysemous source verb. Such cases are compatible with Stage 3 of Figure 9, in which a polysemous verb optionally alternates with a more specified complex expression. In East Taa (Tuu), for example, the verb tá̰ã, which co-expresses any non-visual modality (including smell), alternates with a complex form |núm tá̰ã which is used specifically for smell (Traill 1994: 154, 272). It is based on the collocation of the intransitive verb |núm ‘to smell’ with the multi-sense verb tá̰ã. We might view such languages as being at the incipient phase of the lexicalization process, before the collocated item becomes obligatory when referencing the target modality (i.e., before Stage 4).

Figure 9: The stages of the diachronic scenario of lexical specification (after Maslova 2004). f = form and m = meaning.

Available descriptions of East Taa do not provide details about the context of use for each of these alternating forms, but for two other cases we do have additional information. In Kalam, the general verb nŋ can, on its own, express any of the sensory meanings ‘see’, ‘look’, ‘hear’, ‘listen’, ‘feel by touch’ and ‘smell’, in addition to various cognitive meanings (Pawley 2020). Pawley writes that, where necessary, ‘see’ can be disambiguated from other meanings by optionally using a complex expression wdn nŋ (literally, ‘eye perceive’) or wdn-magi nŋ (‘eyeball perceive’). Speakers make use of various clausal strategies for disambiguating non-visual sense modality meanings (see Pawley 2020). Kalam thus constitutes a documented case where a complex expression is selected in place of an ambiguous polysemous form in cases of contextual ambiguity, consistent with Maslova’s original proposal.

Italian offers another supporting case. The general perception verb sentire, which can express any non-visual modality, typically combines with the phrases l’odore di ‘the odor of’ and il sapore di ‘the taste of’ to express ‘to smell’ and ‘to taste’ respectively. The complex expressions appear to be relatively conventionalized – in the Collins Italian-English Dictionary, for example, they are provided as the translation of English ‘to smell’ and ‘to taste’ rather than the simple verb.^[11] Nevertheless, the complex forms are not obligatory. Where the intended meaning is recoverable from context, the noun phrase does not have to be expressed (C. Mazzuca, pers. comm.). Thus, just as in Kalam, we find optionality between a complex expression and a simple polysemous source verb, depending on whether there is ambiguity in context. Critically, for Italian we also have information about earlier stages of the language. Italian sentire goes back to Latin sentire, which could refer to any sense modality meaning (Galac 2020). We can, therefore, confirm that the collocation of a sensory noun phrase in Italian to express smell and taste is a case of specification of a sensory meaning of an originally polysemous/more semantically general source verb, rather than a semantic extension to a new modality meaning.

Importantly, the specification scenario provides a straightforward explanation for the directionality effects we reported in Section 3. Speakers tend to use longer or more complex linguistic forms where an intended meaning is less frequent – and hence less expected – in its context (e.g., Gibson et al. 2019; Haspelmath 2021; Zipf 1935). Accordingly, theories of communicative efficiency hold that language is shaped by the trade-off between competing pressures for simplicity and informativeness (e.g., Gibson et al. 2019; Haspelmath 2021; Levshina 2022). When an intended meaning is more frequent or easily inferable from context, less linguistic code is needed to convey the meaning, so it is more efficient to use a shorter form. By contrast, where the intended meaning is less frequent and hence less predictable in context, more linguistic code is required to successfully transmit the message. Thus, in cases of polysemy, speakers will be more likely to explicitly mark the less frequent meaning. Given an ambiguous hear–smell verb, for example, a speaker is more likely to explicitly mark or specify the less frequent meaning (smell).
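
The form–frequency reasoning can be illustrated with surprisal, −log2(p), a standard measure of how unexpected an event with probability p is. Surprisal is brought in here as one convenient formalization rather than as the measure used in the works cited, and the token counts and function names below are hypothetical.

```python
import math


def surprisal_bits(probability: float) -> float:
    """Surprisal in bits; rarer (less expected) meanings have higher values."""
    return -math.log2(probability)


def meaning_to_mark(counts: dict) -> str:
    """Return the meaning with the highest surprisal, i.e. the least frequent one.
    All counts must be positive."""
    total = sum(counts.values())
    return max(counts, key=lambda meaning: surprisal_bits(counts[meaning] / total))


# Hypothetical token counts for the two readings of a hear-smell verb.
counts = {"hear": 90, "smell": 10}
total = sum(counts.values())
for meaning, n in counts.items():
    print(meaning, round(surprisal_bits(n / total), 2), "bits")
print("reading predicted to receive extra marking:", meaning_to_mark(counts))
```

On these invented counts the ‘smell’ reading carries more surprisal than the ‘hear’ reading, which corresponds to the prediction that the less frequent meaning is the one speakers specify with additional material.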

Returning to the set of associative puzzles, the specification scenario would explain why smell–taste and visual–non-visual meanings are so rarely linked via word formation. If, diachronically, cross-modal derived perception expressions tend to develop out of multi-sense source verbs, it follows that combinations of sense modality meanings that do not typically colexify in languages (e.g., for communicative reasons) will also not come to be linked via word formation – the preceding historical stage (Stage 2 of Figure 9) on which such a process is dependent seldom arises to begin with.

The lexical specification scenario may also explain the peculiar discrepancy between colexification and word formation with respect to the cross-modal pairing hear–feel. Hear–feel is the pair of senses that most frequently colexifies across languages and geographic areas (Norcliffe and Majid 2024), yet we find that no languages in our sample link these two modalities via word formation. At first glance, this appears counter to the lexical specification hypothesis – if hear and feel frequently colexify, why are there no languages in the PVDB that have gone on to grammaticalize a complex expression based on such a polysemous source verb? The answer may lie in the linguistic resources that languages typically utilize for specifying the tactile modality in ambiguous contexts (Stage 3 of the lexical specification scenario).

While the available data are limited, reports in the literature suggest that in cases of contextual ambiguity involving multi-sense verbs, languages tend to distinguish the tactile modality by means of modifying verbal clauses that specify a particular manner of movement or pressure of the stimulus object on the body. In contrast to collocated nouns (e.g., a general stimulus noun such as ‘odor’), these are presumably less likely to combine with a polysemous perception verb and grammaticalize into complex perception expressions. Van Putten (2020: 433) observes for Avatime, for example, that to differentiate between ‘hear’ and ‘feel’ interpretations of the multi-sense verb nu (which can refer to any non-visual sense), speakers will make use of various ad-hoc constructional strategies that invoke touching the relevant body part or reference to the skin. Kalam is also reported to rely on clausal strategies to disambiguate tactile perception from other senses. Pawley (2020: 391) gives an example in which the general perception verb nŋ is interpreted as expressing tactile sensation since a preceding clause specifies contact of the stimulus object on the skin. In Guambiano (Barbacoan), when using the multi-sense verb mɨrɨp with an intended tactile meaning, speakers tend to make use of subordinate clauses that specify the nature of the tactile stimulus (Norcliffe, field notes).

The above preliminary datapoints suggest a preference for clausal strategies to distinguish the tactile modality in cases of ambiguity. This may be because, as Strik Lievers and Winter (2018) observe, tactile perception is especially temporally dynamic, involving changing forces in contact with the skin. Because languages tend to lexicalize transient phenomena such as events and movements as verbs (Givón 2001; Murphy 2010 among others), it follows that languages will rely on verbal clauses for expressing tactile information. Consistent with this, Strik Lievers and Winter (2018) found for English that sensory meanings are differentially encoded with respect to lexical categories, with verbs over-represented for the domains of touch and hearing, the latter also being temporally dynamic.

Altogether, in the context of the lexical specification hypothesis, the different lexical affordances of each sense modality could provide an explanation for why it is typologically rare for hearing and touch to be linked via word formation. In the case of polysemies involving smell or taste, sensory nouns are typically recruited to disambiguate; these may grammaticalize into V + N compounds. In the case of hear–feel polysemies, however, languages may tend to rely on ad-hoc and diverse clausal strategies to specify tactile perception, possibly due to the greater dynamicity of the modality.

To take stock, a diachronic scenario in which complex expressions develop out of earlier polysemous source verbs would provide an account of the typological patterning: it would explain not only the observed directionality effects (for efficiency reasons, languages will tend to formally mark meanings that are less frequent in their context), but also the parallelisms with and divergences from colexification patterns in the same domain.

Logically, a historical relationship between polysemy and word formation could run in the other direction, i.e., polysemous words could develop out of earlier complex expressions. Urban (2011) proposes such a scenario in the context of complex nominal expressions. In the first stage of this scenario, a word designating one meaning combines with another word to produce a complex expression designating a new meaning. For example, a word meaning ‘skin’ combines with the noun ‘tree’ to express the meaning ‘bark’. Over time, due to “general principles of linguistic economy” (Urban 2011: 26), the complex form may undergo ellipsis, such that the original form comes to express both meanings (i.e., the form is now polysemous). For example, the element ‘tree’ is dropped from the complex form, with the simple word now designating both ‘bark’ and ‘skin’. While there are certainly documented cases of this kind of formal reduction of complex words (Brown and Witkowski 1983), this diachronic trajectory does not straightforwardly account for lexicalization patterns in the perception domain. If polysemous perception verbs regularly emerge from the formal reduction of erstwhile complex forms, then we cannot explain why hear–feel colexifications are so frequent cross-linguistically, and yet these two modalities are seldom linked via word formation. If complex forms are the diachronic source of cross-linguistically frequent polysemous hear–feel verbs, then we should regularly encounter complex hear–feel expressions across languages.

6 Conclusion

Cross-modal word formation shows striking regularities across languages. The direction of cross-modal semantic shifts is asymmetric (running from the ‘higher’ senses to the ‘lower’) and the sensory meanings linked by such shifts show associative biases. While previous work has emphasized the role of human biology and neurophysiology to account for these lexicalization patterns, we argue instead that general conceptual, communicative, and diachronic principles better account for the mapping between words and meanings. Such an account is also preferable because it better captures both universal and culture-specific forces in meaning-making across languages.

Corresponding author: Elisabeth Norcliffe [ɪˈlɪsəbəθ ˈnɔklɪf], University of Oxford, Oxford, UK, E-mail: elisabeth.norcliffe@psy.ox.ac.uk

Acknowledgments

We thank Claudia Mazzuca for help with the Italian data and Sebastian Sauppe for statistics advice. Many thanks also to our three anonymous referees for their constructive comments on earlier drafts and to Masha Koptjevskaja-Tamm for her editorial input.

Research funding: This work has received funding from the European Union’s Horizon 2020 research and innovation program under the Marie Skłodowska-Curie grant agreement No 836921 [LexPex].
Data availability: The database, analysis code and supplementary materials are available at https://osf.io/5vzwg/?view_only=e03d0cdcc76245b29a4e4702f344ddae.

References

Aikhenvald, Alexandra Y. 2003. A grammar of Tariana, from northwest Amazonia (Cambridge Grammatical Descriptions). Cambridge; New York: Cambridge University Press. https://doi.org/10.1017/CBO9781107050952.

Ameka, Felix K. 2008. Aspect and modality in Ewe: A survey. In Felix K. Ameka & M. E. Kropp Dakubu (eds.), Aspect and Modality in Kwa Languages, 100, 135–194. Amsterdam: John Benjamins. https://doi.org/10.1075/slcs.100.07ame.

Anderson, Gregory D. S. 2004. Auxiliary verb constructions in Altai-Sayan Turkic (Turcologica). Wiesbaden: Harrassowitz.

Barlow, Russell. 2018. A grammar of Ulwa. Honolulu: University of Hawai’i at Mānoa Doctoral Dissertation.

Barral, Basilio de. 1979. Diccionario Warao-Castellano, Castellano-Warao [Warao-Castellano, Castellano-Warao dictionary]. Caracas: Universidad Católica Andres Bello.

Berlin, Brent. 1992. Ethnobiological classification: Principles of categorization of plants and animals in traditional societies. Princeton: Princeton University Press. https://doi.org/10.1515/9781400862597.

Borman, M. B. 1976. Vocabulario Cofán: Cofán-Castellano, Castellano-Cofán [Cofán: Cofán-Castellano, Castellano-Cofán vocabulary] (Serie de Vocabularios Indígenas “Mariano Silva y Aceves” 19). Quito: Instituto Lingüístico de Verano.

Brenzinger, Matthias & Anne-Maria Fehn. 2013. From body to knowledge: Perception and cognition in Khwe-||Ani and Ts’ixa. In Alexandra Y. Aikhenvald & Anne Storch (eds.), Perception and cognition in language and culture, 161–191. Leiden: Brill. https://doi.org/10.1163/9789004210127_008.

Brochhagen, Thomas & Gemma Boleda. 2022. When do languages use the same word for different meanings? The Goldilocks principle in colexification. Cognition 226. 105179. https://doi.org/10.1016/j.cognition.2022.105179.

Brown, Cecil H. & Stanley R. Witkowski. 1983. Polysemy, lexical change and cultural importance. Man 18(1). 72. https://doi.org/10.2307/2801765.

Brugman, Claudia Marlea. 1988. The story of over: Polysemy, semantics, and the structure of the lexicon (Outstanding Dissertations in Linguistics). New York: Garland.

Buckwalter, Albert S. & Roberto Ruiz. 2023. Mocoví dictionary. In Mary Ritchie Key & Bernard Comrie (eds.), The intercontinental dictionary series. Leipzig: Max Planck Institute for Evolutionary Anthropology.

Buckwalter, Albert S. & Orlando Sánchez. 2023. Toba dictionary. In Mary Ritchie Key & Bernard Comrie (eds.), The intercontinental dictionary series. Leipzig: Max Planck Institute for Evolutionary Anthropology.

Burenhult, Niclas & Asifa Majid. 2011. Olfaction in Aslian ideology and language. The Senses and Society 6(1). 19–29. https://doi.org/10.2752/174589311X12893982233597.

Bürkner, Paul-Christian. 2017. brms: An R package for Bayesian multilevel models using Stan. Journal of Statistical Software 80(1). https://doi.org/10.18637/jss.v080.i01.

Bürkner, Paul-Christian. 2018. Advanced Bayesian multilevel modeling with the R package brms. The R Journal 10(1). 395. https://doi.org/10.32614/RJ-2018-017.

Bürkner, Paul-Christian. 2021. Bayesian item response modeling in R with brms and Stan. Journal of Statistical Software 100(5). https://doi.org/10.18637/jss.v100.i05.

Chen, I-Hsuan, Qingqing Zhao, Yunfei Long, Qin Lu & Chu-Ren Huang. 2019. Mandarin Chinese modality exclusivity norms. PLoS One 14(2). e0211336. https://doi.org/10.1371/journal.pone.0211336.

Collins. 2024. Collins unabridged English to Italian online dictionary. Available at: https://www.collinsdictionary.com/us/dictionary/english-italian/smell.

Coupe, A. R. 2007. A grammar of Mongsen Ao (Mouton Grammar Library 39). Berlin; New York: Mouton de Gruyter. https://doi.org/10.1515/9783110198522.

Croft, William. 2022. On two mathematical representations for “semantic maps”. Zeitschrift für Sprachwissenschaft 41(1). 67–87. https://doi.org/10.1515/zfs-2021-2040.

Dancygier, Barbara & Eve Sweetser. 2014. Figurative language (Cambridge Textbooks in Linguistics). Cambridge: Cambridge University Press.

Di Natale, Anna, Max Pellert & David Garcia. 2021. Colexification networks encode affective meaning. Affective Science 2(2). 99–111. https://doi.org/10.1007/s42761-021-00033-1.

Dutton, Thomas Edward. 2003. A dictionary of Koiari, Papua New Guinea, with grammar notes (Pacific Linguistics 534). Canberra: Pacific Linguistics.

Engel, Ralph, Mary Allhiser de Engel & José Mateo Alvarez. 1987. Diccionario zoque de Francisco León [Dictionary of Francisco León Zoque] (Serie de vocabularios y diccionarios indígenas “Mariano Silva y Aceves” núm. 30). México, D.F: Instituto Lingüístico de Verano.

Evans, Nicholas & David Wilkins. 2000. In the mind’s ear: The semantic extensions of perception verbs in Australian languages. Language 76(3). 546. https://doi.org/10.2307/417135.

Floyd, Simeon, Lila San Roque & Asifa Majid. 2018. Smell is coded in grammar and frequent in discourse: Cha’palaa olfactory language in cross-linguistic perspective. Journal of Linguistic Anthropology 28(2). 175–196. https://doi.org/10.1111/jola.12190.

Foley, William A. 1986. The Papuan languages of New Guinea (Cambridge Language Surveys). Cambridge; New York: Cambridge University Press.

François, Alexandre. 2008. Semantic maps and the typology of colexification: Intertwining polysemous networks across languages. In Martine Vanhove (ed.), From polysemy to semantic change: Towards a typology of lexical semantic associations, 163–215. Amsterdam/Philadelphia: John Benjamins. https://doi.org/10.1075/slcs.106.09fra.

Galac, Ádám. 2020. Semantic change of basic perception verbs in English, German, French, Spanish, Italian, and Hungarian. Argumentum 16. 125–146. https://doi.org/10.34103/ARGUMENTUM/2020/9.

Georgakopoulos, Thanasis, Eitan Grossman, Dmitry Nikolaev & Stéphane Polis. 2022. Universal and macro-areal patterns in the lexicon: A case-study in the perception-cognition domain. Linguistic Typology 26(2). 439–487. https://doi.org/10.1515/lingty-2021-2088.

Georgakopoulos, Thanasis & Stéphane Polis. 2018. The semantic map model: State of the art and future avenues for linguistic research. Language and Linguistics Compass 12(2). e12270. https://doi.org/10.1111/lnc3.12270.

Georgakopoulos, Thanasis & Stéphane Polis. 2022. New avenues and challenges in semantic map research (with a case study in the semantic field of emotions). Zeitschrift für Sprachwissenschaft 41(1). 1–30. https://doi.org/10.1515/zfs-2021-2039.

Gibson, Edward, Richard Futrell, Steven P. Piantadosi, Isabelle Dautriche, Kyle Mahowald, Leon Bergen & Roger Levy. 2019. How efficiency shapes human language. Trends in Cognitive Sciences 23(5). 389–407. https://doi.org/10.1016/j.tics.2019.02.003.

Givón, Talmy. 2001. Syntax: An introduction, Rev. edn. Amsterdam; Philadelphia: J. Benjamins. https://doi.org/10.1075/z.syn2.

Györi, Gábor. 2006. Semantic change and cognition. Cognitive Linguistics 13(2). 123–166. https://doi.org/10.1515/cogl.2002.012.

Hammarström, Harald & Mark Donohue. 2014. Some principles on the use of macro-areas in typological comparison. Language Dynamics and Change 4(1). 167–187. https://doi.org/10.1163/22105832-00401001.

Hammarström, Harald & Robert Forkel. 2022. Glottocodes: Identifiers linking families, languages and dialects to comprehensive reference information. Semantic Web Journal 13(6). 917–924. https://doi.org/10.3233/sw-212843.

Harmon, Zara & Vsevolod Kapatsinski. 2017. Putting old tools to novel uses: The role of form accessibility in semantic extension. Cognitive Psychology 98. 22–44. https://doi.org/10.1016/j.cogpsych.2017.08.002.

Haspelmath, Martin. 2021. Explaining grammatical coding asymmetries: Form–frequency correspondences and predictability. Journal of Linguistics 57(3). 605–633. https://doi.org/10.1017/S0022226720000535.

Haspelmath, Martin. 2023. Defining the word. WORD 69(3). 283–297. https://doi.org/10.1080/00437956.2023.2237272.

Holmer, Sonja. 2021. The language of vision in four Aslian speech communities: An introductory investigation of basic vision verbs. Lund: Lund University BA thesis.

Honeyman, Thomas. 2017. A grammar of Momu, a language of Papua New Guinea. Canberra: Australian National University Doctoral Dissertation.

Huumo, Tuomas. 2010. Is perception a directional relationship? On directionality and its motivation in Finnish expressions of sensory perception. Linguistics 48(1). 49–97. https://doi.org/10.1515/ling.2010.002.

Ibarretxe-Antuñano, Iraide. 1999. Polysemy and metaphor in perception verbs: A cross-linguistic study. Edinburgh: University of Edinburgh Doctoral Dissertation.

Jackson, Joshua Conrad, Joseph Watts, Teague R. Henry, Johann-Mattis List, Robert Forkel, Peter J. Mucha, Simon J. Greenhill, Russell D. Gray & Kristen A. Lindquist. 2019. Emotion semantics show both cultural variation and universal structure. Science 366(6472). 1517–1522. https://doi.org/10.1126/science.aaw8160.

Karjus, Andres, Richard A. Blythe, Simon Kirby, Tianyu Wang & Kenny Smith. 2021. Conceptual similarity and communicative need shape colexification: An experimental study. Cognitive Science 45(9). e13035. https://doi.org/10.1111/cogs.13035.

Kimball, Geoffrey D. 1994. Koasati dictionary (Studies in the Anthropology of North American Indians). Lincoln: University of Nebraska Press.

Koptjevskaja-Tamm, Maria, Antoinette Schapper & Felix Ameka (eds.). 2022. Areal typology of lexico-semantics (special issue). Linguistic Typology, 26(2).

Koptjevskaja-Tamm, Maria & Ljuba N. Veselinova. 2020. Lexical typology in morphology. Oxford research encyclopedia of linguistics. Oxford: Oxford University Press. https://doi.org/10.1093/acrefore/9780199384655.013.522.

Krishna, P. Phani, S. Arulmozi & Ramesh Kumar Mishra. 2022. Do you see and hear more? A study on Telugu perception verbs. Journal of Psycholinguistic Research 51(3). 473–484. https://doi.org/10.1007/s10936-021-09827-7.

Kundama, John, Adéru Sapayé & Patricia R. Wilson. 2006. Ambulas dictionary. SIL Archives. https://pnglanguages.sil.org/resources/archives/31170.

Lee, Keedong. 1993. A Korean grammar on semantic-pragmatic principles. Seoul: Hankwuk Munhwasa (Korea Press).

Lemoine, Nathan P. 2019. Moving beyond noninformative priors: Why and how to choose weakly informative priors in Bayesian analyses. Oikos 128(7). 912–928. https://doi.org/10.1111/oik.05985.

Levshina, Natalia. 2022. Communicative efficiency: Language structure and use. Cambridge: Cambridge University Press. https://doi.org/10.1017/9781108887809.

List, Johann-Mattis. 2023. Inference of partial colexifications from multilingual wordlists. Frontiers in Psychology 14. 1156540. https://doi.org/10.3389/fpsyg.2023.1156540.

List, Johann-Mattis, Robert Forkel, Simon J. Greenhill, Christoph Rzymski, Johannes Englisch & Russell D. Gray. 2022. Lexibank, a public repository of standardized wordlists with computed phonological and lexical features. Scientific Data 9(1). 316. https://doi.org/10.1038/s41597-022-01432-0.

Lynott, Dermot, Louise Connell, Marc Brysbaert, James Brand & James Carney. 2020. The Lancaster sensorimotor norms: Multidimensional measures of perceptual and action strength for 40,000 English words. Behavior Research Methods 52(3). 1271–1291. https://doi.org/10.3758/s13428-019-01316-z.

Majid, Asifa & Nicole Kruspe. 2018. Hunter-gatherer olfaction is special. Current Biology 28(3). 409–413.e2. https://doi.org/10.1016/j.cub.2017.12.014.

Martin, Samuel E., Yang-Ha Lee & Sung-Un Chang. 1967. A Korean-English dictionary. New Haven: Yale University Press.

Masini, Francesca, Simone Mattiola & Steve Pepper. 2022. Exploring complex lexemes cross-linguistically. In Steve Pepper, Francesca Masini & Simone Mattiola (eds.), Binominal lexemes in cross-linguistic perspective, 1–20. Berlin/Boston: De Gruyter. https://doi.org/10.1515/9783110673494-001.

Maslova, Elena. 2004. A universal constraint on the sensory lexicon, or when hear can mean ‘see’? In Alejsandr P. Volodin (ed.), Tipologičeskie obosnovanija v grammatike: K 70-letiju professora Xrakovskogo V.S. [Typological knowledge in grammar: On the occasion of Professor Khrakovsky’s 70th birthday], 300–312. Moscow: Znak.

McElreath, Richard. 2020. Statistical rethinking: A Bayesian course with examples in R and Stan. New York: Chapman and Hall/CRC. https://doi.org/10.1201/9780429029608.

Miestamo, Matti, Dik Bakker & Antti Arppe. 2016. Sampling for variety. Linguistic Typology 20(2). 233–296. https://doi.org/10.1515/lingty-2016-0006.

Miklashevsky, Alex. 2018. Perceptual experience norms for 506 Russian nouns: Modality rating, spatial localization, manipulability, imageability and other variables. Journal of Psycholinguistic Research 47(3). 641–661. https://doi.org/10.1007/s10936-017-9548-1.

Morucci, Piermatteo, Roberto Bottini & Davide Crepaldi. 2019. Augmented modality exclusivity norms for concrete and abstract Italian property words. Journal of Cognition 2(1). 42. https://doi.org/10.5334/joc.88.Search in Google Scholar

Murphy, M. Lynne. 2010. Lexical meaning (Cambridge Textbooks in Linguistics). Cambridge/New York: Cambridge University Press.Search in Google Scholar

Nakagawa, Hirosi. 2012. The importance of TASTE verbs in some Khoe languages. Linguistics 50(3). 395–420. https://doi.org/10.1515/ling-2012-0014.Search in Google Scholar

Nichols, Johanna. 1992. Linguistic diversity in space and time. Chicago: University of Chicago Press.10.7208/chicago/9780226580593.001.0001Search in Google Scholar

Norcliffe, Elisabeth & Asifa Majid. 2024. Verbs of perception: A quantitative typological study. Language 100(1). 81–123. https://doi.org/10.1353/lan.2024.a922000.Search in Google Scholar

Nozawa, Michiyo. 2006. Mende triglot dictionary. Ukarumpa: Summer Institute of Linguistics.Search in Google Scholar

O’Meara, Carolyn, Susan Smythe Kung & Asifa Majid. 2019. The challenge of olfactory ideophones: Reconsidering ineffability from the Totonac-Tepehua perspective. International Journal of American Linguistics 85(2). 173–212. https://doi.org/10.1086/701801.

Palmer, Stephen E. 1999. Vision science: Photons to phenomenology. Cambridge, Mass: MIT Press.

Pawley, Andrew. 2020. The depiction of sensing events in English and Kalam. In Helen Bromhead & Zhengdao Ye (eds.), Meaning, life and culture: In conversation with Anna Wierzbicka, 381–402. Canberra: ANU Press. https://doi.org/10.2307/j.ctv1d5nm0d.26.

Pedersen, Thomas Lin. 2021. ggraph: An implementation of grammar of graphics for graphs and networks. R Package version 2.0.5. Available at: https://CRAN.R-project.org/package=ggraph.

Petterson, Robert G. 1999. Rumu—English—Hiri-Motu dictionary/Rumuhei—Hohei—Mutuhei Hei Ke Tei Kopatë (Occasional Paper No. 6). Palmerston North, New Zealand: IPU New Zealand Tertiary Institute (formerly International Pacific College).

R Core Team. 2022. R: A language and environment for statistical computing. Vienna: R Foundation for Statistical Computing.

Rzymski, Christoph, Tiago Tresoldi, Simon J. Greenhill, Mei-Shin Wu, Nathanael E. Schweikhard, Maria Koptjevskaja-Tamm, Volker Gast, Timotheus A. Bodt, Abbie Hantgan, Gereon A. Kaiping, Sophie Chang, Yunfan Lai, Natalia Morozova, Heini Arjava, Nataliia Hübler, Ezequiel Koile, Steve Pepper, Mariann Proos, Briana Van Epps, Ingrid Blanco, Carolin Hundt, Sergei Monakhov, Kristina Pianykh, Sallona Ramesh, Russell D. Gray, Robert Forkel & Johann-Mattis List. 2020. The database of cross-linguistic colexifications, reproducible analysis of cross-linguistic polysemies. Scientific Data 7(1). 13. https://doi.org/10.1038/s41597-019-0341-x.

San Roque, Lila. 2010. The grammar of perception in Duna. Nijmegen. Paper presented at the Language and Cognition Department, Max Planck Institute for Psycholinguistics, Nijmegen.

San Roque, Lila, Kobin H. Kendrick, Elisabeth Norcliffe, Penelope Brown, Rebecca Defina, Mark Dingemanse, Tyko Dirksmeyer, N. J. Enfield, Simeon Floyd, Jeremy Hammond, Giovanni Rossi, Sylvia Tufvesson, Saskia van Putten & Asifa Majid. 2015. Vision verbs dominate in conversation across cultures, but the ranking of non-visual verbs varies. Cognitive Linguistics 26(1). 31–60. https://doi.org/10.1515/cog-2014-0089.

San Roque, Lila, Kobin H. Kendrick, Elisabeth Norcliffe & Asifa Majid. 2018. Universal meaning extensions of perception verbs are grounded in interaction. Cognitive Linguistics 29(3). 371–406. https://doi.org/10.1515/cog-2017-0034.

Schapper, Antoinette. 2021. Baring the bones: The lexico-semantic association of bone with strength in Melanesia and the study of colexification. Linguistic Typology 26(2). 313–347. https://doi.org/10.1515/lingty-2021-2082.

Schieffelin, Bambi B. & Steven Feld. 1988. Bosavi-English-Tok Pisin dictionary/Bosabi towo: liya: Ingilis towo: liya: Pisin towo: liya: bugo/Tok Ples Bosavi, Tok Inglis, na Tok Pisin diksineli (Pacific Linguistics Series C 153). Canberra: Pacific Linguistics.

Schürmann, Martin, Gina Caetano, Veikko Jousmäki & Riitta Hari. 2004. Hands help hearing: Facilitatory audiotactile interaction at low sound-intensity levels. The Journal of the Acoustical Society of America 115(2). 830–832. https://doi.org/10.1121/1.1639909.

Speed, Laura J. & Asifa Majid. 2017. Dutch modality exclusivity norms: Simulating perceptual modality in space. Behavior Research Methods 49(6). 2204–2218. https://doi.org/10.3758/s13428-017-0852-3.

Strik Lievers, Francesca & Bodo Winter. 2018. Sensory language across lexical categories. Lingua 204. 45–61. https://doi.org/10.1016/j.lingua.2017.11.002.

Suzuki, Yuika, Jiro Gyoba & Shuichi Sakamoto. 2008. Selective effects of auditory stimuli on tactile roughness perception. Brain Research 1242. 87–94. https://doi.org/10.1016/j.brainres.2008.06.104.

Sweetser, Eve. 1990. From etymology to pragmatics: Metaphorical and cultural aspects of semantic structure (Cambridge Studies in Linguistics 54). Cambridge: Cambridge University Press. https://doi.org/10.1017/CBO9780511620904.

Tarpent, Marie-Lucie. 1989. A grammar of the Nisgha language. British Columbia: University of Victoria.

Tchantourian, Revaz & Karina Vamling. 2005. Basic verb frequency in Megrelian. Lund Working Papers in Linguistics 51. 199–207.

Tjuka, Annika. 2024. Objects as human bodies: Cross-linguistic colexifications between words for body parts and objects. Linguistic Typology 28(3). 379–418. https://doi.org/10.1515/lingty-2023-0032.

Traill, Anthony. 1994. A !Xóõ dictionary. Köln: Rüdiger Köppe Verlag.

Traugott, Elizabeth Closs & Richard B. Dasher. 2002. Regularity in semantic change. Cambridge: Cambridge University Press. https://doi.org/10.1017/CBO9780511486500.

Urban, Matthias. 2011. Asymmetries in overt marking and directionality in semantic change. Journal of Historical Linguistics 1(1). 3–47. https://doi.org/10.1075/jhl.1.1.02urb.

Urban, Matthias. 2012. Analyzability and semantic associations in referring expressions: A study in comparative lexicology. Leiden: Leiden University Doctoral Dissertation.

van Putten, Saskia. 2020. Perception verbs and the conceptualization of the senses: The case of Avatime. Linguistics 58(2). 1–38. https://doi.org/10.1515/ling-2020-0039.

Vanhove, Martine. 2008. Semantic associations between sensory modalities, prehension and mental perceptions. In Martine Vanhove (ed.), From polysemy to semantic change: Towards a typology of lexical semantic associations, 341–370. Amsterdam: John Benjamins. https://doi.org/10.1075/slcs.106.17van.

Vasishth, Shravan, Bruno Nicenboim, Mary E. Beckman, Fangfang Li & Eun Jong Kong. 2018. Bayesian data analysis in the phonetic sciences: A tutorial introduction. Journal of Phonetics 71. 147–161. https://doi.org/10.1016/j.wocn.2018.07.008.

Vergallito, Alessandra, Marco Alessandro Petilli & Marco Marelli. 2020. Perceptual modality norms for 1,121 Italian words: A comparison with concreteness and imageability scores and an analysis of their impact in word processing tasks. Behavior Research Methods 52(4). 1599–1616. https://doi.org/10.3758/s13428-019-01337-8.

Viberg, Åke. 1984. The verbs of perception: A typological study. In Brian Butterworth, Bernard Comrie & Östen Dahl (eds.), Explanations for Language Universals, 123–162. Boston: De Gruyter Mouton.

Viberg, Åke. 1993. Crosslinguistic perspectives on lexical organization and lexical progression. In Kenneth Hyltenstam & Åke Viberg (eds.), Progression & regression in language: Sociocultural, neuropsychological, & linguistic perspectives, 340–385. Cambridge: Cambridge University Press.

Viberg, Åke. 2001. The verbs of perception. In Martin Haspelmath, Ekkehard König, Wulf Oesterreicher & Wolfgang Raible (eds.), Language typology and language universals. An international handbook, vol. 2, 1294–1309. Berlin: De Gruyter.

Viberg, Åke. 2008. Swedish verbs of perception from a typological and contrastive perspective. In María de los Ángeles Gómez González, J. Lachlan Mackenzie & Elsa M. González Álvarez (eds.), Languages and cultures in contrast and comparison, 123–172. Amsterdam: John Benjamins. https://doi.org/10.1075/pbns.175.09vib.

Voinov, Vitaly. 2013. ‘Seeing’ is ‘trying’: The relation of visual perception to attemptive modality in the world’s languages. Language and Cognition 5(1). 61–80. https://doi.org/10.1515/langcog-2013-0003.

Waltz, Nathan E. 2007. Diccionario bilingüe: Wanano o guanano-español/Español-wanano o guanano [Bilingual dictionary: Wanano or Guanano-Spanish/Spanish-Wanano or Guanano]. Bogotá: Editorial Fundación para el Desarrollo de los Pueblos Marginados.

Weimer, Harry & Natalia Weimer. 1964. Yareba language (Dictionaries of Papua New Guinea 2). Ukarumpa: Summer Institute of Linguistics.

Wickham, Hadley, Mara Averick, Jennifer Bryan, Winston Chang, Lucy McGowan, Romain François, Garrett Grolemund, Alex Hayes, Lionel Henry, Jim Hester, Max Kuhn, Thomas Pedersen, Evan Miller, Stephan Bache, Kirill Müller, Jeroen Ooms, David Robinson, Dana Seidel, Vitalie Spinu, Kohske Takahashi, Davis Vaughan, Claus Wilke, Kara Woo & Hiroaki Yutani. 2019. Welcome to the Tidyverse. Journal of Open Source Software 4(43). 1686. https://doi.org/10.21105/joss.01686.

Wilkins, David. 1997. Handsigns and hyperpolysemy: Exploring the cultural foundations of a semantic association. In Darrell T. Tryon & Michael Walsh (eds.), Boundary rider: Essays in honour of Geoffrey O’Grady, 413–444. Canberra: Pacific Linguistics.

Winter, Bodo. 2016. Taste and smell words form an affectively loaded and emotionally flexible part of the English lexicon. Language, Cognition and Neuroscience 31(8). 975–988. https://doi.org/10.1080/23273798.2016.1193619.

Winter, Bodo. 2019. Sensory linguistics: Language, perception and metaphor (Converging Evidence in Language and Communication Research). Amsterdam: John Benjamins. https://doi.org/10.1075/celcr.20.

Winter, Bodo, Marcus Perlman & Asifa Majid. 2018. Vision dominates in perceptual language: English sensory vocabulary is optimized for usage. Cognition 179. 213–220. https://doi.org/10.1016/j.cognition.2018.05.008.

Winter, Bodo, Marcus Perlman, Lynn K. Perry & Gary Lupyan. 2017. Which words are most iconic?: Iconicity in English sensory words. Interaction Studies. Social Behaviour and Communication in Biological and Artificial Systems 18(3). 443–464. https://doi.org/10.1075/is.18.3.07win.

Winter, Bodo & Mahesh Srinivasan. 2022. Why is semantic change asymmetric? The role of concreteness and word frequency in metaphor and metonymy. Metaphor and Symbol 37(1). 39–54. https://doi.org/10.1080/10926488.2021.1945419.

Wnuk, Ewelina & Asifa Majid. 2014. Revisiting the limits of language: The odor lexicon of Maniq. Cognition 131(1). 125–138. https://doi.org/10.1016/j.cognition.2013.12.008.

Xu, Yang, Khang Duong, Barbara C. Malt, Serena Jiang & Mahesh Srinivasan. 2020. Conceptual relations predict colexification across languages. Cognition 201. 104280. https://doi.org/10.1016/j.cognition.2020.104280.

Youn, Hyejin, Logan Sutton, Eric Smith, Cristopher Moore, Jon F. Wilkins, Ian Maddieson, William Croft & Tanmoy Bhattacharya. 2016. On the universal structure of human lexical semantics. Proceedings of the National Academy of Sciences 113(7). 1766–1771. https://doi.org/10.1073/pnas.1520752113.

Zipf, George Kingsley. 1935. The psycho-biology of language: An introduction to dynamic philology. Boston: Houghton Mifflin.

Received: 2023-05-22

Accepted: 2024-04-22

Published Online: 2024-08-05

Published in Print: 2024-10-28

This work is licensed under the Creative Commons Attribution 4.0 International License.

https://doi.org/10.1515/lingty-2023-0038

Keywords for this article

perception verbs; colexification; lexical typology; sensory language; word formation
