Article Open Access

A crosslinguistic study of conditions on argument indexing

  • Katherine Walker and Eva van Lier
Published/Copyright: January 20, 2026
From the journal Linguistics

Abstract

This paper presents a systematic study of conditional argument indexing in 83 genealogically diverse languages. ‘Indexing’ means a form of marking (affixal, clitic, or non-linear) on a verb that expresses one or more features (person, number, gender/noun class) of an argument. ‘Conditional argument indexing’ is when a particular argument is not indexed (or indexed differently) under certain conditions. We study all three (in)transitive core arguments and a wide range of conditions, including referent properties (e.g., person, animacy, discourse prominence), TAMEP (tense, aspect, mood, evidentiality, polarity), event semantics, lexical factors, and co-occurrence of an independent argument. Our results show that conditional indexing systems are typically conditioned by more than one factor and that lexical restrictions are often involved. Also, while conditional indexing is attested in similar proportions across the three argument roles, it tends to be triggered by different factors for each role. Unlike in earlier studies of differential argument flagging, this study shows that referent properties have similar effects on argument indexing for each role, in terms of the values associated with the presence or absence of indexing. These findings have implications for the communicative function and grammaticalization of (conditional) indexing.

1 Introduction

This paper represents, to the best of our knowledge, the first attempt to provide a systematic crosslinguistic study of conditional argument indexing across a large sample of languages, taking into account all three (in)transitive core arguments (S, A, P)[1] and a wide range of conditions, including referent properties (e.g., person, animacy, discourse prominence), TAMEP (tense, aspect, mood, evidentiality, polarity), event semantics, lexical factors, and co-occurrence of an independent argument.

Using a genealogically varied sample of 83 languages, we draw on reference grammars to record if and how S, A, and P are indexed in each language, detailing the types of indexing system and any factors that condition them. Our results demonstrate that conditional indexing systems are typically multi-conditional (i.e., conditioned by more than one factor) and that lexical restrictions often play a role. Also, while conditional indexing is attested in similar proportions across the three argument roles, it tends to be triggered by different factors for S, A, and P. Yet, whenever referent properties are involved, they have similar effects on each role in terms of the values associated with indexing or lack thereof. We connect our results to existing literature on conditional (and differential) indexing and differential argument marking more generally, in particular to the communicative function and grammaticalization of indexing.

The paper is structured as follows: in Section 2, we draw on previous research to lay the groundwork for our study, providing a definition of the topic under investigation in Section 2.1 and further description of condition types in Section 2.2. This review yields a number of hypotheses, stated in the relevant subsections. Our research questions are presented in Section 2.3. We explain our methodology in Section 3 and present our results in Section 4. Finally, in Section 5, we discuss implications of the results, and summarize and conclude the paper.

2 Setting the scene

2.1 Defining (conditional) indexing

‘Indexing’ refers to marking on a verb that expresses features (person, number, gender/noun class) of an argument. An example is the 3sg suffix -s in English, which indexes S (it squeak-s) and A (it make-s a noise). Our definition of indexing is broad: it encompasses ‘bound person forms’ (which may include number and/or gender features; Haspelmath 2013) and markers of gender and/or number only (following, e.g., Iemmolo 2011; Nichols 2018) as well as what is elsewhere defined as (grammatical or anaphoric) agreement (cf., e.g., Bresnan and Mchombo 1987; Corbett 2006), among others. In the present study, we investigate indexing of the core arguments S, A, and P only.

In terms of form, indexing in our definition encompasses affixes (person/number prefixes in (1)),[2] clitics (person/number proclitics in (2)),[3] and non-linear marking such as stem ablauting (number indexing in (3)). That is, indexing implies some kind of formal change that is phonologically or strictly positionally dependent on a verbal host or changes the form of the verbal host itself. We define ‘verbal host’ broadly to include any element of a (potentially multi-unit) verbal predicate.[4] In (3), for example, the number of S is indexed through ablaut on the lexical verb (a suffix on the auxiliary indexes person).[5]

(1)
Kamang (Alor-Pantar)
Leon na-tak-si
Leon 1sg.i-see-ipfv
‘Leon sees me.’
(Schapper 2014: 254)
(2)
Bulgarian (Balto-Slavic, Indo-European)
Kuče-to ja=goni kotka-ta
dog-art.sg.n 3sg.f.acc=chase.prs.3sg cat-art.sg.f
‘The dog chases the cat.’
(Compensis 2022: 30)
(3)
Iha (Greater West Bomberai)
a.
mehén te-we
sit.sg aux-3.prs
‘s/he is sitting’
b.
mihí te-we
sit.pl aux-3.prs
‘they are sitting’
(Walker and Himmelmann in press)

‘Conditional argument indexing’ refers to a system in which a particular argument is not indexed, or is indexed differently (i.e., with an index from a different paradigm) under certain conditions. To expand on the English example, S/A argument indexing is conditioned by a referent property (the feature of person/number: it squeaks vs I/you/we/they squeak[-ø]) and a tense condition: indexing occurs in the simple present, but not in the simple past (it/I/you etc. squeaked[-ø]). As elaborated below, it is common for conditional indexing systems to be multi-conditional, as in English.
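The two conditions on English S/A indexing described above can be sketched as a small function. This is an illustrative sketch only: the function name and the string values for number and tense are hypothetical, and it covers just the simple present and simple past of regular lexical verbs.

```python
def english_index(person: int, number: str, tense: str) -> str:
    """Return the S/A index on a regular English verb: "-s" or "" (zero).

    Hypothetical encoding: person is 1, 2 or 3; number is "sg" or "pl";
    tense is "present" or "past".
    """
    # Tense condition: indexing occurs only in the simple present.
    if tense != "present":
        return ""
    # Referent property condition: only 3sg is indexed.
    if person == 3 and number == "sg":
        return "-s"
    return ""


# it squeak-s vs. I squeak vs. it squeaked (no index in the past)
for person, number, tense in [(3, "sg", "present"), (1, "sg", "present"), (3, "sg", "past")]:
    print(person, number, tense, repr(english_index(person, number, tense)))
```

Because the index is absent unless both conditions are met, the system counts as multi-conditional in the sense used here.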

The definition given above introduces several notions that require clarification. When a particular argument is indexed differently, it is a symmetrical system; when it is not indexed under certain conditions, it is an asymmetrical system (Iemmolo 2013). In this article, we discuss only the latter: asymmetrical conditional argument indexing.[6] For ease of reference, we will henceforth use ‘conditional indexing’ as a shorthand for asymmetrical conditional argument indexing. Note, too, that we use ‘zero’ in a theoretically neutral way to refer to the absence of an index; that is, we do not differentiate between ‘absolute zero’, ‘paradigmatic zero’ and ‘zero allomorphs’ (Siewierska 2009: 429).

Conditional indexing contrasts with a canonical indexing system (non-conditional or ‘full’ indexing), in which a dedicated paradigm of forms indexes a particular argument role on every verb, regardless of context (cf. Corbett 2006: 26).[7] A paradigm is a closed set of indexes that specifies one or more features (person/gender/number) of the referent in a particular argument role. Indexes can also express additional categories, such as other referent properties (e.g., animacy, definiteness, topicality) or TAMEP categories. An indexing system may consist of several paradigms, for instance in the case of person/number indexes fused with tense values. The ‘system’ is defined by the argument role (or roles, in the case of portmanteau forms) and the features indexed; hence, a language with different person/number indexing paradigms for S/A arguments for different tenses has multiple paradigms in a single indexing system.[8] A language which independently indexes different feature sets for a single role (e.g., separate person and number indexing) has multiple systems for a single role.

Our notion of conditional indexing (partially) overlaps with various terms discussed in earlier studies, both typological and language- or family-specific. In essence, conditional indexing is a type of ‘differential argument marking’ (DAM),[9] which is broadly defined as: “Any kind of situation where an argument of a predicate bearing the same generalized semantic argument role may be coded in different ways, depending on factors other than the argument role itself, and which is not licensed by diathesis alternations” (Witzlack-Makarevich and Seržant 2018: 3). DAM encompasses indexing as well as flagging (case or adpositional marking). In other words, DAM includes both Differential Argument Flagging (DAF) and Differential Argument Indexing (DAI).

DAI, then, is perhaps the closest terminological neighbour of conditional indexing, and it is for this reason that we draw on DAI literature. As with DAM in general, the DAI literature has tended to focus on referent property conditions on P marking, with ‘differential object indexing’ (DOI) as the best-known subtype. Coined by Iemmolo (2011), the label DOI is used, for example, by Haig (2018) and Just (2022, 2024). In addition to these crosslinguistic studies, the existence of various language- and family-specific analyses of DOI/DAI suggests that the phenomenon exhibits a broad areal and genealogical spread.[10]

As mentioned above, P (or ‘object’) is the role most frequently studied in relation to DAI. Just (2022, 2024) is a rare exception, providing a typological view on differential indexing of both A and P (her sample is not discussed), and claiming that differential A indexing is not uncommon, despite the focus on P. While Just (2022) uses the term ‘differential subject indexing’ to refer to S/A, the conditional indexing of S alone is generally discussed as a separate phenomenon, called split-/fluid-S, active/stative or semantic alignment (see, e.g., Dixon 1994; Donohue and Wichmann 2008; Mithun 1991). This is perhaps due to the fact that, first, split-S systems “appear to be most frequent in languages which mark argument relations directly on the verb rather than as case on argument nouns” (Næss 2007: 168), so that there is less comparison to be made with differential flagging. Second, factors unrelated to referent properties are pervasive: typically, differential S indexing is conditioned by lexical factors, event semantics, or a combination of both. For instance, in the multi-conditional indexing system in Kamang, indexing on some verbs is partially conditioned by event semantics. Compared with no indexing, as in (4a), the presence of indexing by prefixes, as in (4b), shifts the viewpoint to the middle and end of the event, thereby emphasizing affectedness of the S argument (see Schapper 2014; Walker 2024b).

(4)
Kamang (Alor-Pantar)
a.
kui tak
dog run
‘The dog runs.’
b.
kui ge-tak
dog 3.iii-run
‘The dog ran off (was forced to run).’
(Schapper 2014: 326)

Our purpose in introducing the term conditional indexing is to broaden the field of study beyond referent property-conditioned DAI to include systems conditioned by other factors.[11] The additional factors we focus on here, described in the following subsections, are TAMEP (considered a condition in some definitions of DAM), lexical restrictions (termed ‘sporadic agreement’: Corbett 2006: 17; Fedden 2019), event semantics (for S indexing, often called split-S or semantic alignment), and co-occurrence of an overt independent argument (cf. grammatical vs anaphoric agreement; Bresnan and Mchombo 1987). Gathering these factors under the umbrella of conditional indexing thus unites what have previously been separate strands of literature.

One motivation for including these different factors is that, when indexing systems are considered as a whole, they are often multi-conditional. That is, studies on DAI often focus on sub-parts of an entire system that contains multiple splits. And multiple splits may be conditioned by different factors: in Kamang, there are various subsystems conditioned by lexical conditions (i.e., verb class), animacy (see example (5) below), and event semantics (example (4)), as well as unclear triggers. By taking a holistic view that considers multiple conditions at work within a single system, this paper contributes to the continued discussion of the function of indexing in general. To do so, we address the following broad question: What (combinations of) factors condition what type of indexing? Here, ‘types’ of indexing refers to how different conditions correlate with argument role (S, A, P) and the expression of argument features (person/number/gender).

2.2 Condition types

2.2.1 Referent property conditions

Referent property conditions include features (person/number/gender), animacy, and a cluster of other referent properties labelled ‘discourse-related factors’, which include definiteness, topicality, and focus (Just 2024: 298; Ozerov 2018). Indexing systems conditioned by person (possibly combined with number and/or gender) are well known: consider the English example in which 3sg subjects (S and A arguments) are indexed by -s in the present tense, while all other person/number values are not indexed.

An example of animacy-conditioned P indexing is given in (5) from Kamang: faafa ‘search for’ in (5a) has a prefix that indexes a human P argument, whereas in (5b) the same verb has no index for a non-human P.

(5)
Kamang (Alor-Pantar)
a.
ge-dum=a ga-faafa
3.iii-child=spec 3.i-search.for
‘…[she] kept looking for the child.’
b.
taweng te-bini faafa
in.turns cmn.iii-lice search.for
‘…[they] search for each other’s lice.’
(Walker et al. 2024: 294)

Bulgarian (Balto-Slavic, Indo-European) has conditional P indexing triggered by discourse-related factors, illustrated by the proclitic ja= in (6b), which is absent in (6a). Compensis (2022) links the use of the clitic to discourse prominence, a cluster concept that “cuts across the categories of topic and focus” (Meakins and O’Shannessy 2010: 1704; see, e.g., Himmelmann and Primus 2015, as well as Riesberg 2018 on discourse prominence as a condition on differential flagging). Compensis describes the function of the clitic in conjunction with a conominal (an overt coreferential independent argument; Haspelmath 2013) as “(re)establishing” or “elevating the discourse prominence status” of the P argument (Compensis 2022: 248).

(6)
Bulgarian (Balto-Slavic, Indo-European)
a.
Kuče-to goni kotka-ta
dog-art.sg.n chase.prs.3sg cat-art.sg.f
‘The dog chases the cat.’
b.
Kuče-to ja=goni kotka-ta
dog-art.sg.n 3sg.f.acc=chase.prs.3sg cat-art.sg.f
‘The dog chases the cat.’
(Compensis 2022: 30)

Referent property conditions are frequently definitional for studies on DAI, most of which investigate differential P indexing. Indeed, Haig (2018), following Siewierska (1999, 2004), claims that P indexing gravitates towards being conditional, while S/A indexing tends to be non-conditional.[12] Part of the reason, according to Haig, lies in differences in grammaticalization: since P arguments are typically associated with discourse-new information, they are less often pronominal than A arguments. This relatively low pronoun rate for P means less opportunity for development into bound forms compared to S/A. The hypothesis that follows from this frequency-based account is that P indexing is less common than S/A indexing overall and, where it does exist, it is more likely to be conditional.

The existence of a dedicated body of literature on referent property conditions (i.e., DAI), particularly for P arguments, leads to the hypothesis that such conditions will be frequent overall and most frequently attested in P indexing compared to S and A. But note that some scholars would not consider so-called ‘paradigmatic’ zeroes (e.g., 3sg is zero) instances of conditional indexing for S and A. Since we do include such cases (as indexing conditioned by ‘feature’), this may lead to finding referent property conditions for S and A more frequently than has been reported in previous literature. Concerning the indexed features of the argument, referent property conditions are expected to be more frequent for person-only P indexing systems compared to systems that (also) index number and/or gender. Haig (2018) argues that person indexing for P arguments is relatively uninformative, given that in discourse P arguments are overwhelmingly third-person (see also Haig et al. 2021).[13],[14] Since the ‘uninformativeness’ of person applies to P but not to S and A arguments, we do not expect a similar result for the latter roles.

The specific effect of referent property conditions on indexing is expected to correlate with an argument’s position on referential hierarchies (or ‘scales’), of which various versions have been proposed and discussed in the literature (see, e.g., Haspelmath 2021; Silverstein 1976; Timberlake 1977, among many others). The hierarchies relevant to our data are shown in (7): The general hypothesis is that the higher an argument is on any of the scales (i.e., the further left), the more likely it is to be indexed. Conversely, the lower it is (i.e., the further right), the more likely the argument is not to be indexed; that is, to be zero.

(7)
Scales of referential prominence
person: 1/2 > 3
animacy: human (> animal) > inanimate
discourse-related: definite (> specific indefinite) > indefinite non-specific
                   discourse-given/topical > discourse-new/non-topical
                   high prominence > low prominence

Importantly, the hypothesis is that the effect direction is the same for each argument role: for both A and P arguments, chances for indexation increase as referential prominence increases. This was already observed by Siewierska (2004: 149) and has been restated (for P arguments only) by Iemmolo (2011), and (for both A and P arguments) by Just (2022, 2024). We assume that the same holds for S.

The functional explanation proposed for this unified effect of referent property conditions on each argument role concerns reference tracking: speakers are more likely to track prominent arguments, regardless of their role in a given clause (e.g., Iemmolo 2011: 50). This synchronic function of indexes as reference trackers may result from the grammaticalization of erstwhile reference-tracking devices: pronouns, which are used for referentially prominent arguments (Just 2022, 2024: 20). (But note the absence in this account of zero [no overt independent form] as the most efficient coding choice for the most prominent referents.)

As first noted by Comrie (1979: 20), the unified directionality for indexing contrasts with differential flagging, in which the effect works in opposite directions for A and P: additional flagging tends to appear on referentially non-prominent (indefinite, inanimate, focal) A arguments, but on referentially prominent (definite, animate, topical) P arguments.[15] The proposed functional explanation for this is “expectation sensitivity: speakers add extra coding material when a meaning is unexpected in its context” (Haspelmath 2023); that is, when an A argument is unexpectedly non-prominent or a P argument is unexpectedly prominent (see Haspelmath 2021, 2023; Levshina 2022). As Haspelmath (2023) concedes, the fact that the ‘expected’ constellation of prominent A arguments triggers indexing constitutes a major exception to frequency-based explanations of asymmetrical coding in grammar.

Note that number is not included in the referential prominence hierarchies in (7). Literature on referential hierarchies is rarely explicit on the effect of number, other than the fact that arguments higher on the scale are more likely to show more number distinctions (Corbett 2000). This is probably because coding efficiency and referential prominence exert conflicting pressures: On the one hand, coding efficiency explains the crosslinguistic tendency for singular to remain unmarked while non-singular numbers are overtly marked (Corbett 2000). On the other hand, singular entities are more referentially prominent than plurals (Hopper and Thompson 1980; Timberlake 1977) and are therefore more likely to be indexed. In sum, number-only indexing systems may be more likely to index plural than singular (coding-efficiency hierarchy: pl>sg), while combined person/number systems may be more likely to index singular rather than plural (referential hierarchy: 3sg>3pl). The strongest signal for the effects of referential hierarchies, then, should come from person-only indexing systems (cf. Haig 2018).

A final point relating to referent property conditions is that some systems are not conditioned by the referential prominence of the indexed argument but by the relative prominence of the argument and its coargument. That is, whether A and/or P is indexed depends on the constellation of the properties of A and P. In DAM literature, this coargument condition is often called a scenario (or ‘hierarchical’) split (see Siewierska 2004: 51–56; Witzlack-Makarevich et al. 2016). In Laguna Keres (Keresan), for example, if A outranks P, as in the scenarios 1 > 3, 2 > 3, shown in (8a–b), the prefix indexes A, whereas if P outranks A – 3 > 1, 3 > 2, shown in (8c–d), the prefix indexes the P argument.

(8)
Laguna Keres (Keresan)[16]

[16] Prefixes, which also express mood and polarity, are drawn from two sets, labelled “A” and “B” by Lachler (2006: 140–143).

a.
si-ukacha
1a-see
‘I see him.’
b.
sr-ukacha
2a-see
‘You see him.’
c.
srg-ukacha
1b-see
‘He sees me.’
d.
kɨdr-ukacha
2b-see
‘He sees you.’
(Lachler 2006: 144–145)
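The scenario split in (8) can be sketched as a comparison of the two arguments on the person scale in (7). The sketch below is illustrative only: the function and variable names are hypothetical, the prefix table contains just the four forms cited in (8), and scenarios not described in the text (e.g., 1 > 2 or 3 > 3) are deliberately left undefined.

```python
# Person scale from (7): 1/2 > 3. A higher number means a higher rank.
PERSON_RANK = {1: 2, 2: 2, 3: 1}

# Only the four prefixes shown in (8); the real prefixes also express
# mood and polarity, which this sketch ignores.
PREFIXES = {
    ("A", 1): "si-",    # 1 > 3, set A
    ("A", 2): "sr-",    # 2 > 3, set A
    ("P", 1): "srg-",   # 3 > 1, set B
    ("P", 2): "kɨdr-",  # 3 > 2, set B
}


def indexed_role(a_person: int, p_person: int):
    """Return "A" or "P" for the role the prefix indexes, or None if the
    scenario is not covered by the description of (8)."""
    a_rank, p_rank = PERSON_RANK[a_person], PERSON_RANK[p_person]
    if a_rank > p_rank:
        return "A"
    if p_rank > a_rank:
        return "P"
    return None  # equal rank: not described in the text


def prefix(a_person: int, p_person: int):
    """Look up the prefix for a scenario, or None if it is not in (8)."""
    role = indexed_role(a_person, p_person)
    if role is None:
        return None
    person = a_person if role == "A" else p_person
    return PREFIXES.get((role, person))


print(prefix(1, 3) + "ukacha")  # form in (8a)
print(prefix(3, 2) + "ukacha")  # form in (8d)
```

The point of the sketch is that the choice of indexed role depends on both arguments at once, not on the properties of either argument alone.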

2.2.2 TAMEP conditions

TAMEP (tense, aspect, mood, evidentiality, polarity) is a group of conditions related to properties of the predicate or clause. Kotiria (Tucanoan) illustrates this kind of conditioning: in realis mood (as opposed to irrealis), the person value (1 vs 2/3) of the S argument is indexed under the condition of visual evidentiality, shown in (9a) with the 2/3 person form -rá. However, no indexing occurs in inferential evidentiality: in (9b), the inferential form -ri does not express the feature of person.

(9)
Kotiria (Tucanoan)
a.
ti=hó-rí ∼tá-ká-pʉ́ ∼wa’á-rá
anph=drawing-pl rock-cls:round-loc be.leaning-vis.ipfv.2/3
‘The drawings (petroglyphs) are leaning on (carved into the surface of) the rock.’
b.
bá-yʉ’-dʉ-∼ka wa’a-ri hí-a
decompose-intens-dim go-nom(infer) cop-assert.pfv
‘It (the curupira’s body) had (apparently) decomposed completely.’
(Stenzel 2013: 205, 262)

In terms of predicting the frequency of TAMEP conditions per role, few clear expectations arise from the literature. However, it is possible that TAMEP may be more often found to condition S/A indexing systems due to more frequent fusion than with P indexes (Creissels 2005: 57).

For directionality, a clear hypothesis is possible only for polarity: indexing is more likely in affirmative than in negative contexts. This is, first, because languages crosslinguistically tend to make fewer contrasts in negative clauses than in affirmative ones (Miestamo 2005), and second, because negation affixes may replace indexes in a template. This is the case in Tariana (Arawakan): person/number indexing, such as the first-person singular prefix nu- in (10a), is replaced by the negative prefix ma- on negated verbs in non-future active clauses, shown in (10b) (Aikhenvald 2003: 400).

(10)
Tariana (Arawakan)
a.
yanaki nu-iɾa-ka (nuha)
whisky 1sg-drink-rpst.vis (I)
‘I have drunk whisky.’
b.
yanaki ma-iɾa-kade-mha (nuha)
whisky neg-drink-neg-pres.nonvis (I)
‘I didn’t drink whisky.’
(Aikhenvald 2003: 400–401)

2.2.3 Event semantics

Some conditional indexing systems are determined by what we loosely term ‘event semantics’. Under this umbrella are concepts often classified as subcomponents of semantic transitivity, including properties of prototypical agents (volitionality, agentivity of A and agentive S [SA]), properties of prototypical undergoers (affectedness of P and patientive S [SP]), and properties of the event (telicity, dynamicity/‘kinesis’) (Hopper and Thompson 1980; Næss 2007: 168). As mentioned above, we expect event-semantic conditions to be more often attested for S arguments than A or P, given their frequent involvement in ‘split-S’ systems. This is illustrated by Kamang in (4) above, where a cluster of event-semantic properties, including increased affectedness of S and increased duration of an event, triggers indexing.

The directionality hypothesis for event-semantic conditions is that indexing of all argument roles (S, A, P) is more likely in events with higher transitivity. This follows from the general principle of more referentially prominent arguments being more likely to be indexed. That is, where properties of high transitivity correlate with properties of high referential prominence, indexing is more likely. For A and SA arguments, higher referential prominence correlates with higher agentivity, volitionality, control, involvement in an event, etc. For P and SP arguments, higher referential prominence correlates with greater affectedness. There is, however, a confound for P since prototypical events are described as having inanimate (i.e., low referential prominence) endpoints (see Næss 2004). Properties of the verb/clause that indicate higher transitivity – which clearly overlap with TAMEP and lexical conditions – are telic aspect, punctual events, realis mood, and affirmative polarity (Hopper and Thompson 1980).

2.2.4 Verb class

In systems conditioned by the verb class, there are lexical constraints such that different verbs or verb classes exhibit different indexing behaviour. For instance, in Iha two verbal bases obligatorily exhibit number ablaut – shown for ‘sit’ in (3a–b) above – while all other verbs have only one stem that is invariant for number (see Walker and Himmelmann in press).

Systems like Iha, where indexing is obligatory for some verbal bases but prohibited for others, have been described as “sporadic agreement” (Corbett 2006: 17; Fedden 2019); the verbs in these systems that prohibit indexing are “uninflectable” (see Spencer 2020). Systems in which lexical class is the only conditioning factor are typically not considered to be DAM; however, lexical constraints in multi-conditional systems in which they restrict alternation based on referent properties are termed ‘restricted DAM’ (see Witzlack-Makarevich and Seržant 2018: 21).

Fedden (2019) reports various motivations for lexical classes, including phonological, morphological, semantic, and etymological. An example of phonologically motivated verb classes is provided by Chechen (Nakh-Daghestanian), in which most vowel-initial verbs bear a gender prefix and consonant-initial verbs do not (see Komen et al. 2021). However, it is not a language-wide rule that words must be consonant-initial, and there are even some exceptions among verbs; for example, olxu ‘comb wool’ is not attested with gender indexing. Hence, we consider cases like Chechen to be conditional indexing, triggered by the lexical condition of (largely phonologically defined) verb class.

Semantic classes include groups of verbs with similar meaning (e.g., ‘cut-and-break’ verbs in Mian [Asmat-Awyu-Ok, Trans New Guinea]; Fedden 2019: 312–313) or verbs with similar kinds of P arguments (e.g., verbs that typically have animate objects in Teiwa [Alor-Pantar]; Fedden et al. 2013, 2014; Fedden and Brown 2017). In the latter case, a degree of grammaticalization occurs such that, in Teiwa for instance, indexing is no longer sensitive to animacy, leaving lexically conditioned indexing in the synchronic language.

It is our impression that verb class is a common but under-described condition and we expect it to be attested frequently in our study. The notion of incomplete grammaticalization leading to conditional indexing suggests that semantic classes will be more frequent for P arguments. Other types of lexical classes are expected to occur with equal frequency in any argument role. Note that which features are indexed may also play a role: number-only indexing is likely to be frequently lexically conditioned, and to be more likely to index S and P rather than A (Corbett 2000: 253, 257). In addition, number-only indexing often applies to a minority of verbs, such that the default class has no number indexing (Corbett 2000: 259).

Overall, the condition of verb class is expected to be more frequent for S/P compared to A. This is because 1) semantic classes are expected to be more frequent for P; 2) event-semantic conditions may motivate verb classes for S; and 3) number-only indexing systems, which are often lexically conditioned, are more frequent for S/P. Given the limited information available in grammars, we apply only very broad labels where possible to investigate the occurrence of different motivations for lexical classes (see Section 3.2). A further question concerns frequency: as Fedden (2019) postulates, verbs with high token frequency (those that occur most often in discourse) may retain a minority indexing pattern compared with the majority of lexical items that have lower token frequency. No clear hypothesis can be made in terms of whether the high token-frequency group is more likely to have or to lack indexing: the hypothesis merely links the minority indexing pattern (be that indexing or no indexing) with the most frequent verbs.

2.2.5 Co-occurrence restrictions

Whether a conominal is obligatory, optional or prohibited is definitional in various treatments of indexing phenomena (e.g. in grammatical vs anaphoric agreement, Bresnan and Mchombo 1987; Siewierska 1999). Here, we treat the ability of an index to co-occur with a conominal as a potential condition. An indexing system conditioned by co-occurrence restrictions is illustrated by Breton (Celtic, Indo-European) in (11a–b). In present-tense clauses, subject (S/A) indexing is only acceptable in the absence of an independent pronoun; if a pronoun is present, the verb form that occurs bears no index (Stump 1984).

(11)
Breton (Celtic, Indo-European)
a.
Levrioù a lenn-an.
books pcl read-1sg
‘I read books.’
b.
Me a lenn (*lenn-an) levrioù.
I pcl read (*read-1sg) books
‘I read books.’
(Stump 1984: 290–291)
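The co-occurrence restriction in (11) amounts to a single check on whether an independent pronoun is present. A minimal sketch, assuming a hypothetical function that receives the stem and the index suffix as plain strings and that models only the present-tense pattern described above:

```python
def breton_present_verb(stem: str, index_suffix: str, has_independent_pronoun: bool) -> str:
    """Return the verb form under the co-occurrence condition in (11).

    Hypothetical helper: the index appears only when no independent
    subject pronoun is present in the clause.
    """
    if has_independent_pronoun:
        return stem  # bare form, as in (11b)
    return f"{stem}-{index_suffix}"  # indexed form, as in (11a)


print(breton_present_verb("lenn", "an", has_independent_pronoun=False))
print(breton_present_verb("lenn", "an", has_independent_pronoun=True))
```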

The hypothesis that “verb agreement and overt arguments are in complementary distribution” (Nichols 2018: 846) – the ‘complementary hypothesis’ – has been the subject of several studies, some of which find little evidence of a correlation between richness of argument indexing and optionality of conominal expression (Bickel 2003; Fedden 2022; Nichols 2018), while others find a tendency to avoid co-occurrence of independent pronouns and coreferential indexes in discourse rather than a prohibition (Schnell and Barth 2020). Furthermore, languages in which dependent (specifically affixal) and independent forms cannot co-occur are typologically rare (Hengeveld 2012: 474). We therefore do not expect co-occurrence restrictions to be frequent. Where co-occurrence restrictions do occur, however, they are likely to be more frequent in P indexing than S/A, since P forms are often early grammaticalizers that ‘get stuck’ along the way to obligatory indexing (Haig 2018; Siewierska 1999).

2.3 Research questions

The motivation for the present study is the observation that existing typological studies of conditional indexing are restricted in one or more ways, relating to (i) the type of conditions taken into account (mostly referent properties); (ii) the number and type of argument role(s) considered (mostly just P, sometimes A and P, or rather S only); and (iii) the size and composition of the language sample (small, skewed, convenience sample, unclear).[17] A comprehensive study of conditional indexing taking all factors and all three roles into consideration, using a comparatively large and balanced sample of languages, is hitherto lacking. This study constitutes a first attempt to fill this gap. We use a genealogically balanced sample of 83 languages with conditional indexing in at least one argument role, based on data from AUTOTYP (Bickel et al. 2022; for more details, see Section 3.1).

Our broad research question (given at the end of Section 2.1 above) – what (combinations of) factors condition what type of indexing? – can be broken down into the following sub-questions:

  1. What is the crosslinguistic frequency and spread of conditional indexing (vs non-conditional or no indexing)?

  2. Do the different argument roles S, A, and P have:

    1. different frequencies of conditional indexing?

    2. different profiles in terms of the kinds of (combinations of) conditions on indexing?

    3. the same directionality; that is, are the same values within each condition more likely to trigger indexing versus zero (no indexing)?

3 Methodology

3.1 Sampling

From AUTOTYP (v1.1.0, Bickel et al. 2022), we gathered data on whether a language has agreement for the core arguments of intransitive and transitive verbs: S, A, and/or P (we excluded A, T, and G roles for ditransitive verbs). Note that the definition of agreement in AUTOTYP differs from the definition of indexing used here in that it requires at least one person value in a paradigm to be able to co-occur in a clause with a conominal (Bickel et al. 2013: 19). If there is agreement, we report whether the agreement is non-conditional or conditional, and extract the conditions reported for conditional agreement. We report our findings at a language level only, since, for technical reasons, it was not feasible in the present study to do so at the level of argument roles.

A summary of the AUTOTYP data is given in Table 1: ‘none’ indicates that a language has no agreement with any role, ‘conditional’ that at least one role has conditional agreement, and ‘non-conditional’ that there are no conditions on agreement in any role. Out of 537 languages, 380 (70.8 %) have agreement for at least one role, of which a little under half (178 languages; 46.8 %) have conditional agreement.

Table 1:

Agreement types per language in v1.1.0 of AUTOTYP (2022).

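| Agreement type    | n   | % Grand total | % Agreement total |
|-------------------|-----|---------------|-------------------|
| None              | 157 | 29.2          |                   |
| Agreement         | 380 | 70.8          |                   |
|   Non-conditional | 202 | 37.6          | 53.2              |
|   Conditional     | 178 | 33.1          | 46.8              |
| Total             | 537 |               |                   |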
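
The percentages in Table 1 follow directly from the counts. As a minimal sketch (the variable and function names are illustrative and do not come from AUTOTYP or from the authors' own scripts), they can be recomputed as follows:

```python
# Counts taken from Table 1; names are illustrative only.
counts = {"none": 157, "non_conditional": 202, "conditional": 178}

total = sum(counts.values())                                   # all languages
agreement = counts["non_conditional"] + counts["conditional"]  # languages with agreement

def pct(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)

# Share of each type in the grand total and in the agreement subtotal.
grand_share = {k: pct(v, total) for k, v in counts.items()}
agreement_share = {
    k: pct(counts[k], agreement) for k in ("non_conditional", "conditional")
}

print(total, agreement, grand_share, agreement_share)
```

The two dictionaries correspond to the "% Grand total" and "% Agreement total" columns respectively.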

Our AUTOTYP search thus yielded 178 languages that have conditional agreement for at least one of the core arguments S, A, P. To create the sample we use in this study, we selected one language per ‘major branch’, the level below ‘stock’ (AUTOTYP is based on the classification system in Nichols et al. 2013; isolates are classed as single members of their own stock). For example, Indo-European is a stock with conditional agreement present in five major branches: Balto-Slavic, Germanic, Greek-Armenian, Indo-Iranian, and Italic-Celtic. Languages that were the sole representative of a major branch were selected by default. If, as in the Indo-European branches, there were multiple representatives, we selected one based primarily on quality and accessibility of literature (prioritizing open-access resources). The 178 conditional-agreement languages belong to 64 stocks (roughly half of the total of 122), 19 of which are isolates or stocks with only a single member in the sample. The 85 major branches within these 64 stocks are the basis for our sample.

In some cases, we replaced a language from AUTOTYP with a language from the same (or nearest possible) genealogical grouping (based on Glottolog [Hammarström et al. 2024] classifications). Substitution was necessary for reasons of quality and accessibility of literature, and in a few instances we were unable to find a suitable substitute. The resulting sample numbers 83 languages. It is essentially a genealogically stratified variety sample (see, e.g., Miestamo et al. 2016) created by leveraging the coarser-grained available data in an open-access typological database. The details of the sample are provided in the Supplementary Material.

3.2 Data coding

Having composed the sample, we consulted grammars and other published sources. Here, we followed our definition of (conditional) indexing as stated in Section 2, which, as detailed above, deviates slightly from the definition of agreement used by AUTOTYP. We entered the relevant information into an Excel sheet, which was then processed using R (R Core Team 2023) in RStudio (Posit Team 2024). Where appropriate, we tested for statistical significance using Pearson’s Chi-squared test, or Fisher’s exact test where expected values were lower than five (Levshina 2015: 213–214). For each language, we recorded whether the indexing of S, A, and P is conditional, non-conditional or always absent (‘none’). If it was conditional, we indicated the conditioning factor(s). We divided the sample equally between the two authors: to help ensure consistency, each language was re-checked by the author who performed the initial analysis and cross-checked by the other, with frequent discussion throughout the process.
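
The choice between the two tests can be sketched in a few lines of Python. This is an illustration only: the analysis itself was carried out in R, the counts below are hypothetical, and reading the rule as "switch to Fisher's exact test if any expected cell count is below five" is an assumption about how the criterion was applied.

```python
def expected_counts(table):
    """Expected cell counts under independence for a 2-D table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    return [[r * c / grand for c in col_totals] for r in row_totals]

def choose_test(table, threshold=5):
    """Return 'fisher' if any expected count is below the threshold, else 'chi2'."""
    exp = expected_counts(table)
    return "fisher" if any(e < threshold for row in exp for e in row) else "chi2"

def chi2_statistic(table):
    """Pearson's chi-squared statistic (no continuity correction)."""
    exp = expected_counts(table)
    return sum((o - e) ** 2 / e
               for row_o, row_e in zip(table, exp)
               for o, e in zip(row_o, row_e))

# Hypothetical counts, for illustration only (not from the study).
large = [[30, 10], [12, 28]]
small = [[3, 1], [2, 6]]

print(choose_test(large), choose_test(small))
```

The statistic is computed by hand here to keep the sketch dependency-free; a p-value would additionally require the chi-squared distribution from a statistics library.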

The roles are considered separately; that is, we did not explicitly record alignment information, though it is possible to compare whether S/A or S/P are subject to the same conditions in a given language. Also recorded as separate systems are instances in which different features are indexed independently (cf. Section 2.1 for a discussion of ‘systems’). Thus, the total number of systems for each role is higher than the number of languages.

In the overwhelming majority of cases, we accepted the analysis in the sources concerning whether a form is a bound index or an independent pronoun. However, we sometimes overrode terminology that had originally been chosen for theoretical reasons not relevant to our investigation. For instance, Mauwake (Madang, Nuclear Trans New Guinea) is described as having a set of independent pronouns for P arguments (Berghäll 2015: 95–96). However, we categorize Mauwake as a language with P indexing, since these forms exhibit behaviour often associated with clitics (see also Järvinen 1991; Olsson 2016).

Note that, if at least one cell in one paradigm in a system has no index, then the whole system is classified as asymmetrical, even if it contains one or more symmetrical splits. This is demonstrated below for Coastal Marind (Anim): P is indexed by one of several different paradigms, resulting in symmetrical splits (see Section 2.1), but the whole system is considered asymmetrical, since some verbs do not index P at all, and 3sg is not indexed on some verbs. Part of the system is represented as a hierarchical tree structure in Figure 1. In the figure, Split 1, Split 2, and Split 3 are conditioned by verb class, while Split 4, for the irregular prefixing verbs ‘shoot’, ‘feed’, and ‘see’, is conditioned by the features of person/number. We take these hierarchies into account insofar as we record all the different factors that condition zero. Hence, Coastal Marind P indexing is recorded as asymmetrical indexing conditioned by verb class (Splits 1, 2, and 3) and feature (person/number; Split 4), regardless of whether some branches are ‘symmetrical’ (i.e., do not involve a zero) or not.

Figure 1: 
Coastal Marind P indexing. “(…)” indicates further branches of a split not represented for the sake of simplicity; for a full description of P indexing, see Olsson (2021).

The main conditions we recorded are categorized as referent properties, TAMEP, event semantics, verb class, and co-occurrence restrictions (see Section 2.2). An ‘other’ category included miscellaneous rarely attested factors (see Section 4.7). For all factors, we noted when authors described an alternation as optional or probabilistic. For example, 3pl clitics in Kharia (Mundaic, Austroasiatic) are more likely to occur with human referents (Peterson 2011: 100). However, optionality was not taken into account in the analysis for the present paper; we leave this for future work. That is, Kharia 3pl clitics are categorized as animacy-conditioned without further comment.

Referent properties include features (person/number/gender), animacy, and discourse-related factors, as defined in Section 2.2.1. We also include coargument conditions, where indexing depends on properties of the coargument (excluding reflexives). The results are reported in Section 4.2. Animacy is reported as a binary split between higher versus lower, labelled ‘animate’ and ‘inanimate’ for convenience. That is, we do not differentiate between animate versus inanimate and human versus non-human systems, even though the latter are perhaps more widespread. We did not come across a system that had a three-way animacy condition (human vs other animate vs inanimate). For the features of person and number, we recorded first, second, and third person, plus singular versus non-singular number. More fine-grained distinctions in non-singular number, including clusivity distinctions, are not considered. For gender or noun class, we record categories as given in the grammars.

We gather under the umbrella of ‘discourse factors’ a number of different conditions, which are subdivided into the categories of topicality (including labels such as ‘high/low salience’, ‘given/new’, ‘proximate/obviate’), focus (e.g., ‘subject focus’, ‘predicate focus’), definiteness, specificity, word order (when it is assumed to be triggered by information structure; labels include ‘fronted conominal’ and ‘no preverbal conominal’), and ‘other’, which includes the notions ‘familiar’ (Oksapmin [Asmat-Awyu-Ok, Nuclear Trans New Guinea]; Loughnane 2009: 230) and ‘for reference tracking’ (Gyeli [Bantu, Atlantic-Congo]; Grimm 2021: 212). Terminology on information structure was taken from authors’ descriptions, with the knowledge that it may have been applied in very different ways. To address the directionality hypothesis, topicality values were converted into ‘high’ (e.g., definite, specific, topical, given, known, proximate, high salience) or ‘low’ discourse prominence (e.g., indefinite, non-specific, new, non-topical, obviate, low salience). It was not always clear how to translate all concepts into high or low prominence, and such cases were excluded from the directionality figures (see Section 4.2). ‘Focus’ was also excluded: in the few grammars it occurred in, the term was either defined in very different ways when describing a property of an argument, or it described a property of the clause or predicate. Word-order conditions were also, where possible, translated into high/low prominence, assuming that preverbal/fronted arguments are a rough proxy for high topicality.
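
The conversion of source labels into binary discourse prominence can be pictured as a simple lookup. The sketch below uses only the labels listed in the paragraph above; the lower-cased spellings and the function name are assumptions, and labels that cannot be placed return `None`, corresponding to exclusion from the directionality figures.

```python
# Labels listed in the text; lower-casing and exact spellings are assumptions.
HIGH = {"definite", "specific", "topical", "given", "known",
        "proximate", "high salience"}
LOW = {"indefinite", "non-specific", "new", "non-topical",
       "obviate", "low salience"}

def prominence(label):
    """Map a source label to 'high', 'low', or None (excluded)."""
    key = label.strip().lower()
    if key in HIGH:
        return "high"
    if key in LOW:
        return "low"
    return None  # unclear labels are left out of the directionality counts

for label in ("Definite", "obviate", "subject focus"):
    print(label, "->", prominence(label))
```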

TAMEP conditions are defined in Section 2.2.2. TAME categories were recorded largely as given in the grammars, with some summarization (e.g., we listed ‘PAST’ rather than remote past and recent past where these exhibit the same indexing behaviour). We did not set out to create crosslinguistically comparable TAME categories, so these remain largely language specific. The only TAMEP category with a directionality hypothesis is polarity: polarity was coded as either affirmative or negative and is reported in Section 4.6.

As described in Section 2.2.3, the event-semantic condition was operationalized as semantic transitivity, with values extracted from the language descriptions translated as ‘high’ or ‘low’ semantic transitivity. A selection of these values as they pertain to SA/A, SP/P, and the verb/clause are given in Table 2.[18]

Table 2:

Selection of values from language descriptions translated to ‘high’ and ‘low’ semantic transitivity.

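| Semantic transitivity | SA/A                                    | SP/P                                                  | Verb/clause                             |
|-----------------------|-----------------------------------------|-------------------------------------------------------|-----------------------------------------|
| High                  | Volitional; Active; More involved       | Affected; Emphasis on change of state                 | Brief time frame; Individual activities |
| Low                   | Non-volitional; Inactive; Less involved | Not affected; Stative; No emphasis on change of state | Longer time frame; Generic events       |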

Verb-class conditions are defined in Section 2.2.4 and reported in Section 4.3. We aim to record verb classes of any size, including even single-verb ‘exceptions’ – consider the inclusion of the three-verb group ‘shoot’, ‘feed’, ‘see’ in Coastal Marind (Figure 1 above). Listings and/or descriptions of verb classes were taken from the language descriptions and include heterogeneous motivations. As an additional step, we categorized verb classes according to four broad motivating factors: phonological, semantic, arbitrary, and default. Phonological classes are illustrated by Chechen above (Section 2.2.4): in general, indexing occurs only on vowel-initial verbs, and not on consonant-initial verbs. These classes are categorized as phonological even though they are not watertight: the classes leak, so that a few exceptional vowel-initial verbs do not take indexing. Semantic classes include, for example, ‘cut-and-break verbs’ and verbs with high semantic transitivity (Mian [Asmat-Awyu-Ok, Nuclear Trans New Guinea]; Fedden 2019: 313–314). Again, these classes leak to a certain extent (Fedden 2019: 313–314), but we categorize them as semantic on the basis that very few classes do not leak at all, and a common semantic core has been identified by the grammar author. The ‘arbitrary’ category is something of a catch-all for verb classes that are not phonologically or semantically motivated. ‘Arbitrary’ classes can be morphological classes such as suffixing versus infixing versus prefixing, or lexical ‘exceptions’ such as the three-verb group ‘shoot’, ‘feed’, and ‘see’ in Coastal Marind.

‘Default classes’ are essentially also unmotivated/arbitrary classes, but unlike ‘arbitrary classes’ they make up a large part, possibly the majority, of the verbal lexicon and occur alongside minority classes of one of the other types. Default classes are described as such in our sources, or they are clearly the most frequent pattern. To return to the example of Iha (example (3) above), number indexing via stem ablaut occurs on only two verbs – an arbitrary class – and all other verbs lack number indexing – a default class. We recorded as many verb classes as could be gleaned from the sources we consulted, but exhaustiveness depends on the length of the grammar and the aims and interests of the grammar writer. Hence, the results are necessarily incomplete.

Finally, co-occurrence restrictions are defined in Section 2.2.5 and reported in Section 4.5. We recorded whether the presence of indexes is restricted by the presence of a coreferential nominal (NP or pronoun). That is, if an index may not co-occur when a noun or an independent pronoun is present, we record a co-occurrence restriction (e.g., Breton in (11) above). However, we do not report the behaviour of conominals in general; that is, we do not report on the rarity of obligatory co-occurrence of indexes and conominals (Siewierska 2004: 268) or on frequency of co-occurrence in systems in which conominals can be omitted. For example, if pronouns occur less frequently with verbs belonging to a class of obligatorily indexed verbs than with verbs that cannot bear indexing, we record only the verb-class condition.

To limit the domain of study, we did not include every conceivable condition on indexing. First, we excluded clause type (main vs dependent clause), looking exclusively at main clauses.[19] Second, to reiterate from the definition in Section 2, conditional indexing excludes diathesis as a source of variation. Since voice markers, including direct-inverse markers, do not index features of an argument, they are excluded from the study. Finally, purely phonologically conditioned alternations (e.g., realizations of English 3sg -s as [z], [s], or [iz] are conditioned by the preceding phoneme) are excluded, though phonologically conditioned verb classes are included as conditional indexing.

4 Results

In this section, we first report the general results on conditional indexing in our sample (Section 4.1): frequency of conditions per role (Section 4.1.1) and associations between the most frequent conditions and each role, or ‘role profiles’ (Section 4.1.2). Subsequently, we address each condition type separately, testing the relevant hypotheses: (various types of) referent property conditions (Section 4.2), verb class (Section 4.3), event semantics (Section 4.4), co-occurrence restrictions (Section 4.5), polarity (Section 4.6), and the remaining miscellaneous conditions (Section 4.7). Since polarity was the only condition for which we have a specific hypothesis, it is the only TAMEP condition with a dedicated subsection. General frequency of TAMEP conditions is reported in Section 4.1.

4.1 General results

4.1.1 Frequency of condition types per argument role

This section reports the high-level results on conditional indexing in our sample. Based on previous literature and given our broad definition of conditional indexing, which admits various conditioning factors, we expect conditional indexing to be frequent for all argument roles. The relevant hypothesis concerning system types is that conditional indexing is more likely to outnumber non-conditional indexing for P arguments than for S or A, though this has limited applicability in a sample composed solely of languages with conditional indexing.

Figure 2 shows the proportions of non-conditional and conditional indexing and ‘none’ (i.e., no indexing system exists at all for that role; see Section 3.1) per argument role. No indexing of any kind is attested in only 1 % of A-role systems and 18 % of P-role systems; that is, for the vast majority of languages in the sample, indexing is attested for all three roles.

Figure 2: 
Proportions of indexing system types per argument role (83 languages, 315 total systems). The number of systems per role is greater than the number of languages because we count systems that operate independently as separate systems. See Section 2.1.

Figure 2 shows that conditional indexing is attested with similar frequency for all roles (S: 81 %, A: 80 %, P: 82 %), even though we did not aim to compose a sample with equal proportions for each role. The remaining system types show a clear split between S and A, where all but one of the remaining systems have non-conditional indexing (19 %; the exception is A indexing in Kamang [Alor-Pantar]), and P, where all remaining datapoints involve no indexing at all (‘none’; 18 %). The fact that non-conditional P-indexing systems are not attested in the sample provides some support for the hypothesis that conditional indexing is more likely to outnumber non-conditional indexing for P than for S or A.

The frequency hypotheses for condition types are restated below in (12), and all roles are also expected to be frequently multi-conditional; that is, conditioned by more than one factor.

(12)
Hypotheses: Frequency of condition per role
Referent property - P > S/A
TAMEP - S/A > P
Verb class - S/P > A
Event semantics - S > A/P
Co-occurrence - P > S/A

The frequencies of the (summarized) conditions per argument role are shown in Figure 3. Frequencies are given as a percentage of systems conditioned by a particular factor. Percentages add up to more than 100 % since, in line with our hypothesis, many systems are multi-conditional. Across all roles (n = 255), the majority of systems are conditioned by two or more factors (59 %, n = 151), while 41 % (n = 104) are conditioned by one factor only.
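
Why the per-condition percentages exceed 100 % can be seen from a small Python sketch. The five systems below are invented for illustration; only the two sample-level checks at the end use figures quoted in the text.

```python
from collections import Counter

# Toy data: each system is the set of conditions that apply to it.
# These five systems are invented for illustration.
systems = [
    {"referent", "TAMEP"},
    {"referent"},
    {"referent", "verb class"},
    {"TAMEP", "verb class", "referent"},
    {"verb class"},
]

n = len(systems)
# A system with several conditions is counted once per condition.
per_condition = Counter(c for s in systems for c in s)
share = {c: 100 * k / n for c, k in per_condition.items()}

multi = sum(1 for s in systems if len(s) >= 2)   # multi-conditional systems
single = n - multi                               # single-condition systems

# Sample-level figures quoted in the text (n = 255 systems).
assert round(100 * 151 / 255) == 59
assert round(100 * 104 / 255) == 41

print(share, multi, single)
```

Because a multi-conditional system contributes to every condition it involves, the shares in the toy data sum to well over 100 %, just as in Figure 3.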

Figure 3: 
(Summarized) conditions in indexing systems in the sample. Raw frequencies are reported in Supplementary A.3.

Referent property conditions are attested with similarly high frequency for all roles, contrary to the hypothesis that they are more frequent for P. However, our hypothesis is confirmed for TAMEP conditions since they are considerably more frequent for S (56.7 %, n = 51) and A (58.0 %, n = 47) compared to P (21.4 %, n = 18). For verb class, the condition is noticeably less frequent for A (25.9 %, n = 21) than for S (43.3 %, n = 39) and P (40.5 %, n = 34). This is also in line with expectations.

For event-semantic conditioning (‘event’ in the figure) we expected that S would have the highest frequency, and this is borne out: the frequency is higher for S (13.2 %) than for A (6.2 %) and P (4.8 %). Co-occurrence restrictions occur in over 10 % of systems for each role. We expected co-occurrence restrictions to be more frequent for P than S/A, but in fact they are least frequent for P (10.7 %) compared to S (11.1 %) and A (12.3 %), though the differences are small.

4.1.2 Role profiles

We expected that different roles would have different profiles; that is, there is a correlation between argument role and the type(s) of factors that condition indexing. Overall, there is a statistically significant correlation between the two variables ‘role’ and ‘condition’, when including only the three most frequent conditions (χ2 (4) = 17.709, p < 0.01). Figure 4 is a correlation plot (using the corrplot package; Wei and Simko 2021) that displays the residuals from the chi-squared test, illustrating the attraction or repulsion between each of the three most frequent conditions (see Figure 3) and each role. Positive residuals are in blue, signifying an attraction (positive association) between the corresponding row and column variables. Negative residuals are in red, implying a repulsion (negative association) between the corresponding row and column variables. The greater the value in each circle (positive or negative), the more it contributes to the test statistic (Levshina 2015: 218–219), which is visualized by larger circle size and colour intensity. Note that multi-conditionality (combinations of conditions) is not taken into account here.[20]

Figure 4: 
Correlation plot for the three most frequent conditions (blue = attraction; red = repulsion).

There is a positive association between A and TAMEP and a slightly less strong association between S and TAMEP. The negative association between P and TAMEP is very strong; that is, this association makes the greatest contribution to the significance level. For P, there is a positive association with both verb class and referent property. S shows a negative association with referent property, while A is almost neutral.[21] Turning to verb class, A shows a strong negative association, while S shows a weak positive association.
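
The residuals plotted in a correlation plot of this kind are Pearson residuals, (observed − expected) / √expected, and their squares sum to the chi-squared statistic. The Python sketch below illustrates the calculation on a partial table: it uses only the TAMEP and verb-class counts quoted in Section 4.1.1, because per-role counts for referent property are not given in the running text. Its values therefore differ from those behind Figure 4 and from the reported test statistic, and the residuals for S in particular need not match the figure.

```python
from math import sqrt

roles = ["S", "A", "P"]
conditions = ["TAMEP", "verb class"]
# Counts quoted in Section 4.1.1; referent property is omitted because its
# per-role counts are not given in the running text.
observed = [[51, 39],   # S
            [47, 21],   # A
            [18, 34]]   # P

row_tot = [sum(r) for r in observed]
col_tot = [sum(c) for c in zip(*observed)]
grand = sum(row_tot)

residuals = {}
for i, role in enumerate(roles):
    for j, cond in enumerate(conditions):
        expected = row_tot[i] * col_tot[j] / grand
        # Pearson residual: positive = attraction, negative = repulsion.
        residuals[(role, cond)] = (observed[i][j] - expected) / sqrt(expected)

# The squared residuals sum to the chi-squared statistic for this partial table.
chi2 = sum(r ** 2 for r in residuals.values())

for key, value in residuals.items():
    print(key, round(value, 2))
```

Even on this reduced table, the signs for A and P agree with the description above: attraction between A and TAMEP and between P and verb class, repulsion between P and TAMEP and between A and verb class.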

While Figure 4 above displays the association between individual factors and each role, in the following we investigate the association between combinations of factors and each role. As mentioned above, the majority of systems (59 %, n = 151) are conditioned by more than one factor. Conditions that occur together for each role are visualized in Figure 5 (using the ggVennDiagram package; Gao 2023). Each circle represents a condition, with areas of overlap indicating systems with more than one factor. Again, only the three most frequent conditions are included; that is, systems counted in the non-overlapping areas may still be multi-conditional if another minor condition is present. In Figure 5, darker shading indicates higher raw frequency of a particular condition or combination.

Figure 5: 
Combination of major conditions on indexing per argument role.

The figure shows the high frequency of referent property conditions for all roles, individually and in combination. The frequency of TAMEP without either of the other two major conditions is low for all three roles; however, there is a clear difference in the overlap between TAMEP and referential conditions. For both S (39 %) and A (31 %), this is the most frequent system type, while only 11 % of P systems have this combination. The opposite pattern is attested for systems conditioned by both verb class and referent property conditions: this combination is more frequent for P (24 %) than for S and A (both 9 %). For verb-class conditions without either of the other two major factors, S (10 %) patterns with P (11 %) in being more frequent than for A (3 %). The general lower frequency of TAMEP conditioning for P is reflected in the different frequencies of systems with all three major factors: S (20 %) has the highest rate, followed by A (14 %), followed by P (5 %). While Figure 4 and Figure 5 are suggestive of different role profiles, a generalized linear model (GLM) for A and P only did not reach statistical significance (using the lme4 package; Bates et al. 2015).[22]

An example of the most frequent combination of common conditions for P – referent property and verb class – is Coastal Marind, partially illustrated above in Figure 1. The figure shows the combination of verb class and feature conditions for the verbs ‘shoot’, ‘feed’, ‘see’, which have person/number indexing except for 3sg, which is not indexed. Not represented in the figure are, first, two additional verb classes: ‘hear’ has indexing for animates and zero for inanimates, and ‘marry’ and ‘call someone’s name’ have zero for third person and 2pl. Second, there is a feature condition that does not interact with verb class: there is a role-neutral 1pl index, which means that 1pl P is always indexed, even on the half of transitive verbs that are lexically specified to occur without P indexing (Olsson 2021: 217–218).

Ik (Kuliak) exemplifies the most frequent combination of common conditions for S/A indexing: referent property and TAMEP, illustrated in Figure 6. The TAMEP condition in Split 1 distinguishes between realis and irrealis (irrealis includes negation; see Schrock 2014: 360–361). The referent property condition in Split 2 applies only to realis forms, with indexing for all person/number values except for 3sg arguments. In irrealis forms, however, all S/A arguments, including 3sg, are indexed. Examples are given in Figure 6 for 1sg and 3sg forms of the verb ɦye- ‘know’.

Figure 6: 

Ik (Kuliak) S/A indexing: TAMEP and person/number conditions (examples: Schrock 2014: 428, 580, 582, 611). dp: ‘dummy pronoun’, which “refers anaphorically back to a non-core argument mentioned earlier in the discourse” (Schrock 2014: 227).
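
The two splits described for Ik can be restated as a small decision function. This is a schematic Python rendering of the prose description only; the function name and the string labels are illustrative, and it abstracts away from the actual index forms.

```python
def ik_sa_indexed(mood, person_number):
    """Return True if an S/A argument is indexed under the two splits above.

    mood: 'realis' or 'irrealis' (irrealis includes negation)
    person_number: e.g. '1sg', '3sg', '3pl'
    """
    if mood == "irrealis":
        return True                      # Split 1: all values are indexed
    if mood == "realis":
        return person_number != "3sg"    # Split 2: only 3sg is zero
    raise ValueError(f"unknown mood: {mood!r}")

for mood in ("realis", "irrealis"):
    for pn in ("1sg", "3sg"):
        print(mood, pn, ik_sa_indexed(mood, pn))
```

The ordering of the checks reflects the hierarchy of the splits: the person/number condition is only consulted once the TAMEP condition has selected realis.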

4.2 Referent property conditions

The most frequent conditions for each role are referent properties, which comprise a cluster of subcomponents: features (person, number, gender/noun class), coargument conditions, animacy, and discourse-related factors (definiteness, specificity, topicality, etc.). The frequency in our sample of each of these subcomponents in systems conditioned by referent properties is shown in Figure 7 for each role.[23]

Figure 7: 
Distribution of subcomponents of referent property conditions (n = 214). Raw frequencies are reported in Supplementary A.3.

Discourse-related factors are in turn composed of a number of subcomponents, as described in Sections 2.2.1 and 3.2. For example, definiteness and specificity play a role in the multi-conditional indexing of third-person plural S and A arguments in Turkish [Turkic]; see example (17) in Section 4.7. Regarding the correlation between the individual subcomponents of discourse-related factors and argument role, the results from independence testing did not reach significance (taking only A and P into account as the theoretically most different roles: p = 0.52). For raw frequencies, see Supplementary A.3.

The second set of hypotheses concerns directionality: independently of role, we expect that higher-ranking arguments are more often indexed, and lower-ranking arguments are more often zero. For the feature condition where this includes person (rather than number and/or gender only), first and second person rank higher than third person. Hence, third person is expected to be zero more often than first/second person. This is the case in P indexing in Chol (Mayan), for example. In (13a), a first-person P argument is indexed with the suffix -oñ; in (13b), third-person P is not indexed.

(13)
Chol (Mayan)
a.
k-papaj=äch tyi ke i-päy-oñ
a1-sp:father=affr pfv start a3-call-b1
‘My father started to call me.’
b.
tyi i-k’el-e pami
pfv a3-see-tv world
‘He saw the world.’
(Vázquez Álvarez 2011: 79, 266)

The results for systems conditioned by a person feature (excluding coargument conditioning) are shown in Figure 8 and confirm the hypothesis: for all roles, it is rare that third person is always indexed (i.e., without alternating with zero). The effect of person on indexing form (not split by role) is statistically significant (χ2 (4) = 106.74, p < 0.01). In the figure, the category ‘alternating (index/zero)’ indicates that a feature condition is involved in a multi-conditional system; that is, a particular feature value alternates between indexing and zero according to another condition. For instance, third-person values may alternate between indexing and zero depending on definiteness/specificity, a distinction that does not apply to other person values. It also includes systems in which, for example, one gender in third person is zero while another is indexed. The fact that several factors apply only to third person (animacy, definiteness, topicality, gender) explains the higher proportion of alternating third-person arguments than for first- and second-person arguments. However, since person is least informative for P arguments, third-person P is expected to be the most likely of all to alternate. This hypothesis is not confirmed. Among third-person arguments, S and A are the most likely to alternate (S: 87 %, n = 53, A: 80 %, n = 40), while the frequency for P is somewhat lower (58 %, n = 28). However, third-person P is more likely to be zero (21 %, n = 10) than S (8 %, n = 5) or A (2 %, n = 1).

Figure 8: 
Distribution of marking conditioned by the feature of person.

Systems with coargument conditioning alternate between indexing and zero depending on the constellation of features of the argument and its coargument (S therefore has no coargument conditioning by definition). Figure 9 shows the directionality for coargument-conditioned systems (only the person value of the argument rather than also of the coargument is taken into account). Our goal here is not to establish hierarchies in terms of which combinations of A and P arguments are most likely to be zero but to show which person values in a coargument-conditioned system are most likely to be zero. As above, third person is expected to be more likely to be zero than first or second person. This is the case, though the difference is due more to frequency of alternation than to frequency of zero (no instance of zero for first and second person; one instance each of third-person zero for A and P). The hypothesis that third person is most likely to alternate is thus confirmed. Independence testing (not split by role) reached significance (p < 0.01).

Figure 9: 
Distribution of marking by person of the argument in coargument-conditioned systems.

While the hypothesis for directionality of systems that index the feature of person is clear, there are contradictory hypotheses for number. Following the coding-efficiency principle, we expect that non-singular numbers are more likely to be indexed and singular is more likely to be zero. The correlation between high discourse prominence and indexing, however, leads us to expect the opposite: singular arguments (i.e., more prominent ‘individuated’ referents) are more likely to be indexed and non-singular is more likely to be zero.

Figure 10 shows the results for systems that index only the feature of number. For simplicity, dual and paucal categories are excluded, and we compare only singular (‘sg’) and plural (‘pl’). Though the total number of systems is low, a clear pattern emerges for all argument roles in which singulars are more likely to be zero, while non-singulars most frequently alternate between zero and indexing. Taking all roles into account, the result is statistically significant (χ²(2) = 43.508, p < 0.01). This result supports the hypothesis based on the principle of coding efficiency and refutes the hypothesis based on the higher referential prominence of singular referents.

Figure 10: Distribution of marking in number-only indexing systems.

Number indexing in systems that also index person shows the same directionality as in number-only systems, but less strongly. Still, the result is statistically significant (χ²(2) = 49.5, p < 0.01). As expected, the trend for non-singulars to be indexed and singulars to be zero is stronger for third person than for first and second person; see Supplementary A.3.

Figure 11 shows directionality of the animacy condition. Animacy systems have been converted to binary values of more animate versus less animate (‘inanimate’ in the figure). For systems with coargument conditions, only the animacy of the argument is recorded, not the animacy of the coargument.

Figure 11: Distribution of marking conditioned by animacy.

The hypothesis that arguments higher on the referentiality scale are more likely to be indexed is confirmed. Summarizing across all roles, the effect of animacy on form is statistically significant (χ²(2) = 57.491, p < 0.01). While zero accounts for over 50 % of inanimates in animacy-conditioned systems, animates are only ever zero when they alternate due to another condition. The same pattern is found for all roles, but the highest rate of zero is found for inanimate P arguments. For animates, P also has the highest rate of alternation. Animacy-conditioned P indexing is illustrated by Kamang in (5) above, which is also conditioned by verb class.

For discourse-related factors, we look separately at ‘discourse prominence’, word order, and definiteness/specificity. We collect under the umbrella of ‘discourse prominence’ the labels associated with a broad conception of topicality and assign each label in a split to ‘high’ or ‘low’ discourse prominence. ‘High’ discourse prominence includes, for example, (more) topical, proximate, high global topicality, high salience, higher discourse topicality, recoverable from context, known, given. ‘Low’ discourse prominence includes new, obviative, and the opposite values from the list for ‘high’. These labels are applied to the argument itself, including if the argument is part of a coargument condition (e.g., 3prox>3obv is recorded as A = ‘high’ and P = ‘low’). The hypothesis is that high discourse prominence is more likely to be indexed and low discourse prominence is more likely to be zero. The results are shown in Figure 12, which is not split by role due to the low number of systems (S = 5, A = 9, P = 11).
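The binary coding described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the coding procedure actually used for the study: the label sets contain just a handful of the labels listed above, and the names `HIGH_LABELS`, `LOW_LABELS`, `prominence`, and `code_scenario` are hypothetical.

```python
# Illustrative sketch only: a small subset of the labels mentioned in the text.
# The set names and function names are hypothetical, not taken from the study.
HIGH_LABELS = {"topical", "proximate", "prox", "given", "known"}
LOW_LABELS = {"new", "obviative", "obv"}


def prominence(label: str) -> str:
    """Map a descriptive label to binary 'high' or 'low' discourse prominence."""
    key = label.strip().lower()
    if key in HIGH_LABELS:
        return "high"
    if key in LOW_LABELS:
        return "low"
    raise ValueError(f"label not covered by this sketch: {label!r}")


def code_scenario(scenario: str) -> dict:
    """Code an 'A>P' scenario such as '3prox>3obv' for each argument separately."""
    a_part, p_part = scenario.split(">")
    # Drop the leading person value ('3prox' -> 'prox') before looking up the label.
    return {
        "A": prominence(a_part.lstrip("123")),
        "P": prominence(p_part.lstrip("123")),
    }


print(code_scenario("3prox>3obv"))
```

For the scenario 3prox>3obv, the sketch records A as ‘high’ and P as ‘low’, in line with the example given above.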

Figure 12: Distribution of marking according to high/low discourse prominence.

Figure 12 confirms the hypothesis: high-prominence arguments are more likely to be indexed, while low-prominence arguments are more likely to be zero, a pattern that is statistically significant (p < 0.01). Compared to other directionality figures, there is less alternation in Figure 12. This suggests that discourse prominence tends to trigger low-level splits; that is, after other conditions have been taken into account, it is the decisive condition that triggers indexing or zero. This is illustrated in (14) for Laguna Keres (Keresan): once the coargument condition has been taken into account (see (8) above), the decisive factor for indexing third-person A or P is topicality (Lachler 2006: 161–162). That is, in 3 > 3 scenarios, the more topical of A and P is indexed (with set A and set B prefixes, respectively) and the non-topical argument is zero.

(14)
Laguna Keres (Keresan)
a.
g-ukacha
3a-see
‘he/she/it saw him/her/it’ (topical A > non-topical P)
b.
dzi-ukacha
3b-see
‘he/she/it saw him/her/it’ (topical P > non-topical A)
(after Lachler 2006: 145, 161–162)

The results for directionality of word order (p = 0.15) and definiteness/specificity (p = 0.08) conditions are not statistically significant. For word order, we consider ‘fronted’ or ‘preverbal’ conominal constructions to be a rough proxy for topicalization, and therefore higher discourse prominence and a greater likelihood of indexation. Only nine systems in eight languages could be coded in this way (S: n = 1, A: n = 3, P: n = 5), and while the results are not significant, slightly more systems conform to the hypothesis than not. For instance, in Yeri (Nuclear Torricelli), P indexing is subject to various conditions, including verb class and person/number features. First- and second-person P arguments are indexed with prefixes, while third person has various options (see Wilson 2017: Ch. 7.3.3). One option is so-called ‘augment’ suffixes, which are sensitive to placement of the conominal: “there is a clear tendency to avoid having an object augmented suffix immediately precede a conominal in natural discourse” (Wilson 2017: 406). The standard word order is AVP (SVO); hence, indexing is less likely (though possible) when there is a conominal in the standard postverbal slot, shown by the lack of P suffix in (15a), and is more likely when a conominal is preverbal, shown in (15b). Note that indexing is also more likely if no overt conominal is present, and that Wilson does not comment on the function of preverbal conominals.

(15)
Yeri (Nuclear Torricelli)
a.
te-n n-ori wɨnoga
3-sg.m 3sg.m-hit.real older.brother
‘He hit the elder brother.’
(Wilson 2017: 404)
b.
wogɨl n-ori-wa-n
kundu.drum 2sg.m-hit.real-aug-sg.m
‘You (sg) beat the kundu drum.’
(Wilson 2017: 183)

Definiteness (and specificity), where definite/specific arguments are more discourse-prominent and therefore more likely to trigger indexing, is a factor in only 11 systems in 7 languages. Somewhat surprisingly, definiteness is a more frequent factor in S and A systems than in P systems (S: n = 4, A: n = 5, P: n = 2). As with the word-order condition, the result is not statistically significant, but more systems conform to the hypothesis than not. For example, in Hanis (Coosan), S and A indexes are optional in the presence of a conominal. Kroeber tentatively suggests that co-occurrence is more likely “when the referent is specific or topical” (2013: 115). This analysis seems to be supported by the example pair in (16): the specific S argument of ‘arrive’ in (16a) is indexed with a proclitic, but the generic S argument of ‘dwell’ in (16b) is not indexed.

(16)
Hanis (Coosan)
a.
či· uxʷ=hél·aq lə=temísin
there 3du=arrive art=grandsons
‘His grandchildren arrived there.’
b.
x̌=qat me· til·áqai
adv(?)=below person dwell
‘People were living down below.’
(Kroeber 2013: 114)

4.3 Verb class

As outlined in the Methodology (Section 3.2), verb classes were categorized as semantic, phonological, arbitrary, or default. The overall frequency of each class type is shown in Figure 13. Multiple classes of the same type are not reflected in this figure; we record only that the category is attested in a system. That is, a system with only phonological classes – as in Chechen – is counted once in the figure, while a system with one arbitrary and one default class (i.e., respectively a small and a large verb class with neither semantic nor phonological motivation), like number indexing in Iha, is counted twice, once per class type (see Section 2.2.4). Since many systems include more than one type, percentages add up to more than 100 %. The hypothesis for event-semantic conditioning, namely that the order from most to least likely to be conditioned by event semantics is S>P>A, is relevant here, since some of the semantic verb classes are motivated by the same semantic categories. The hypothesis is weakly confirmed: semantic classes are slightly more frequent for S (41 %, n = 16) than P (38 %, n = 18), and P has a higher frequency than A (33 %, n = 7). The greatest difference is for phonological classes, which occur in 52 % (n = 11) of A systems but in 31 % (n = 12) of S systems and only 18 % (n = 6) of P systems. Default and arbitrary classes occur with similar frequency for S and A, but with higher frequency for P.
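The counting convention described above, in which a system is counted once per attested class type, can be illustrated with a toy calculation. The sketch below uses only the two systems named in the text and is not the dataset behind Figure 13; the variable and function names are hypothetical.

```python
from collections import Counter

# Toy data: only the two systems mentioned in the text, with the class types
# attributed to them there. This is not the full dataset of the study.
systems = {
    "Chechen": {"phonological"},
    "Iha (number indexing)": {"arbitrary", "default"},
}


def type_shares(systems: dict) -> dict:
    """Percentage of systems in which each verb-class type is attested."""
    # Each system contributes once per class type it contains.
    counts = Counter(t for types in systems.values() for t in types)
    total = len(systems)
    return {t: 100 * n / total for t, n in counts.items()}


shares = type_shares(systems)
print(shares)
print(sum(shares.values()))  # exceeds 100 because Iha contributes two types
```

In this toy case each of the three attested types occurs in one of the two systems (50 %), so the shares sum to 150 %, which shows why the percentages in Figure 13 need not add up to 100 %.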

Figure 13: Frequency of verb-class types in indexing systems conditioned by verb class. Raw frequencies are reported in Supplementary A.3.

The majority of verb-class-conditioned systems have verb classes of more than one type. While there is a significant correlation between individual verb-class types and all three argument roles (χ²(6) = 27.24, p < 0.01), we did not find any main effect or interactions for these verb-class types on A and P roles when fitting a GLM on the data (see Supplementary B.2).

There are no explicit hypotheses for directionality, except where semantic classes are motivated by factors associated with semantic transitivity; see Section 2.2.3. However, given that frequency is assumed to play a role in maintaining minority verb classes (i.e., verb classes with low type frequency are maintained due to high token frequency; see Section 2.2.4), it is pertinent to investigate the frequency of indexing versus zero for the default class. Summarizing across argument roles, the effect of default versus other verb classes on indexing behaviour (indexed, alternating, zero) is statistically significant (χ²(2) = 26.527, p < 0.01). However, this is driven by number-only indexing systems: when these are removed from the total, the effect does not reach significance (χ²(2) = 2.8579, p = 0.24). The difference in the distribution of verb-class types in number-only indexing systems compared to other indexing systems is reported in Supplementary A.3.
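The two p-values reported above can be checked by hand, because for a chi-square statistic with exactly two degrees of freedom the upper-tail probability has the closed form p = exp(−χ²/2). The sketch below applies this to the two statistics. It assumes that they are ordinary Pearson chi-square statistics with df = 2, as the notation suggests; the function name `chi2_sf_df2` is hypothetical and is not taken from the supplementary materials.

```python
import math


def chi2_sf_df2(statistic: float) -> float:
    """Upper-tail probability of a chi-square statistic with exactly 2 df.

    For df = 2 the chi-square distribution is an exponential distribution
    with mean 2, so the survival function reduces to exp(-x / 2).
    """
    return math.exp(-statistic / 2)


# Statistics as reported in the text (assumed here to be Pearson chi-square, df = 2).
p_all_systems = chi2_sf_df2(26.527)
p_without_number_only = chi2_sf_df2(2.8579)

print(f"all systems: p = {p_all_systems:.2g}")
print(f"without number-only systems: p = {p_without_number_only:.2f}")
```

Under this assumption the first statistic gives a p-value far below 0.01, and the second gives p ≈ 0.24, consistent with the values reported above.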

4.4 Event semantics

The condition type ‘event semantics’ was operationalized as semantic transitivity and includes some of the systems conditioned by ‘semantic’ verb classes that are also included in the preceding section. Examples of factors coded as ‘high’ are volitional S/A arguments, affected S/P arguments, events occurring in a brief time frame, and individuated activities. Non-volitional S/A arguments, less affected S/P arguments, events occurring over a longer time frame, and generic events are coded as ‘low’; see Section 3.2.

The hypothesis is that highly transitive events are more likely to have indexing for any role. Given the low numbers involved and the fact that the same directionality is expected for all argument roles, Figure 14 is not split by role. The hypothesis is confirmed, with indexing more likely for high-transitivity predicates and zero more likely for low-transitivity predicates, a result that reached statistical significance (p < 0.01).

Figure 14: Distribution of marking conditioned by event semantics.

An example of a system (partially) conditioned by semantic transitivity is Kamang, illustrated in (4) above, where some verbs index P when it is more affected (high semantic transitivity), and do not index P when it is less affected (low semantic transitivity).

4.5 Co-occurrence restrictions

We expected co-occurrence restrictions to be relatively rare, and indeed they are attested in only 29 systems. In terms of directionality, we expected indexing to be more likely in the absence of a conominal. Figure 15 shows the distribution of marking in systems conditioned by the presence or absence of a conominal. Note that this includes only co-occurrence conditions on indexing and does not take into account the possibility of co-occurrence where this is not a trigger; see Section 3.2. Indexing is more frequent with absent conominals, and zero is not attested. When conominals are present, zero occurs in all argument roles (cf. Breton illustrated in (11) above). However, alternation is the most frequent option, showing that a further factor is decisive in the presence or absence of indexing when a conominal is expressed. Across all argument roles, the effect of co-occurrence restrictions is statistically significant (χ²(2) = 35.161, p < 0.01).

Figure 15: Directionality effect of co-occurrence of conominal for all roles.

4.6 Polarity

The only TAMEP condition for which we had expectations regarding directionality is polarity: indexing is more likely in affirmative constructions, and zero is more likely in negatives. There are 34 polarity-conditioned systems in the sample but, although zero is more frequent in negatives (17.6 %, n = 6) than affirmatives (9.4 %, n = 3), the effect is not statistically significant (p = 0.40). An example of the expected directionality (negation triggering zero) was shown above for Tariana in (10).

4.7 Other conditions

Several conditions occur only rarely in the sample, and we merely provide some examples here. First, while we addressed co-occurrence with a conominal in Section 4.5, a small number of systems are conditioned by particular constituents of the conominal or the co-occurrence of the coargument. In Turkish, presence or absence of 3pl S/A indexing is conditioned by a number of factors, such that indexing is obligatory for arguments not expressed by an overt conominal (shown in (17a)), and optional if a conominal is present (shown in (17b)), depending on animacy, definiteness, and specificity (Kornfilt 1997). Indexing is prohibited for indefinites (regardless of animacy; Göksel and Kerslake 2005: 131) and also if NPs are “accompanied by a quantifier (birkaç ‘a few’ and birçok ‘many’), and are not overtly marked for number” (Bamyacı et al. 2014: 260), shown by the absence of -ler on ‘think’ in (17c).

(17)
Turkish (Turkic)
a.
Bu sabah gel-di-*(ler).
this morning come-pst-*(3pl)
‘They [the suitcases] arrived this morning.’
(Kornfilt 1997: 387 in Bamyacı et al. 2014: 261)
b.
Öğrenci-ler gel-di-(ler).
student-pl come-pst-(3pl)
‘Students came.’
(Sezer 1978: 26 in Bamyacı et al. 2014: 260)
c.
Birçok kişi çocukluğ-u-nu pek düşün-mez.
many people childhood-poss.3sg-acc much think-neg.aor
‘Many people don’t think much about their childhood.’
(Göksel and Kerslake 2005: 119 in Bamyacı et al. 2014: 260)

In (18), Nganasan (Uralic) illustrates A indexing conditioned by the properties of the P argument. There are two sets of suffixes that index A: the ‘subjective conjugation’ (sc), which indexes A only and is zero for 3sg, and the ‘objective conjugation’ (oc), a full paradigm of portmanteau forms that index person/number of A and number of third-person P. The choice between subjective and objective conjugation is based on properties of P: objective conjugation is used for third-person P arguments with high discourse prominence (topical, definite; Dalrymple and Nikolaeva 2011; Wratil 2018: 359; Wagner-Nagy 2018). In addition to the discourse-prominence status of P, the form of the conominal is relevant: objective forms never occur with a pronoun, but “tend to be used” if a lexical NP “appears at the beginning of the sentence and it is not followed immediately by the verb” (Wagner-Nagy 2018: 338–339). Since 3sg A is zero in the subjective conjugation and indexed (in a portmanteau form) in the objective conjugation, properties of the coargument – beyond scenario-based coargument conditions – trigger conditional indexing. In (18a), the objective conjugation, which indexes both A and P (an alternative gloss is “3sg>sg”), is used where no conominal P is present, while in (18b), the subjective conjugation is used (3sg is not indexed) with an overt conominal NP that expresses P immediately preceding the verb.

(18)
Nganasan (Uralic)
a.
Ka’təmi-ʔe-ðu.
look-pf-3sg.oc
‘He has looked at it.’
(Wratil 2018: 357)
b.
ŋonəi-ʔ śigiʔi-ʔ luu-ʔə-m-tu śeri-ʔə
one.more-adv ogre-gen.pl parka-augm-acc-sg.3sg(poss) put.on-pf(3sg.sc)
‘He has put on once more the ogre’s parka.’
(Wratil 2018: 358)

In some instances, indexing is determined by certain elements of the clause: S indexing in Alto Perené (Arawakan) is ‘fluid’, alternating between two sets of affixes depending on a number of factors such as volitionality, control, aspect, and discourse prominence (Mihas 2015: 455). Third person is indexed with ‘set 1’ prefixes, but is zero under the conditions that trigger ‘set 2’ suffixes for other person values, shown in (19a–b). As well as referent property and semantic conditions, we record as ‘other’ the prohibition on set 2 indexes (zero for third person) in the presence of a number of clausal elements, such as the adverb aikiro ‘still’: in (19c), the verb must occur with a set 1 prefix.

(19)
Alto Perené (Arawakan)
a.
o-ja-t-atz-i o-shimaa-t-a
3nm.S.set1-go-ep-prog-real 3nm.S-fish-ep-real
‘She went to fish.’
(Mihas 2015: 165)
b.
ja-ite-tz-i(-∅) i-saik-ashi-vai-tz-i
go-cmpl-ep-real(-3S.set2) 3m.S.set1-be.at-appl.int-dur-ep-real
‘He went to set traps for the animals.’
(Mihas 2015: 164)
c.
aikiro o-shitov-atz-i nija
still 3nm.S.set1-leave-prog-real water
‘The water level keeps rising.’
(Mihas 2015: 127)

Finally, there is a small group of factors that can be loosely described as ‘social’ conditions. These include dialectal or individual differences; for example, in Yeri (Nuclear Torricelli), “the majority of Yeri speakers pronounce a first person plural subject prefix h- [while] a small minority of Yeri speakers do not” (Wilson 2017: 347), shown in (20).

(20)
Yeri (Nuclear Torricelli)
a.
hebi h-or
1pl 1pl-lie.real
‘we sleep’
b.
hebi or
1pl lie.real
‘we sleep’
(Wilson 2017: 346)

5 Discussion and conclusion

Overall, this study shows that in our 83-language sample asymmetrical conditional argument indexing is most frequently influenced by the following (summarized) condition types: referent property, TAMEP, and verb class, in that order (Section 4.1.1). The pervasiveness of referent property conditions is not surprising and justifies the focus on this cluster of conditions in the existing literature. The large proportion of verb-class conditions confirms the observation that lexical restrictions are widespread but have not been systematically included in previous overviews (cf. Iemmolo 2011; Just 2024). In general, we also observed that, in our sample, conditional indexing is highly frequent for all argument roles. However, P is different from S and A in that for P the alternative to conditional indexing is having no indexing at all, whereas S and A have non-conditional indexing instead.

Further differences between S, A, and P have to do with their respective sensitivity to each of the three most frequently attested (summarized) condition types (Section 4.1.2). In particular, TAMEP was found to be most strongly associated with A and S, rather than P, while lexical restrictions (verb class) apply more commonly to P and S as opposed to A. Referent property conditions are most strongly associated with P indexing systems (but see below for the feature condition of person). These associations between argument roles and condition types are statistically significant when the conditions are considered in isolation. However, since many conditional indexing systems are triggered by multiple different condition types, we also considered combinations of the three most common factors for S, A, and P indexing systems, but this did not yield significant results. More data are needed in order to explore more fine-grained argument-role profiles, taking a multi-conditional perspective.

For systems triggered by referent property conditions, the results did not conform entirely to the hypothesis that all sub-conditions (the feature condition of person/number, coargument conditions, animacy, and discourse conditions) would be more frequent for P than A or S. Only animacy was considerably more frequent for P, while discourse factors showed a similar frequency for A and P and slightly lower frequency for S. The latter is perhaps because being selected as the sole argument of a verb is already an indication of high discourse prominence, which is not necessarily the case for both arguments of transitive verbs: since levels of (respective) discourse prominence are more variable, discourse factors are more likely to arise as conditions in A and P indexing than in S indexing.

In Section 4.2, we addressed individual referent property conditions, checking our hypotheses concerning effect direction; in particular whether, for all roles, referents with higher-ranked properties are more likely to be indexed. Concerning the most frequent conditions of feature (person/number/gender/noun class) and coargument, first- and second-person arguments were more frequently indexed than third persons for all roles. This confirms the hypothesis that referent property effects operate in the same direction independently of role. However, contrary to our hypothesis, third-person P does not alternate (between indexing and zero) more frequently than third-person S or A. This expectation follows from the high frequency of third-person P in discourse, and therefore its relative uninformativeness. Additional factors (animacy, definiteness) would then be more relevant for P than S or A. While animacy is a more frequent condition for P than S and A, definiteness/specificity shows the opposite pattern. Thus, the uninformativeness of third-person P in discourse is perhaps only reflected in the prevalence of zero, rather than higher rates of alternation.

As in the case of person, the directionality hypothesis was also borne out by other referent property conditions; that is, for animacy and discourse factors, indexing was more likely for higher-ranked arguments across all roles. However, relatively few systems are sensitive to discourse factors. This may seem unexpected in light of the proposed discourse-based referent-tracking function of indexing in general (see, e.g., Just 2024). It seems then, as has been described in the literature on differential P marking, that systems in which information-structural factors historically played a role tend to grammaticalize into systems that are synchronically subject to different referent property conditions, which typically correlate with topic-worthiness (Iemmolo 2010: 248).

While the directionality hypotheses for person, animacy, and discourse factors were clearly based on existing literature and borne out by the data, the expectation for the feature of number was ambivalent between zero for singular (more economical) and zero for plural (less referentially prominent). Our results favour the former hypothesis, since singular values (both in person/number indexing and in number-only indexing systems) were most often zero.

Verb classes (Section 4.3) have hitherto not been considered in detail as a factor (co-)triggering conditional indexing. Yet, they are overall the third most common condition. Based on existing literature on semantic alignment, the relevance of semantic verb classes for S was expected, but they were also frequently attested for P and A. Phonological classes are particularly common for A (and to a lesser extent S), possibly due to a higher degree of grammaticalization of subject indexes: overall stronger formal boundness of indexes may be reflected in their weakening and eventual disappearance in some phonological environments, resulting in different indexing behaviour of verb classes with specific phonological properties.

In general, many systems involve a combination of default and arbitrary classes or, in other words, a majority indexing pattern and an idiosyncratic minor group of verbs that deviates from this. For number-only systems, default classes are less likely to exhibit indexing, compared to non-default classes (cf. Corbett 2000: 259). It is to be expected that non-default classes, which by definition have relatively low type frequency, have high token frequency, in order for the indexing pattern to be maintained over time (cf. Polinsky and Comrie 1999 on Tsez). This information is rarely available in grammars, and crosslinguistic corpus data would be required to investigate this hypothesis.

The remaining conditions considered in our study were relatively less common. Setting aside the especially rare conditions in Section 4.7, the remaining conditions of event semantics (Section 4.4), co-occurrence restrictions (Section 4.5), and polarity (Section 4.6) showed the expected directionality: indexing is associated with properties of prototypically high-transitivity events, with the absence of a conominal, and with affirmative rather than negative constructions. However, only the results for event semantics and co-occurrence restrictions were statistically significant. As far as the latter are concerned, they are not frequent, but the results seem to be in line with the idea that the expression of an index and an overt conominal is redundant. Thus, they can be interpreted as support for a weak version of the complementarity hypothesis (cf. Schnell and Barth 2020; see Section 2.2.5).

In sum, our study confirms the claim that, despite conditional P indexing attracting the most attention, conditional indexing of A is widespread (Just 2022, 2024), given that it was highly frequent in a sample that did not select for equal representation of conditional indexing per role. Conditional S indexing was also frequent, a phenomenon that has been studied rather separately (cf. Just 2024). Our results confirmed the hypothesis that referent property conditions operate in the same direction across all roles, in stark contrast to differential flagging, in which A and P show opposite directionality effects. We take this as evidence in favour of the use of indexing as a means of tracking prominent referents of any role in discourse and as ultimately resulting from the grammaticalization of pronouns as the (originally) preferred way of encoding such prominent referents. However, an interesting avenue for further exploration is whether the synchronic situation has a different diachrony for each role: the end of a grammaticalization cline for pronouns is zero, so if A is further along the cline, the question arises whether a synchronic ‘zero’ is due to loss of a marker or to a marker never having developed (due to, e.g., lack of third-person pronouns). For P, if indexing is less grammaticalized, ‘zero’ is perhaps more likely to be due to indexing never having developed in the first place. The high frequency of TAMEP conditioning and phonologically conditioned verb classes for A arguments points in this direction.

While our study provides quantitative support for trends suggested in recent earlier studies, at the same time it offers a more comprehensive picture of conditional indexing, not only by including all three argument roles, but also by considering a wider variety of factors, in particular verb classes. Another aspect in which our study goes beyond earlier studies is that we explicitly include feature conditions (i.e., person/number/gender) as a potential factor in conditional indexing. This means that so-called paradigmatic zeroes have a natural place in our typology, where they have typically been left aside. In considering all roles and increasing the number of possible conditions, we were able to present conditional indexing as a unified phenomenon. The fact that we see, on the one hand, clear differences between roles in terms of the type of conditions they tend to be sensitive to and, on the other hand, clear similarities in the effects of various conditions across all roles, lends support to this holistic approach.

We arrived at these conclusions by means of detailed coding of our 83-language sample, which was composed based on data extracted from AUTOTYP (Section 3.1). In Walker and Van Lier (in prep.), we reflect in more detail on this way of generating a sample, and on the use of large, open-access typological databases more generally.

In closing, we point out three areas we think would merit further exploration in future research. Firstly, grammatical descriptions often mention that indexing is ‘optional’ without pinpointing which factor or factors are decisive in determining whether an index actually appears. These kinds of probabilistic factors are best studied using corpus-based and/or experimental methods. More generally, experimental studies on conditional indexing appear to be lacking to date. Secondly, our treatment of verb classes in this study has been fairly basic. Given the pervasiveness of verb-class conditions, however, a more detailed study of verb classes in and across languages is in order, especially taking into account type versus token frequency. Finally, we have not applied a crosslinguistically comparable classification of TAME conditions. As a result, we were not able to discern any trends in terms of associations of certain TAME values with the presence or absence of indexing. This might be a promising avenue to explore further, given the link between tense/aspect values and coding splits in flagging, traditionally known as split ergativity (see, e.g., McGregor 2009: 490–492).

Abbreviations

1 = first person
2 = second person
3 = third person
i, iii = Kamang indexing series
a, b = Laguna Keres/Chol indexing series
A = actor
acc = accusative
adv = adverb
affr = affirmative
anph = anaphoric
aor = aorist
appl = applicative
art = article
assert = assertion
aug = augment
augm = augmentative
aux = auxiliary
cls = classifier
cmn = common person
cmpl = completive
cop = copula
dp = dummy pronoun
du = dual
dur = durative
ep = epenthetic
gen = genitive
infer = inference
int = intent
intens = intensifier
ipfv = imperfective
m = masculine
n = neuter
neg = negative
nm = non-masculine
nom = nominalizer
nonvis = non-visual
obv = obviative
oc = objective conjugation
pcl = preverbal particle
pf = present perfect
pfv = perfective
pl = plural
poss = possessive
prog = progressive
prox = proximate
prs = present
pst = past
real = realis
rpst = recent past
S = subject of intransitive verb
sc = subjective conjugation
seq = sequential
set1/set2 = Alto Perené indexing series
sg = singular
sp = Spanish borrowing
spec = specific
tv = transitive verb in perfective
vis = visual


Corresponding author: Eva van Lier, Department of Linguistics, University of Amsterdam, Amsterdam, Netherlands, E-mail:

Funding source: NWO Social Sciences and Humanities

Award Identifier / Grant number: VI.Vidi.195.008

  1. Research funding: Dutch Research Council (NWO, file number VI.Vidi.195.008).


Supplementary Material

This article contains supplementary material (https://doi.org/10.1515/ling-2024-0091).


Received: 2024-05-16
Accepted: 2025-07-25
Published Online: 2026-01-20

© 2025 the author(s), published by De Gruyter, Berlin/Boston

This work is licensed under the Creative Commons Attribution 4.0 International License.
