Abstract
Despite extensive research on the ba-construction in Chinese, the diachronic change in the alternation between the ba and jiang constructions has received little attention. The present study takes a multifactorial approach to examine the factors that probabilistically condition the alternation based on diachronic data across twelve centuries. The results suggest two general trends. First, the odds of the ba-construction have increased over time at the expense of the jiang-construction. Second, over time, the effect size of the significant preference for the jiang-construction in informal genres has reduced from the 10th to the 19th century, and this preference has disappeared in modern times; accordingly, both informal and formal genres have converged to favor the ba-construction in modern times. Regression modeling also shows that there are both stable linguistic constraints (parallelism/syntactic priming, verb type, NP2 animacy, and NP2 length) and fluid constraints (adjunct semantics, and genre). This study advances our knowledge of the two disposal constructions and their evolution, sheds light on the Principle of No Synonymy (Bolinger, Dwight. 1977. Meaning and form. New York: Longman; Goldberg, Adele E. 1995. Constructions: A construction grammar approach to argument structure. Chicago: The University of Chicago Press; Goldberg, Adele E. 2002. Surface generalizations: An alternative to alternations. Cognitive Linguistics 13(4). 327–356), and makes a methodological contribution to the empirical testing of hypotheses. It can also provide insight into grammatical alternations in Mandarin.
1 Introduction
There seems to be a wide consensus that the basic word order of Mandarin Chinese is SVO order (e.g., Light 1979; Sun and Givon 1985) but there also exists a widely used syntactic configuration of SOV word order, whereby the object is preceded by the markers ba or jiang and is followed by the predicate verb (Sun and Givon 1985; Xing 1994: 201). This syntactic structure, which can be represented as “subject/NP1 + ba/jiang + object/NP2 + verb (+ X)” as in (1),[1] is commonly labeled “disposal form” or “disposal construction”. The term was introduced by Wang (1947) and it has generally been accepted in Chinese linguistic circles because the marked object in the construction tends to code a patient which undergoes an explicit change of state (Chappell 2013; Peyraube and Wiebusch 2020; Wang 1947).
| (1a) 弗兰西把书紧紧抱着, …… | ||||||
| Fúlánxī | bǎ | shū | jǐnjǐn | bào | zhe, | … |
| Francie | BA | book | tightly | hug | prog | |
| ‘Francie hugged the book tightly, …’ | ||||||
| (1b) 她已经将一头乌亮的美发全部剃光, 人也苍白瘦削了。 | ||||||||
| Tā | yǐjīng | jiāng | yītóu | wūliàng | de | měifǎ | quánbù | |
| She | already | JIANG | one.head | black.bright | gen | beautiful.hair | all | |
| tì | guāng, | rén | yě | cāngbái | shòuxuē | le. | ||
| shave | unleft, | people | also | pale | thin | perf. | ||
| ‘She has shaved all her black hair, and she is pale and thin.’ | ||||||||
It is commonly accepted that the disposal markers ba and jiang can be traced back to the first verb (= V1, meaning ‘take/hold’) in a serial verb construction, as exemplified for ba in (2):
| (2) 醉把茱萸仔细看。 | ||||
| Zuì | bǎ | zhūyú | zǐxì | kàn. |
| Drunk | BA | dogwood | carefully | watch |
| ‘(He) held the dogwood and watched it carefully./(He) watched the dogwood carefully.’ | ||||
A crucial process in the development of these disposal constructions is the grammaticalization from V1 to object marker, resulting in examples such as (1a) and (1b), where ba and jiang can no longer be interpreted as verbs (see Chappell and Peyraube 2011: 787–790; see also Chao 1968; Hopper and Traugott 2003: 28; Li and Thompson 1981: 466–479; Sun 2015: 429–442; Wang 1944).[2] Given that the ba- and jiang-constructions have been claimed to have undergone the same process of grammaticalization and thus to have acquired the same disposal function over time (Cao and Long 2005; Wu 2003; among others), they can be considered interchangeable alternants of the Mandarin disposal construction.
The view that the two disposal constructions encode the same function, however, has been challenged in a number of diachronic and synchronic studies. Xing (1994), for instance, finds a register difference between ba and jiang: in the 16th century, ba was more likely to be used in informal texts while jiang was preferred in formal texts, and this difference has resulted in the disappearance[3] of jiang in written texts of modern times. In another diachronic study, Zhu (2016) investigates the distribution of the ba-construction and the jiang-construction in typical works from the dynasties of the early modern Chinese period (from the Tang dynasty to the Qing dynasty, i.e., 7th–19th century). Zhu finds that in the initial stages, i.e., from the Tang dynasty to the Yuan dynasty (7th–14th century), the jiang-construction was much more frequent than the ba-construction, but from the Ming dynasty onwards (15th–17th century), the ba-construction began to overtake the jiang-construction. Like Xing, Zhu attributes this change of frequency to register: it was in colloquial texts that the ba-construction gained ground at the expense of the jiang-construction. Based on a synchronic investigation of ancient vernacular novels, Lu (2006) finds that the difference between ba and jiang mainly lies in the following three aspects: first, the ba-construction is usually used to describe the vulgarity of people;[4] second, the ba-construction has a more prominent descriptive function, that is, it conveys a more vivid picture of the situation described; third, the object of the ba-construction can be very long while that of the jiang-construction is relatively short. Jing-Schmidt and Tao (2009), then, suggest that ba and jiang share the basic meaning of disposal, but contrast in terms of subjectivity and emotionality.
Specifically, the ba-construction prototypically signifies subjectivity and emotionality while the jiang-construction prototypically signifies objectivity and precision (Jing-Schmidt and Tao 2009: 36). Wu (2013) finds that while the ba-construction and the jiang-construction in modern Chinese are largely interchangeable, they differ in that the jiang-construction has a strong color of Literary Chinese, viz., an antique flavor, while the ba-construction has a strong color of colloquialism and is usually used in colloquial texts.
As can be seen, these earlier studies largely ascribe the distinction between ba and jiang to pragmatic differences, with register as an important motivation (in addition to such factors as emotionality and subjectivity). In so doing, the studies subscribe to Goldberg’s (1995) Principle of No Synonymy, whereby two constructions that are formally different are also different semantically and/or pragmatically. Furthermore, rather than being based on wide-ranging corpus data, these studies are more limited in scope in that they only consider data from typical works of particular periods – whether from a diachronic or synchronic angle. The diachronic studies (Xing 1994; Zhu 2016), furthermore, only quantify the distribution differences of the two disposal constructions. In the present – diachronic – investigation, we advocate a single meaning approach, where the ba- and jiang-constructions are seen as essentially semantically/pragmatically equivalent; that is, semantic/pragmatic differences between the ba- and jiang-alternants may well exist (e.g., the register differences discussed above), but they have no truth-conditional value (see also Section 2.2). In line with the view that grammatical variation and change are multifaceted and non-deterministic (e.g., Bernaisch et al. 2014; Bresnan 2007; Theijssen et al. 2013), this study probes into the probabilistic nature of the ba versus jiang alternation, i.e., it examines the language-internal and language-external factors probabilistically constraining the alternation, as well as the change in these factors.
In sum, the present study adopts a corpus-based variationist approach to address the following three questions:
What are the factors that constrain the alternation between the ba-construction and the jiang-construction?
How did these factors change diachronically?
What are the motivations underlying this change of factors?
This paper is structured as follows. Section 2 describes data retrieval and annotation, as well as the circumscription of variable contexts. In Section 3, we introduce the regression model. Section 4 reports and discusses the results of mixed-effects regression analysis. Section 5 summarizes the findings of this study.
2 Methods
2.1 Data retrieval
Data for this study cover the period from the 10th century to the present day and comprise both Literary and Modern Mandarin Chinese. Data in Literary Chinese were extracted from the Center for Chinese Linguistics Corpus (henceforth CCL, Peking University) (Zhan et al. 2003), which spans about 3,000 years (1000 BCE – present), beginning with one of the earliest dynasties recorded in Chinese history, viz., the Zhou Dynasty (1046 BCE – 256 BCE). It subsumes two major sub-corpora: The Literary Chinese corpus (henceforth CCL-LC), the sub-corpus from which our Literary Chinese data were taken, and the Modern Chinese corpus (henceforth CCL-MC). The CCL-LC corpus covers the period from the Zhou Dynasty to the early years of the Republican era (1912–1920s), comprising 163,662,943 characters. Texts in CCL are from various genres, including fiction, drama, libretti, biography, history, news coverage, religious texts (Buddhist and Taoist), miscellanies, notes, dictionaries, and poetry.
Data in Modern Chinese were retrieved from the Beijing Language and Culture University Corpus Center (BCC) (Xun et al. 2016). BCC has a total size of about 9.5 billion characters and it comprises five genre-based sub-corpora: news reportage (2 billion), literary works (3 billion), dialogues (600 million), multi-genre (1.9 billion), and Literary Chinese (2 billion). The Modern Chinese data for our study were taken from the first three sub-corpora: news reportage, literary works, and dialogues.[5]
It should be noted that the CCL-LC corpus is dynasty-based. Chronological labeling of its texts is not in terms of specific years, but in terms of the historical dynasty in which they were created; each dynasty can therefore be seen as a sub-corpus of its own. Furthermore, taking into consideration proposals concerning the periodization of the Chinese language history (e.g., Sun 1996; Wang 2004 [1980]), the earliest three dynasties relevant for our study[6] – Wudai, Northern Song, and Southern Song – were conflated into period 1 (10th–12th century) and the following three dynasties – Yuan, Ming, and Qing – into period 2 (13th–19th century).[7] Period 3, then, is made up of the most recent period, i.e. Modern Chinese, and comprises the genre-based sub-corpora “dialogues”, “news reportage”, and “literary works” in BCC; see Table 1.
Table 1: Token counts retained in each period.
| | Period 1: Wudai | Period 1: Northern Song | Period 1: Southern Song | Period 2: Yuan dynasty | Period 2: Ming dynasty | Period 2: Qing dynasty | Period 3: Modern Chinese | Total |
|---|---|---|---|---|---|---|---|---|
| Ba-disposal | 33 | 160 | 112 | 368 | 374 | 714 | 1,129 | 2,890 |
| Jiang-disposal | 0 | 398 | 123 | 324 | 256 | 521 | 300 | 1,922 |
To extract the ba- and jiang-constructions, we first retrieved from the various CCL-LC and BCC sub-corpora all concordances involving ba/jiang, using the Chinese characters ba or jiang as the search query. The number of ba and jiang concordances thus retrieved from the Wudai, Northern Song, Southern Song, and Yuan sub-corpora was fairly limited, allowing processing (manual removal of noise[8] and coding) of all tokens. Accordingly, data counts for these dynasties, as presented in Table 1, comprise all relevant data. For the Ming and Qing sub-corpora, however, the number of retrieved concordances amounted to over 20,000, which is too many to process manually. Thus, for each of these sub-corpora, we first sampled 1,000 ba-tokens as well as 1,000 jiang-tokens from the sets of retrieved concordances involving ba and jiang. All sampled concordances were then checked to remove noise (see footnote 8). The proportion of valid ba versus jiang tokens resulting from this operation was adjusted to reflect the proportion of valid ba versus jiang tokens within each sub-corpus as a whole.[9] As to the BCC sub-corpora, the number of concordances obtained involving ba and jiang ranged between 34,000 and 1,603,000, so sampling was likewise necessary. Care was taken that the valid ba and jiang tokens (i.e., the tokens retained after manual cleanup of noise) not only reflected the proportion of valid ba and jiang tokens in the whole corpus (cf. the processing of the tokens from the Ming and Qing dynasties), but also the distribution of ba and jiang tokens over the three genres. The token counts thus obtained are presented in Table 1.
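As a rough illustration of how the dynasty-level counts in Table 1 roll up into the three periods, the following minimal Python sketch sums the counts per period and computes the share of ba tokens. The counts are copied from Table 1; the dictionary layout and the function names are assumptions made for this illustration and are not part of the original workflow.

```python
# Token counts (ba, jiang) copied from Table 1, one entry per sub-corpus.
COUNTS = {
    "Wudai": (33, 0),
    "Northern Song": (160, 398),
    "Southern Song": (112, 123),
    "Yuan": (368, 324),
    "Ming": (374, 256),
    "Qing": (714, 521),
    "Modern Chinese": (1129, 300),
}

# Periodization as described in Section 2.1 (labels are illustrative).
PERIODS = {
    "period_1": ["Wudai", "Northern Song", "Southern Song"],
    "period_2": ["Yuan", "Ming", "Qing"],
    "period_3": ["Modern Chinese"],
}


def period_totals(counts, periods):
    """Sum ba and jiang counts over the sub-corpora of each period."""
    totals = {}
    for period, subcorpora in periods.items():
        ba = sum(counts[name][0] for name in subcorpora)
        jiang = sum(counts[name][1] for name in subcorpora)
        totals[period] = (ba, jiang)
    return totals


def ba_share(ba, jiang):
    """Proportion of ba tokens among all disposal tokens."""
    return ba / (ba + jiang)


if __name__ == "__main__":
    for period, (ba, jiang) in period_totals(COUNTS, PERIODS).items():
        print(f"{period}: ba={ba}, jiang={jiang}, ba share={ba_share(ba, jiang):.3f}")
```

These raw shares are descriptive only; unlike the regression model, they do not control for the other explanatory factors.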
2.2 Circumscribing the variable context
In this study, we adopt the quantitative variationist method to examine the factors probabilistically conditioning the choice between the two alternants of the disposal construction: the ba-construction and the jiang-construction. Specifically, we will set up a mixed-effects logistic regression model, with the two competing syntactic constructions as the values of a binary dependent variable and ten factors as independent variables.
In this study, it is assumed that the variants are largely interchangeable semantically/pragmatically (the single meaning approach). As was pointed out before, “semantic differences may well exist, but they have no truth conditional value” (De Smet 2019: 308). Nonetheless, the application of the variationist method to the study of syntactic variation has been questioned (e.g., Cheshire 1987; Lavandera 1978; for a recent overview, see Leclercq and Morin 2023), as it is always possible to come up with a semantic difference between two constructions. Recall that Section 1 also discussed a number of studies challenging the view that the ba- and jiang-constructions have the same function. Nevertheless, Sankoff (1988: 153) defended the single meaning approach, noting that “While it is indisputable that some difference in connotation may, upon reflection, be postulated among so-called synonyms whether in isolation or in context … there is no reason to expect these differences to be pertinent every time one of the variant forms is used” (emphasis in the original). Kapatsinski (2009: 160) also argues that even if speakers/writers are actually aware of semantic distinctions when they make choices between competing constructions, the analyst is still justified in using multivariate analysis to study the choice as “the context of use may provide information about the semantics of the chosen variant”. In short, distinctions in referential value or grammatical function among competing forms can be neutralized in discourse (Sankoff 1988: 153).
Accordingly, we feel justified in considering the ba-construction interchangeable with the jiang-construction if they can be mutually paraphrased, with no semantic/functional change, in the same context.
2.3 Explanatory factors
The tokens of the two disposal constructions (retained after cleanup of noise) were coded for a range of language-internal and language-external/lectal factors assumed to condition the variation. They are briefly presented in what follows.
2.3.1 Language-internal factors
Verb type. It is well known that the Mandarin disposal constructions involving ba and jiang developed from serial verb constructions in Chinese linguistic history (e.g., Cao and Long 2005; Peyraube and Wiebusch 2020). As the two disposal markers – ba and jiang – had different original lexical meanings (Cao and Long 2005: 324), we posit that the verbs that collocate with them may vary diachronically and may affect the variation between ba and jiang. Hence, the semantics of the 1,569 verbs were coded according to the taxonomy of semantic domains proposed by Biber et al. (1999: 360). After collapsing some categories owing to data sparsity, we ended up distinguishing four categories of verbs: (i) activity verbs (3), which consist of the types “activity verbs” and “communication verbs” proposed by Biber et al. (1999), (ii) verbs of mental activity (4), (iii) verbs of occurrence and existence (5), which comprise Biber et al.’s existence verbs, occurrence verbs, and change-of-state verbs such as 杀 sha ‘kill’ and 斩首 zhanshou ‘behead’, and (iv) other types of verbs (6).
| (3) 因此,我就把我的花驴拴在小榆树儿上 …… | |||||||||||
| Yīncǐ, | wǒ | jiù | bǎ | wǒde | huā | lǘ | shuān | zài | xiǎo | ||
| Therefore, | I | then | BA | my | colorful | donkey | tie | on | little | ||
| yúshùer | shang | ||||||||||
| elm.tree | loc | ||||||||||
| ‘So I tied my donkey to the little elm tree…’ | |||||||||||
| (4) 他埋怨弥娜把他忘了。 | ||||||
| Tā | mányuàn | mínà | bǎ | tā | wàng | le. |
| He | complain | Mina | BA | he | forget | perf |
| ‘He complained that Mina forgot him.’ | ||||||
| (5) 西斜的日头把后窗照明亮如烛。 | ||||||||||
| Xī | xié | de | rìtou | bǎ | hòu | chuāng | zhào | míngliàng | rú | zhú. |
| West | diagonal | gen | sun | BA | rear | window | shine | bright | like | candle |
| ‘The sun slanting in the west shines on the rear window, making it as bright as a candle.’ | ||||||||||
| (6) 安公子在山中料理了二日, 方将诸事办妥。 | |||||||
| Āngōngzǐ | zài | shān | zhōng | liàolǐ | le | èr | rì, |
| Angongzi | in | mountain | loc | deal.with | perf | two | day, |
| fāng | jiāng | zhū | shì | bàn | tuǒ | ||
| then | JIANG | all | matters | do | finish | ||
| ‘An Gongzi attended to matters in the mountains for two days before everything was properly settled.’ | |||||||
It should be noted that in addition to “verb type” as a fixed-effect variable, we also incorporate specific “verb” as a random-effect variable in the regression analysis to gauge the by-item random effect that is by now customarily included in the multivariate study of syntactic variation (Hinrichs et al. 2015: 821).
Animacy of the object (NP2). The Mandarin disposal construction features differential object marking (DOM) (Peyraube and Wiebusch 2020), defined as the phenomenon in which certain objects of verbs are marked to reflect various semantic and pragmatic factors (Aissen 2003). Specifically, DOM in Mandarin involves the overt marking of the direct object by ba or jiang in its non-canonical preverbal position. As previous studies have revealed that animacy plays an important role in DOM cross-linguistically (cf. de Swart 2007 for Kannada; Comrie 1989 for Hindi; Epps 2008 for Hup; Shain and Tonhauser 2010 for Paraguayan Guaraní; Comrie 1989 for Russian, and von Heusinger and Kaiser 2003 for Spanish), animacy of NP2 is included as a variable in this multivariate study. To code the data for animacy, we followed the taxonomy proposed by Zaenen et al. (2004) and distinguished seven categories: animal, concrete entities, human, nonconcrete beings, organization (collective), place, and time. As the data for the categories of animal, organization, and place are sparse and there are no instances of the category “time”, we conflated animal and human into “animate”, nonconcrete beings and organization into “abstract entities”, and assigned place to “concrete entities”. That is, our final coding distinguished three categories for this variable: animate, concrete entities,[10] and abstract entities.
Definiteness of NP2. In addition to animacy, previous studies also corroborate that definiteness plays a role in the case-marking of direct object (Aissen 2003; Bossong 1985; De Swart and De Hoop 2018).
The semantic feature of definiteness seems to be quite obvious for some Indo-European languages such as English. In those languages, definite NPs are marked with definite articles, demonstrative pronouns, or possessive pronouns, or they are proper nouns per se; however, this is not the case for Mandarin Chinese, as it does not have definite articles. NPs in Mandarin can be classified into seven categories (Chen 2004): proper nouns, personal pronouns, NPs with the demonstrative pronouns 这 zhe/那 na ‘this/that’, bare nouns, numeral + quantifier + NP, 一yi ‘one’ + (quantifier) + NP, and numeral + NP. The first three categories are always definite, while the last two categories are always indefinite. The definiteness of the remaining two categories – bare nouns and numeral + quantifier + NP – depends on the context, i.e., if the NPs have been mentioned in the preceding context, they will be marked as definite, otherwise as indefinite. In addition, NPs with possessive pronouns are also definite.
Length of NP2. Previous studies (e.g., Lu 2006) point out that the object/NP2 of the ba-construction tends to be longer than that of the jiang-construction. To test this claim, the present study includes the length of the object/NP2 as a predictor. Since counting the number of characters has become the standard way of measuring length in Chinese (e.g., Liu 2007), and a character in either Literary Chinese or Modern Chinese corresponds to a syllable and also to a morpheme, we measure the length of the object/NP2 in terms of the number of orthographic characters. To reduce skew and the effect of outliers, NP2 length is capped at 10 characters. The capped values are then log-transformed to bring their distribution closer to normal: log(number of characters in NP2).
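The length measure just described can be sketched as follows. This is a minimal illustration, assuming that the NP2 is available as a string containing only the characters to be counted (no spaces or punctuation) and that the natural logarithm is intended; the function and constant names are hypothetical.

```python
import math

MAX_NP2_LENGTH = 10  # cap stated in the text


def log_np2_length(np2: str, cap: int = MAX_NP2_LENGTH) -> float:
    """Count characters in NP2, cap the count, and take the natural log.

    Assumes `np2` holds only the characters to be counted.
    """
    n_chars = len(np2)
    if n_chars < 1:
        raise ValueError("NP2 must contain at least one character")
    return math.log(min(n_chars, cap))


if __name__ == "__main__":
    # NP2 of example (1b): seven characters, below the cap.
    print(log_np2_length("一头乌亮的美发"))
    # A hypothetical 15-character NP2 is treated as if it had 10 characters.
    print(log_np2_length("字" * 15))
```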
Semantics of adjunct. Chinese syntax is characterized by a post-verbal restriction (e.g., Huang 1982, 1984; Xing 1994: 209), that is, objects cannot co-occur with complex adjuncts in post-verbal positions. Accordingly, we posit that the semantics of adjuncts influence the choice of disposal constructions. The annotation distinguishes three levels – adverbial (as in (7)), complement (as in (8)), and without adjunct (as in (9)) – to gauge the effect of different types of adjuncts and to distinguish tokens with adjuncts from those without adjuncts.
| (7) 保守思想把一些人束缚得紧紧的。 | |||||||
| Bǎoshǒu | sīxiǎng | bǎ | yīxiē | rén | shùfù | dé | jǐnjǐnde |
| Conservative | thought | BA | some | people | bind | de | tightly |
| ‘Conservative thoughts bind some people tightly.’ | |||||||
| (8) 巴黎出版的一家杂志直截了当地把 1988 年称为非洲 “有毒废料年”。 | |||||||||
| Bālí | chūbǎn | de | yī | jiā | zázhì | zhíjiéliǎodāngdì | bǎ | 1988 | nián |
| Paris | publish | gen | one | class | journal | directly | BA | 1988 | year |
| Chēng | wéi | fēizhōu | “yǒudú | fèiliào | nián” | ||||
| call | as | Africa | “toxic | waste | year” | ||||
| ‘A magazine published in Paris directly referred to 1988 as Africa’s “toxic waste year”.’ | |||||||||
| (9) 次日凌晨4时, 秘密蹲守的干警将一伙伺机作案的犯罪分子抓获。 | ||||||||||
| Cì | rì | língchén | 4 | shí, | mìmì | dūnshǒu | de | gànjǐng | ||
| Next | day | morning | 4 | o’clock, | secret | keeping.watch | gen | police.officer | ||
| jiāng | yī | huǒ | sìjī | zuò’àn | de | fànzuìfēnzǐ | zhuāhuò | |||
| JIANG | one | class | wait.for | commit.crime | gen | criminal | arrest | |||
| ‘At 4 am the next day, the police officers secretly lying in wait arrested a gang of criminals who were looking for a chance to commit a crime.’ | ||||||||||
Syntactic parallelism. This factor is intended to gauge the priming effect of syntactic structure, i.e., to capture the effect of a preceding ba-construction or jiang-construction on the repetition of that disposal construction in a following clause. The coding of the predictor distinguished two levels: yes (having a preceding parallel construction), and no (having no preceding parallel construction). When retrieving the data, we set a window of 30 characters to the left and 30 characters to the right of the query. If the resulting concordance/token of altogether 61 characters comprises two or more disposal constructions (no matter whether they are two (or more) ba-constructions, or two (or more) jiang-constructions, or one (or more) ba-construction and one (or more) jiang-construction), the token is coded as “yes” for the variable “parallelism”. For instance, in (10), there are two jiang-constructions.
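The window-based coding can be approximated with the short Python sketch below. It only flags candidates: the characters 把 and 将 also have non-disposal uses, so each hit would still need the manual check described in Section 2.1. The function names and the lower-case labels are assumptions for this illustration.

```python
# Characters used as disposal markers; other uses of these characters
# (noise) are NOT filtered out here and require manual checking.
DISPOSAL_MARKERS = {"把": "ba", "将": "jiang"}
WINDOW = 30  # characters to the left and to the right of the query


def concordance_window(text: str, hit_index: int, window: int = WINDOW) -> str:
    """Return the query character plus `window` characters on each side."""
    start = max(0, hit_index - window)
    return text[start:hit_index + window + 1]


def marker_hits(window_text: str) -> list:
    """List the candidate disposal markers found in the window, in order."""
    return [DISPOSAL_MARKERS[ch] for ch in window_text if ch in DISPOSAL_MARKERS]


def parallelism_candidate(text: str, hit_index: int) -> str:
    """Code 'yes' if the window holds two or more candidate markers."""
    hits = marker_hits(concordance_window(text, hit_index))
    return "yes" if len(hits) >= 2 else "no"


if __name__ == "__main__":
    sentence = "他把书放下,又把门关上。"  # hypothetical example sentence
    print(parallelism_candidate(sentence, sentence.index("把")))
```

Because the window is symmetric, this sketch, like the coding described above, also counts a second construction that follows the query rather than precedes it.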
| (10) 女子无声地笑了笑, 将双腿在草地上放平。 “草也不错。” 陈河摸着草继续说。 他看到风将女子的头发吹拂起来。 | |||||||||||
| Nǚzǐ | wúshēngde | xiào | le | xiào, | jiāng | shuāng | tuǐ | zài | |||
| Woman | silently | smile | perf | smile, | JIANG | two | leg | on | |||
| cǎodì | shàng | fang | píng. | “Cǎo | yě | bùcuò.” | Chénhé | mō | zhe | ||
| Lawn | loc | put | flat. | “grass | also | good.” | Chenhe | touch | progr | ||
| cǎo | jìxù | shuō. | Tā | kàn | dào | fēng | jiāng | nǚzǐ | de | ||
| grass | continue | say. | He | see | arrive | wind | JIANG | woman | gen | ||
| tóufǎ | chuīfú | qǐlái. | |||||||||
| hair | blow | get.up | |||||||||
| ‘The woman smiled silently and laid her legs flat on the grass. “The grass is not bad either,” Chenhe continued, touching the grass. He saw the wind blowing the woman’s hair.’ | |||||||||||
2.3.2 Language-external factors
Genre. Genre is included as an explanatory factor, as previous research has shown that syntactic alternations may not only be contextually constrained by language-internal factors but also by language-external ones such as genre (see Cappelle 2009; Levshina et al. 2013; Röthlisberger and Tagliamonte 2020). In this respect, it has been suggested that the ba-construction tends to be favored in colloquial genres (e.g., Jing-Schmidt 2005; Jing-Schmidt and Tao 2009). In operationalizing the factor “genre”, we keep in mind that the various sub-corpora/dynasties of our dataset are not genre-balanced (some genres are only present in some sub-corpora and do not appear in other sub-corpora), and that tokens from some genres such as Buddhist scriptures are sparse. Hence, it was decided to merge related genres and assign the eleven genres in our dataset to two broad genre categories: formal and informal. Specifically, the formal genre comprises Buddhist scriptures, poems, fiction, literary works, history books, practical writings, and news reportage, while the informal genre comprises drama, letters, record of personal utterances, dialogues, microblog, and scripts for story-telling.
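The merging of genres into two broad categories amounts to a simple lookup, sketched below in Python. The genre names are taken from the paragraph above; the lower-case string labels and the function name are assumptions for this illustration, since the labels actually used in the annotation are not given.

```python
# Genre names as listed in the text; the exact annotation labels are assumed.
FORMAL_GENRES = {
    "buddhist scriptures", "poems", "fiction", "literary works",
    "history books", "practical writings", "news reportage",
}
INFORMAL_GENRES = {
    "drama", "letters", "record of personal utterances",
    "dialogues", "microblog", "scripts for story-telling",
}


def genre_category(genre: str) -> str:
    """Map a fine-grained genre label to 'formal' or 'informal'."""
    key = genre.strip().lower()
    if key in FORMAL_GENRES:
        return "formal"
    if key in INFORMAL_GENRES:
        return "informal"
    raise ValueError(f"unknown genre: {genre!r}")


if __name__ == "__main__":
    for g in ("Drama", "News reportage"):
        print(g, "->", genre_category(g))
```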
Period. As Jing-Schmidt and Tao (2009: 32) note, it is generally agreed in the literature that the distribution between the ba-construction and the jiang-construction changes over time. Tokens were therefore coded for the historical period in which they were created. Periodization of the tokens will also allow us to examine how the effects of other explanatory factors change over time. As mentioned earlier (see Section 2.1), three levels were distinguished: period 1, comprising Wudai (the 10th century), Northern Song (the 11th century), and Southern Song (the 12th century); period 2, comprising the Yuan dynasty (the 13th–14th century), the Ming dynasty (the 15th–16th century), and the Qing dynasty (the 17th–19th century); period 3 covering the Modern Chinese period (the 20th–21st century).
Corpus file. This factor captures individual bias at the level of the corpus text sample. It approximates a by-subject random effect which is often included in the multivariate study of language variation (Hinrichs et al. 2015: 821). Altogether the tokens in our dataset come from 1,059 files.
All the annotated independent variables and their values are listed in Table 2.
Table 2: Independent variables.
| Variable | Values |
|---|---|
| For fixed effects: categorical variables | |
| Verb type (verb_type) | Activity, mental activity, occurrence and existence, other |
| NP2 animacy (NP2_Animacy) | Animate, concrete, abstract |
| NP2 definiteness (NP2_Def) | Definite, indefinite |
| Adjunct semantics (AdjunctSem) | Adverbial, complement, without adjunct |
| Parallelism | Yes, no |
| Period | period_1 (10th–12th century), period_2 (13th–19th century), period_3 (20th–21st century) |
| Genre | Formal, informal |
| For fixed effects: numerical variables | |
| NP2 length (NP2_length) | |
| For random effects | |
| Verb | 1,569 levels |
| Corpus file | 1,059 levels |
3 Regression analysis
We applied mixed-effects binary logistic regression using the lme4 package (Bates et al. 2015) in R. In our mixed-effects model, the binary response variable is one of two alternants of the Mandarin disposal construction, which is coded as ba-cx, or jiang-cx. As regards independent variables, mixed-effects modeling takes into account not only fixed effects, which are repeatable, but also random effects, which are non-repeatable and group-specific (Speelman et al. 2018: 1).
To identify the best-fitting model, we adopted backward stepwise selection: we fitted a model with all predictors and possible interactions, and then conducted a stepwise selection (Levshina 2015: 266). Our maximal model included NP2 animacy, NP2 definiteness, NP2 length, adjunct semantics, verb type, parallelism, genre, and period as fixed effects as well as verb and corpus file as random effects. To explore the diachronic evolutionary dynamics, interactions between period and all the other fixed-effects explanatory variables were incorporated. In addition, interactions between genre and all the language-internal fixed-effects variables were included to explore whether the effects of these variables differ across genres. For the stepwise selection, we used the anova() function in R to compare the log-likelihoods, AIC (Akaike Information Criterion), and BIC (Bayesian Information Criterion) of two adjacent models, which are goodness-of-fit measures for comparison of models with different numbers of variables (Levshina 2015: 194).
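To make the comparison criteria concrete, the following Python sketch computes AIC and BIC from a model's log-likelihood using their standard definitions. It is not a reimplementation of R's anova(); the log-likelihoods and parameter counts in the usage example are hypothetical, and only the number of observations (4,812) comes from Table 1.

```python
import math


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike Information Criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """Bayesian Information Criterion: k ln n - 2 ln L (lower is better)."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


def prefer_reduced(ll_full, k_full, ll_reduced, k_reduced, n_obs) -> bool:
    """True if the reduced model is at least as good on both criteria."""
    return (aic(ll_reduced, k_reduced) <= aic(ll_full, k_full)
            and bic(ll_reduced, k_reduced, n_obs) <= bic(ll_full, k_full, n_obs))


if __name__ == "__main__":
    N_OBS = 4812  # total tokens in Table 1
    # Hypothetical log-likelihoods and parameter counts for two adjacent models.
    print(prefer_reduced(ll_full=-1500.0, k_full=30,
                         ll_reduced=-1500.8, k_reduced=28, n_obs=N_OBS))
```

One backward step then consists of dropping a term, refitting, and keeping the reduced model only if the comparison favors it.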
The fitted final model includes a random intercept for verb, and one for corpus file as random effects, three two-way interaction terms, NP2 animacy, log-transformed NP2 length, parallelism, and verb type as fixed effects. The final model is as follows:
variant ∼ (1 | verb) + (1 | corpus_file) + period * (AdjunctSem + genre) + genre * NP2_Animacy + log (NP2_length) + parallelism + verb_type
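Written out, the model formula above corresponds to a logistic regression with two random intercepts. The notation below is introduced here for illustration only: $\mathbf{x}_i$ collects the fixed-effect predictors and interaction terms for token $i$, $u$ is the random intercept for verb, and $v$ is the random intercept for corpus file.

```latex
\[
\log\frac{P(\mathrm{variant}_i = \textit{ba})}{P(\mathrm{variant}_i = \textit{jiang})}
  = \beta_0 + \mathbf{x}_i^{\top}\boldsymbol{\beta}
  + u_{\mathrm{verb}[i]} + v_{\mathrm{file}[i]},
\qquad
u_j \sim \mathcal{N}(0, \sigma_u^2),\quad
v_k \sim \mathcal{N}(0, \sigma_v^2)
\]
```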
Judged by the concordance index C and the Nagelkerke pseudo-R², two common goodness-of-fit statistics for logit models, the final model predicts the variant/outcome well (p < 0.001, pseudo-R² = 0.995, C = 0.965). According to Hosmer and Lemeshow (2000: 162), C ≥ 0.9 shows “outstanding discrimination”. Hence, the C value of the final model indicates that the model discriminates very well between the ba-construction and the jiang-construction. The Nagelkerke pseudo-R² ranges from 0 (no predictive power) to 1 (perfect prediction), so the value in our model (0.995) indicates good prediction. The classification accuracy of the model is 89.53 %, which is much higher than the baseline of 60.10 % obtained when the most frequent variant (the ba-construction) is always selected. To check for multicollinearity between explanatory factors, we computed the condition index k (11.41) and the VIFs for each factor (1.03–4.64), which are both below alarming thresholds (Baayen 2008: 182; Levshina 2015: 272). For model validation, our dataset was randomly divided 100 times into a training set (70 % of the data) and a test set (the remaining 30 %). The model was then fitted to each training set and its predictions on the corresponding test set were computed. The results show that the prediction accuracy of the 100 models ranges from 75.67 to 78.31 %, with a mean accuracy of 77.11 %, which suggests a good model fit.
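For readers unfamiliar with the concordance index, the Python sketch below shows one common way to compute it: the proportion of (ba, jiang) token pairs in which the ba token receives the higher predicted probability, with ties counted as one half. The toy outcomes and probabilities are invented for illustration and are unrelated to the study's data.

```python
def concordance_index(outcomes, probabilities):
    """Concordance index C for a binary outcome.

    `outcomes` holds 1 (ba) or 0 (jiang); `probabilities` holds the
    predicted probability of ba for the same tokens.
    """
    pos = [p for y, p in zip(outcomes, probabilities) if y == 1]
    neg = [p for y, p in zip(outcomes, probabilities) if y == 0]
    if not pos or not neg:
        raise ValueError("both outcome classes must be present")
    score = 0.0
    for p_pos in pos:
        for p_neg in neg:
            if p_pos > p_neg:
                score += 1.0      # concordant pair
            elif p_pos == p_neg:
                score += 0.5      # tied pair
    return score / (len(pos) * len(neg))


if __name__ == "__main__":
    toy_outcomes = [1, 1, 0, 0]              # invented data
    toy_probabilities = [0.9, 0.4, 0.5, 0.1]
    print(concordance_index(toy_outcomes, toy_probabilities))
```

A value of 0.5 corresponds to chance-level discrimination and 1 to perfect separation of the two variants.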
4 Results and discussion
In exploring the factors probabilistically constraining the choice between ba and jiang, we first calculated the importance of each factor in the model by measuring the decrease in goodness-of-fit when leaving the factor out of the model. This was done with the Anova() function in the car package in R. The result is visualized in Figure 1.[11] We can see that the language-external factor “period” is the most important factor in the model, followed, at a distance, by the language-internal variables and the interaction terms. The importance of the language-external factors (especially “period”) can thus be said to reflect the significant role of socio-cultural developments in the three periods examined, most markedly in period 3. More specifically, the high value of the factor “period” in Figure 1 arguably correlates with the fact that the Vernacular Language Movement in China boosted the use of the ba-construction in formal and informal texts alike in period 3.[12] This finding dovetails with the tenet of cognitive sociolinguistics that socio-cultural forces (co-)define language variation (e.g., Kristiansen and Geeraerts 2013).

Figure 1. Variable importance (decrease in model goodness-of-fit if factor removed) of all the factors found significant in predicting the ba and jiang variants.
Table 3 reports the fixed-effects structure of the final model. The column labeled “estimate” shows the estimates of coefficients on a logit-scale. Positive estimates indicate a preference for the ba-construction (the predicted outcome) while negative estimates suggest a preference for the jiang-construction.
Table 3. Output regression: fixed effects (with the jiang-construction as reference value for response variable).
| Predictor | Estimate | Std. error | Pr(>\|z\|) | Signif. |
|---|---|---|---|---|
| (Intercept) | −10.297 | 0.995 | <0.001 | *** |
| log (NP2_length) | −0.267 | 0.093 | 0.004 | ** |
| Parallelism yes (default: no) | 0.345 | 0.128 | 0.007 | ** |
| Verb type (default: Activity) | | | | |
| verb_typeExOcc | −0.731 | 0.230 | 0.002 | ** |
| verb_typeMentalAct | 0.592 | 0.207 | 0.004 | ** |
| verb_typeOther | 0.437 | 0.197 | 0.027 | * |
| Period (default: period_2) | | | | |
| periodperiod_1 | 2.236 | 2.490 | 0.369 | |
| periodperiod_3 | 19.489 | 1.216 | <0.001 | *** |
| NP2 animacy (default: concrete) | | | | |
| NP2_Animacyabstract | 0.536 | 0.220 | 0.015 | * |
| NP2_Animacyanimate | −0.296 | 0.194 | 0.126 | |
| Adjunct semantics (default: Adverbial) | | | | |
| AdjunctSemcomplement | −0.580 | 0.186 | 0.002 | ** |
| AdjunctSemwithoutAdjunct | 0.002 | 0.154 | 0.989 | |
| Genre informal (default: formal) | −0.013 | 0.311 | 0.966 | |
| Interactions | | | | |
| periodperiod_1:AdjunctSemcomplement | 1.091 | 0.317 | <0.001 | *** |
| periodperiod_3:AdjunctSemcomplement | 0.684 | 0.565 | 0.226 | |
| periodperiod_1:AdjunctSemwithoutAdjunct | 0.026 | 0.301 | 0.931 | |
| periodperiod_3:AdjunctSemwithoutAdjunct | −0.141 | 0.603 | 0.815 | |
| periodperiod_1:genreinformal | −17.563 | 4.274 | <0.001 | *** |
| periodperiod_3:genreinformal | 5.643 | 6.764 | 0.404 | |
| NP2_Animacyabstract:genreinformal | −0.907 | 0.268 | <0.001 | *** |
| NP2_Animacyanimate:genreinformal | 0.219 | 0.278 | 0.430 | |

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1.
The table shows that “intercept” is significant in predicting the variants. The intercept coefficient (−10.297) is the estimated log odds of the predicted outcome when all quantitative/numerical predictor variables are equal to zero and categorical variables are at their reference levels. Exponentiating this coefficient yields its simple odds value, which is smaller than 0.0001.[13] This means that the odds of the ba-construction versus the jiang-construction are nearly zero in period 2 and in formal contexts with activity verbs, concrete NP2s, adverbial adjuncts, no parallel disposal constructions in adjacent contexts, and with the length of NP2 being one character.[14] In other words, when all categorical variables are at their reference level and the only quantitative variable “log(NP2_length)” is zero, the probability of the jiang-construction is nearly 100 %; that is, language users choose the jiang-construction in all likelihood under such circumstances.
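The conversion from the intercept to odds and probabilities described above can be reproduced with a few lines of arithmetic. The sketch below is a Python illustration (not the authors' R code); it uses only the intercept reported in Table 3 and assumes that the random effects are set to zero, so it describes a population-level prediction for the reference configuration.

```python
import math

# Intercept of the final model as reported in Table 3 (logit scale).
INTERCEPT = -10.297


def logit_to_odds(logit):
    """Odds of the predicted outcome (ba) given a value on the logit scale."""
    return math.exp(logit)


def logit_to_probability(logit):
    """Probability of the predicted outcome (ba) given a logit value."""
    return 1.0 / (1.0 + math.exp(-logit))


odds_ba = logit_to_odds(INTERCEPT)
p_ba = logit_to_probability(INTERCEPT)
p_jiang = 1.0 - p_ba

print(f"odds of ba vs. jiang: {odds_ba:.2e}")
print(f"P(ba) = {p_ba:.6f}, P(jiang) = {p_jiang:.6f}")
```

Because the odds are far below 0.0001, the probability of ba is practically zero and that of jiang practically one for this particular combination of reference levels.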
We first look at the variables that show only main effects in the ba versus jiang alternation.
The variable “verb type” is only involved in main effects. Table 3 shows that all three levels can significantly predict the ba or jiang variant, compared with the reference level “Activity”. They differ in that the category/level “ExOcc” (verbs of existence and occurrence) (see (11) and (12)) significantly boosts the odds of the jiang-construction compared with the reference category “Activity” verbs, while verbs of mental activity (13) significantly increase the choice of the ba-construction. The verb type “Other” (14), which includes all the types of verbs apart from verbs of activity, existence and occurrence, and mental activity, also boosts the ba-construction, compared with the reference category “Activity” verbs.
| (11) 王林因车速过快和占道行驶, 将一少妇当场撞死后逃逸, ……. | ||||||||||
| Wáng | Lín | yīn | chē | sù | guò | kuài | hé | zhàn | dào | xíngshǐ, |
| Wang | Lin | because | car | speed | too | fast | and | occupy | road | drive, |
| jiāng | yī | shào | fù | dāngchǎng | zhuàng | sǐ | hòu | táoyì … | ||
| JIANG | one | young | woman | on.the.spot | collide | dead | after | abscond | ||
| ‘As Wang Lin drove too fast and occupied the road, he bumped into a young woman and killed her on the spot, and then he fled…’ | ||||||||||
| (12) 鲍迪用一只火把, 将火葬堆变成了一场熊熊大火。 | |||||||
| Bàodí | yòng | yī | zhī | huǒbǎ, | jiāng | huǒzàngduī | biàn |
| Bowdy | use | one | class | torch, | JIANG | burial.mound | change |
| chéng | le | yī | chǎng | xióngxióng | dà | huǒ。 | |
| into | perf | one | class | raging | big | fire. | |
| ‘Using a torch, Bowdy turned the burial mound into a raging fire.’ | |||||||
| (13) 田小辮子要見不能見, 真把他急得要死。 | |||||||
| Tiánxiǎobiànzi | yào | jiàn | bùnéng | jiàn, | zhēn | bǎ | tā |
| Tianxiaobianzi | want | see | cannot | see, | really | BA | he |
| jí | dé | yàosǐ. | |||||
| worry | de | extremely | |||||
| ‘He wanted to see Tian Xiaobianzi, but he couldn’t manage to, which made him extremely anxious.’ | |||||||
| (14) 父母尚且把他作珍宝般爱惜 | |||||||
| Fùmǔ | shàngqiě | bǎ | tā | zuò | zhēnbǎo | bān | àixī |
| Parent | even | BA | he | treat.as | treasure | like | cherish |
| ‘Parents even cherish him like a treasure.’ | |||||||
Figure 2 shows the proportion of the ba-construction relative to the jiang-construction by verb type across the three historical periods. We can observe that the share of the ba-construction steadily increases in all four verb types, signaling its overall expansion over time; in other words, the jiang-construction has increasingly lost ground to the ba-construction. Taking the three periods together, “Activity” verbs (the reference level in the regression analysis) occur in the ba-construction in 55.5 % of cases on average. Compared with this share, the share of the ba-construction is higher with “Mental activity” verbs (66.9 %) and “Other” verbs (63.1 %), and lower with “ExOcc” verbs (39 %). In the output of the regression model, these results correspond with “Mental activity” and “Other” verbs boosting the odds of the ba-construction compared to the reference level and the “ExOcc” verbs boosting the odds of the jiang-construction.

Figure 2. The proportion of the ba and jiang constructions by verb types across three historical periods.
It is noteworthy that in the latest period (period 3), 91.3 % of mental activity verbs choose the ba-construction; it is especially this class of verbs that has contributed to the expansion of the ba-construction and its taking over of the jiang-construction. Recall that ba was initially a lexical verb with the meaning of “holding”; when it was used in serial verb constructions, the second verb usually designated physical manipulation, as in (15).
| (15) 仰山便把茶树摇。 | |||||
| Yǎngshān | biàn | bǎ | chá | shù | yáo |
| Yangshan | then | BA | tea | tree | shake |
| ‘Yangshan then held the tea tree and shook it. / Yangshan then shook the tea tree.’ | |||||
In this example, the NP following ba is the object of both ba and the matrix verb 摇 yao ‘shake’. Specifically, in order to shake the tea tree, Yangshan (the subject) holds the tree first and then shakes it. It is from contexts of this type that the ba-construction has developed into a disposal construction (cf. Cao and Yu 2000; Cao and Long 2005); that is, the lexical meaning of ba has bleached and has become a disposal marker. Thus, in the initial stages of the development of the ba-construction, its matrix verb mainly denoted physical manipulation. After that, other types of verbs (most prominently among them, mental activity verbs) increasingly entered the matrix slot of the ba-construction, which can be seen as host-class expansion in Himmelmann’s (2004) terms.
On another note, the increase of mental activity verbs in the ba-construction in period 3 is consistent with the subjectivity/emotionality and metaphoricality of the ba-construction found in synchronic studies (e.g., Jing-Schmidt 2005; Shen 2002). That is, Jing-Schmidt (2005) and Shen (2002) argue that the ba-construction in modern Chinese is endowed with the notion of subjectivity.[15] Mental activity verbs such as ji ‘worry’ in (13) denote subjective feeling/emotion or mental activity of human beings. Hence, the finding that mental activity verbs have been used more frequently in the ba-construction chimes with the subjectivity/emotionality of the ba-construction found in previous studies.
The estimate of log(NP2_length) is −0.267, which means that the log odds of the ba-construction decrease by 0.267 with every one-unit increase in log(NP2_length). In other words, the longer the NP2 is, the less likely the ba-construction is to be chosen over the jiang-construction. Figure 3 illustrates the distribution of NP2s of different lengths in the two variants. As we mentioned in Section 2.3, to reduce data skewing and the effect of outliers, NP2 length is capped at 10 characters. Thus, as Figure 3 shows, NP2 length ranges from 1 to 10 characters for both variants. However, the density for the two variants, shown by the two lines, is different. Overall, in both variants, the density of NP2s with a length of one to six characters is much higher. Specifically, NP2s of one character are much more densely used in the ba-construction, while NP2s with a length of two to six characters are more densely used in the jiang-construction than in the ba-construction. For NP2s of seven to ten characters, the two density lines almost overlap, which suggests that there is no significant difference between the distributions of NP2s of such lengths in the two variants. Thus, overall, the longer the NP2 is, the less likely language users are to choose the ba-construction compared with the jiang-construction. This finding is not in line with assertions made in some previous studies (e.g., Lu 2006), where the object of the ba-construction is claimed to be longer than that of the jiang-construction. As noted in Section 1, these previous studies lack quantitative statistical analysis and therefore need further corroboration.
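Because the predictor is log-transformed, the coefficient is easier to read once it is translated back to character counts. The following Python sketch illustrates this under one explicit assumption that the text does not state: that the logarithm is the natural logarithm. Under that assumption, the odds of ba for an NP2 of a given length, relative to a one-character NP2, are simply the length raised to the power of the coefficient.

```python
import math

# Coefficient of log(NP2_length) from Table 3 (logit scale).
BETA_LOG_LENGTH = -0.267
MAX_LENGTH = 10  # NP2 length is capped at 10 characters in the study


def odds_multiplier(length, reference_length=1):
    """Factor by which the odds of ba change when NP2 has `length` characters
    instead of `reference_length`, assuming a natural-log transform."""
    if not 1 <= length <= MAX_LENGTH:
        raise ValueError("length outside the capped range 1-10")
    log_difference = math.log(length) - math.log(reference_length)
    return math.exp(BETA_LOG_LENGTH * log_difference)


multipliers = {n: odds_multiplier(n) for n in (1, 2, 5, 10)}
for n, factor in multipliers.items():
    print(f"NP2 length {n:>2}: odds of ba multiplied by {factor:.3f}")
```

On this reading, doubling the length of NP2 lowers the odds of ba by roughly 17 %, and a ten-character NP2 has a little over half the odds of a one-character NP2, all other predictors being equal.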

Figure 3. The distribution of NP2s with different length in the two disposal constructions.
The final variable which only demonstrates main effects in the alternation is “parallelism”. Table 3 shows that the level “yes”, viz., having parallel disposal constructions in the adjacent context, significantly predicts the use of the ba-construction compared with the reference level, i.e., without parallel disposal constructions in the adjacent contexts. It can be argued that parallelism reflects the effect of syntactic priming (also sometimes called structural priming, syntactic persistence, or plan reuse). Syntactic priming is the facilitation of processing that occurs when a syntactic structure is repeated across consecutive sentences (Bock 1986; Kempen 1977; Levelt 1989). This facilitation manifests itself both in language comprehension and in language production. During production, it has been documented that syntactic priming is reflected in an increased tendency to choose, for instance, the same grammatical voice (active vs. passive), the same relativizer (that vs. which), the same type of dative construction (double-object vs. prepositional dative), or the same type of syntactic structure (the ba-construction vs. SVO sentences) in consecutive sentences (e.g., Bock 1986; Bock and Loebell 1990; Fang and Liu 2021; Hinrichs et al. 2015). The present study echoes these previous studies in that language users increasingly choose the same type of syntactic structure, viz., the ba disposal construction, when there is already a disposal construction (a ba-construction or a jiang-construction) in the adjacent context.
Next, let us look at the interaction terms. Figure 4 visualizes the effect of the interaction term between “period” and “adjunct semantics”. From this figure, we can see that, within each period, the different levels of “adjunct semantics” do not make much difference in constraining the alternation, as the three points representing the three levels of the variable “AdjunctSem” are close to each other. Across periods, however, the difference is large: period 3 greatly boosts the use of the ba-construction, while in period 1 and period 2 the probability of the ba-construction is much lower. This is in accord with the finding in Figure 1 that “period” is the most important predictor of the alternation. Concretely, in period 1, the adjunct category “complement”, as in example (16), slightly boosts the odds of the ba-construction, compared to the reference level “adverbial”. In period 2, however, the category “complement” changes to significantly predict the jiang-construction, as can be seen from the main effect of the adjunct category “complement” in Table 3. In period 3, the three points representing the three levels almost overlap, so they show little difference in constraining the alternation; all three levels prefer the ba-construction compared to period 2 (the reference level of “period”).

Figure 4. Interaction plot: predicted log odds for the ba-construction variant by the factor “adjunct semantics” across periods.
| (16) 若堯當時把天下與丹朱, 舜把天下與商均, 則天下如何解安! | ||||||||
| Ruò | Yáo | dāngshí | bǎ | tiānxià | yǔ | Dānzhū, | Shùn | bǎ |
| If | Yao | then | BA | the.world | give | Danzhu, | Shun | BA |
| tiānxià | yǔ | Shāngjūn, | zé | tiānxià | rúhé | jiěān! | ||
| the.world | give | Shangjun, | then | the.world | how | be.at.peace | ||
| ‘If Yao had abdicated and handed over power to Danzhu, and Shun had abdicated and handed over power to Shangjun, how the world would be at peace!’ | ||||||||
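The period-specific effects of “complement” discussed above follow from adding the main effect and the relevant interaction coefficient in Table 3. The Python sketch below shows this arithmetic; the dictionary and function names are hypothetical, and the results are point estimates only, since testing a combined contrast would also require the covariance matrix of the estimates, which is not reported.

```python
# Main effects of adjunct semantics (reference level: Adverbial), Table 3.
ADJUNCT_MAIN = {"complement": -0.580, "withoutAdjunct": 0.002}

# Period-by-adjunct interaction coefficients (reference period: period_2).
PERIOD_BY_ADJUNCT = {
    ("period_1", "complement"): 1.091,
    ("period_3", "complement"): 0.684,
    ("period_1", "withoutAdjunct"): 0.026,
    ("period_3", "withoutAdjunct"): -0.141,
}


def adjunct_effect(period, level):
    """Log-odds difference between `level` and Adverbial within one period.
    For the reference period (period_2) only the main effect applies."""
    return ADJUNCT_MAIN[level] + PERIOD_BY_ADJUNCT.get((period, level), 0.0)


complement_by_period = {
    period: adjunct_effect(period, "complement")
    for period in ("period_1", "period_2", "period_3")
}
for period, effect in complement_by_period.items():
    direction = "ba" if effect > 0 else "jiang"
    print(f"{period}: complement vs. adverbial = {effect:+.3f} (towards {direction})")
```

The sign of the estimate changes from positive in period 1 to negative in period 2 and back to slightly positive in period 3, which is the fluctuation that makes adjunct semantics a fluid constraint.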
The interaction effect between “genre” and “period” is visualized in Figure 5. It shows that “informal genre” in period 1 can significantly predict the use of the jiang-construction compared with “formal genre” (the reference level), and the effect size of this prediction is very large, with the estimate being −17.563, as can be seen from Table 3. This may be ascribed to the frequency effect of the jiang-construction in this period. As we mentioned above, in the initial stage of the co-existence of the two disposal markers, jiang was still much more frequently used than ba. Figure 6 presents the distribution of the two disposal constructions across the two genres in period 1. The frequency count of the jiang-construction in the informal genre is nearly twice that of the ba-construction. A Chi-squared test shows that there exists a significant difference between the frequency counts of the two constructions in the two genres (χ² = 10.271, p = 0.0014).

Figure 5. Interaction plot: predicted log odds for the ba-construction variant by the factor “genre” across periods.

Figure 6. The distribution of the ba and jiang disposal constructions across the two genres in period 1.
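As a quick plausibility check on the Chi-squared result reported for period 1, the p-value can be recovered from the test statistic alone. The Python sketch below assumes one degree of freedom, which is what a 2 × 2 table (two constructions by two genres) implies; the underlying frequency counts are not reproduced here, so only the reported statistic is used.

```python
import math

# Test statistic reported in the text for the period 1 genre comparison.
CHI_SQUARED = 10.271


def chi2_p_value_df1(statistic):
    """Upper-tail p-value of a Chi-squared statistic with 1 degree of freedom.
    With df = 1 the statistic is a squared standard normal deviate, so the
    tail probability equals erfc(sqrt(statistic / 2))."""
    if statistic < 0:
        raise ValueError("a Chi-squared statistic cannot be negative")
    return math.erfc(math.sqrt(statistic / 2.0))


p_value = chi2_p_value_df1(CHI_SQUARED)
print(f"p = {p_value:.5f}")
```

Under the one-degree-of-freedom assumption the result is close to the reported p = 0.0014.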
When it comes to period 2, we can see from the plot that the two points representing the two levels of genre almost overlap, which indicates that they do not condition the alternation in a different way. Note as well that both genres prefer the jiang-construction as their log odds are negative. When it comes to the modern period, i.e., period 3, both genres now boost the odds of the ba-construction, and “informal genre” shows a stronger preference for the ba-construction compared to “formal genre”. Thus, we can see a clear change in the preference of informal texts over time, that is, from a preference for the jiang-construction in period 1 to a preference for the ba-construction in period 3. Hence, overall, in period 3, the ba-construction is preferred to its counterpart, the jiang-construction, both in informal and formal texts, thus signaling the expansion of the use of the ba-construction over time. This observation echoes Xing’s (1994) finding that the jiang disposal marker has largely been replaced by ba as a disposal marker in modern written texts.
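The change in the genre effect over time can be traced directly from the coefficients in Table 3. The Python sketch below adds up the intercept, the period effect, the genre effect and their interaction for each period-by-genre cell, keeping all other predictors at their reference levels and the random effects at zero. The names are hypothetical, and the values are point estimates from coefficients that in part have large standard errors, so they indicate direction rather than precise magnitudes.

```python
# Fixed-effect estimates from Table 3 (logit scale; reference: period_2, formal).
INTERCEPT = -10.297
PERIOD = {"period_1": 2.236, "period_2": 0.0, "period_3": 19.489}
GENRE_INFORMAL = -0.013
PERIOD_BY_INFORMAL = {"period_1": -17.563, "period_2": 0.0, "period_3": 5.643}


def genre_contrast(period):
    """Log odds of ba in informal minus formal texts within one period."""
    return GENRE_INFORMAL + PERIOD_BY_INFORMAL[period]


def cell_log_odds(period, informal):
    """Predicted log odds of ba for a period-by-genre cell, with all other
    predictors at their reference levels and random effects set to zero."""
    log_odds = INTERCEPT + PERIOD[period]
    if informal:
        log_odds += genre_contrast(period)
    return log_odds


for period in ("period_1", "period_2", "period_3"):
    formal = cell_log_odds(period, informal=False)
    informal = cell_log_odds(period, informal=True)
    print(f"{period}: formal {formal:+.3f}, informal {informal:+.3f}, "
          f"contrast {genre_contrast(period):+.3f}")
```

The contrast is strongly negative in period 1, close to zero in period 2 and positive in period 3, while both cells of period 3 have positive log odds, in line with the convergence of the two genres on the ba-construction.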
We submit that the growth of the ba-construction can be attributed to two general trends (one language-internal and the other language-external). First, ba’s expansion can be seen as resulting from the emergence of a new grammaticalized layer (Hopper 1991) of the disposal construction, in addition to the older jiang-construction (Xing 1994). Jiang became a grammaticalized disposal marker much earlier than ba (e.g., Lü 1955; Wang 1957). Ba did not replace jiang immediately when it developed into a disposal marker, and the two have co-existed until modern times. Nonetheless, the newcomer has been encroaching on the territory of the old disposal marker jiang. Diao (1993) notes that in the initial stage of the co-existence of the two disposal markers, the use of jiang was still much more frequent, while from the Yuan dynasty onwards, ba began to overtake jiang. Second, the overall preference for the ba-construction in modern times can be related to the Vernacular Language Movement in China, which began in the late 1910s. The goal of the movement was to replace Literary Chinese (文言 wenyan) with spoken language (白话 baihua) in nearly all written works. As a result, by the early 1920s, the vernacular had gained nationwide acceptance in written Chinese as a reputable style in prose, poetry, and fiction and was the officially designated style for school textbooks. As noted by previous studies (cf. Jing-Schmidt and Tao 2009; Wu 2013), while the jiang-construction has a strong color of Literary Chinese, viz., an antique flavor, the ba-construction has a strong color of colloquialism. Thus, the popularity of the vernacular from the 1920s onwards has substantially contributed to jiang losing ground to ba.
With respect to the interaction between “NP2 animacy” and “genre”, we can see from Figure 7 that compared to “formal genre” (the reference level of “genre”), there is little change in informal genre in the position of “animate” relative to that of “concrete” (the reference level of “NP2_Animacy”). What does change substantially is the position of “abstract”. Compared to concrete entities, abstract NP2s (17) in formal genre boost the odds of the ba-construction while in informal genre, they boost the jiang-construction.

Figure 7. Interaction plot: predicted log odds for the ba-construction variant by the factor “NP2 animacy” across genres.
| (17) 我们会将您的建议反馈到产品部门。 | |||||||
| Wǒmen | huì | jiāng | nínde | jiànyì | fǎnkuì | dào | chǎnpǐn |
| We | will | JIANG | your | suggestion | feed.back | arrive | product |
| bùmén | |||||||
| department | |||||||
As regards random effects, it is noticeable from Table 4 that the by-subject adjustment “corpus_file” (N = 1,059, Variance: 511.08, Std Dev: 22.607) accounts for more variability than the by-item adjustment “verb” (N = 1,569, Variance: 1.05, Std Dev: 1.024). It should be noted that in the CCL corpus, the name of the corpus file of a particular concordance is actually the title of the book the concordance has been taken from. Hence, the number of corpus files equals that of the books. The number of books to a large extent approximates the number of authors as the majority of the books investigated in the present study were written by different authors. It has been documented that author differences play a role in language change not only in Indo-European languages such as English (e.g., De Smet 2020; Petré and Anthonissen 2020), but also in Mandarin Chinese (Li et al. 2023). Thus, it is not surprising to observe the random effects of “corpus_file” in the present study. Previous studies have also found that verb differences play a part in the diachronic evolution of Chinese syntactic variation (cf. Li et al. 2023). This effect is echoed in the present study.
Table 4. Output regression: random effects.
| Groups | Number of groups | Variance | Standard deviation |
|---|---|---|---|
| verb | 1,569 | 1.05 | 1.024 |
| corpus_file | 1,059 | 511.08 | 22.607 |

Number of observations: 4,812.
5 Conclusion
While the disposal construction in Mandarin has been studied intensively, the evolution of the variation between the ba and jiang constructions has rarely been investigated in a multifactorially controlled research design. In this paper, we gauged the differential impact and interplay of a substantial set of language-internal and language-external factors. The results indicate two general trends. First of all, as the new layer of the grammaticalization of the Mandarin disposal construction, the ba-construction increased its odds over time at the expense of its counterpart, the jiang-construction, with a strong boost in period 3. Second, over time, the effect size of the significant preference for the jiang-construction in informal genres decreased from period 1 to period 2, and this preference disappeared in modern times (period 3), with both informal and formal genres converging to favor the use of the ba-construction. Ba’s dominance not only results from its emergence as a new grammaticalized layer of the disposal construction, it also needs to be explained from a language-external perspective, and specifically as the impact of the Vernacular Language Movement in China. The variable importance scale (Figure 1), with “period” as the most important variable, clearly reflects the impact of this movement in period 3.
In addition to the general trends, regression modeling quantifies the evolutionary dynamics of the alternation by revealing that there are both stable linguistic constraints (i.e., parallelism/syntactic priming, verb type, NP2 animacy, and NP2 length) and fluid constraints (i.e., adjunct semantics, and genre). Our findings concerning the stable linguistic constraints are the following: the factor “parallelism” (i.e., syntactic priming) predicts the use of the ba-construction; verbs of existence and occurrence prefer the jiang-construction, while mental activity verbs and “Other” verbs prefer the ba-construction; abstract NP2s in the informal genre significantly boost the use of the jiang-construction whereas in the formal genre, they significantly boost the odds of the ba-construction; long NP2s/objects significantly boost the jiang-construction. With regard to fluid constraints, our regression analysis shows that there are fluctuations in both their effect sizes and effect directions as a function of real time. These probabilistic fluctuations showcase the potential of multifactorial quantitative methods with a variationist/probabilistic twist, for these methods contribute to unveiling subtle grammar changes that went undetected in previous non-multifactorial approaches to syntactic alternation.
This empirical study has at least the following three theoretical contributions. First, while previous studies have mainly focused on syntactic alternations involving word order change (e.g., the ba-construction vs. SVO sentences in Chinese, and the theme-recipient alternation in Chinese; see Fang and Liu 2021; Li et al. 2023), the present study has been concerned with a new type of alternation that does not involve word order variation: the alternation between two grammatical markers, viz., object/disposal markers ba and jiang. In doing so, this study has not only advanced our knowledge of the two disposal constructions and their evolution, but has also shed new light on diachronic syntactic alternation in general, and on grammatical marker alternation in Mandarin Chinese in particular. Second, the results of our study indicate that the ba-construction has been winning out over time in its competition with the jiang-construction. Thus, it would be natural for the latter to be eventually preempted by the former in ordinary discourse.[16] In recent literature, statistical preemption has given rise to an interesting discussion, namely whether it is compatible with a “sameness of meaning” view (the point of departure of the present study; see Section 1) or with the “No Synonymy view”. On the one hand, one could argue that statistical preemption presupposes sameness of meaning (speakers end up consistently selecting one alternative rather than the other because the two alternatives are similar in meaning). On the other hand, advocates of the No Synonymy view hold that it is precisely “no synonymy” that preemption is compatible with: as Leclercq and Morin (2023: 4) point out: “Indeed, while the principle of no synonymy posits that no two constructions have the exact same function, statistical preemption ensures that this be the case by blocking the use of an alternative (or new) form when a function is already associated with a specific construction”. 
Third, this study makes a methodological contribution to the empirical testing of hypotheses. It is the first one, to our knowledge, to take a corpus-based long-term perspective on the variation, from the earliest period in which the ba and jiang constructions coexisted to the present, and to gauge the effects under multifactorial control, including an elaborate random structure. It thus contributes to a data-driven corpus analysis of complex historical linguistic issues.
Still, the results have to be approached with due caution. Because of the lack of true diachronic oral data for the periods prior to modern Chinese, we had to classify drama, family letters, and the scripts of story-telling into the informal genre as proxy. Moreover, we have ignored regional differences and other sociolinguistic factors, as this kind of metadata is absent for historical texts.
Funding source: Philosophy and Social Science Program of Zhejiang Province, China
Award Identifier / Grant number: No. 22NDJC196YB
Acknowledgments
Many thanks go to Hendrik De Smet, Benedikt Szmrecsanyi, and Thomas Van Hoey for useful comments and suggestions on data extraction, and to Yi Li for his help with plotting the results of regression analysis. We are also indebted to three anonymous reviewers and the editors for their constructive feedback and suggestions. An earlier version of this paper was presented at the QLVL colloquium at KU Leuven (June 2023); we are grateful to the audience for their valuable feedback. As ever, all remaining errors are our responsibility.
Research funding: Work on this study was supported by Philosophy and Social Science Program of Zhejiang Province, China (No. 22NDJC196YB) granted to the first author.
Abbreviations
- BA: ba (a disposal marker in Chinese)
- JIANG: jiang (a disposal marker in Chinese)
- gen: genitive
- perf: perfective marker
- loc: locative phrase
- de: marker of resultative adjunct
- class: classifier
- prog: progressive marker
References
Aissen, Judith. 2003. Differential object marking: Iconicity versus economy. Natural Language & Linguistic Theory 21. 435–483. https://doi.org/10.1023/a:1024109008573.10.1023/A:1024109008573Suche in Google Scholar
Baayen, Harald. 2008. Analyzing linguistic data: A practical introduction to statistics using R. Cambridge: Cambridge University Press.10.1017/CBO9780511801686Suche in Google Scholar
Bates, Douglas, Martin Mächler, Benjamin M. Bolker & Steven C. Walker. 2015. Fitting linear mixed effect models using lme4. Journal of Statistical Software 67(1). 1–48. https://doi.org/10.18637/jss.v067.i01.Suche in Google Scholar
Bernaisch, Tobias, Th. Gries Stefan & Joybrato Mukherjee. 2014. The dative alternation in South Asian English(es): Modelling predictors and predicting prototypes. English World-Wide 35(1). 7–31. https://doi.org/10.1075/eww.35.1.02ber.Suche in Google Scholar
Biber, Douglas, Stig Johansson, Geoffrey Leech, Susan Conrad & Edward Finegan. 1999. Longman grammar of spoken and written English. Harlow: Pearson.Suche in Google Scholar
Bock, Kathryn. 1986. Syntactic persistence in language production. Cognitive Psychology 18(3). 355–387. https://doi.org/10.1016/0010-0285(86)90004-6.Suche in Google Scholar
Bock, Kathryn & Helga Loebell. 1990. Framing sentences. Cognition 35. 1–39. https://doi.org/10.1016/0010-0277(90)90035-i.Suche in Google Scholar
Bolinger, Dwight. 1977. Meaning and form. New York: Longman.Suche in Google Scholar
Bossong, Georg. 1985. Empirische Universalienforschung: Differentielle Objektmarkierung in den neuiranischen Sprachen. Tübingen: Narr.Suche in Google Scholar
Boyd, Jeremy & Adele Goldberg. 2011. Learning what not to say: The role of statistical pre-emption and categorization in “a”-adjective production. Language 81. 1–29. https://doi.org/10.1353/lan.2011.0012.Suche in Google Scholar
Bresnan, Joan. 2007. Is syntactic knowledge probabilistic? Experiments with the English dative alternation. In Sam Featherston & Wolfgang Sternfeld (eds.), Roots: Linguistics in search of its evidential base, 77–96. Berlin: Mouton de Gruyter.10.1515/9783110198621.75Suche in Google Scholar
Cao, Guangshun & Guofu Long. 2005. 再谈中古汉语处置式 [The disposal construction in Middle Chinese revisited]. Zhongguo Yuwen [Studies of the Chinese Language] 307(4). 320–332.Suche in Google Scholar
Cao, Guangshun & Hsiao-jung Yu. 2000. 中古译经中的处置式 [The disposal construction translated from Middle Chinese Buddhist sutras]. Zhongguo Yuwen [Studies of the Chinese Language] 279(6). 555–563.Suche in Google Scholar
Cappelle, Bert. 2009. Contextual cues for particle placement: Multiplicity, motivation, modeling. In Alexander Bergs & Gabriele Diewald (eds.), Context in construction grammar, 145–192. Amsterdam: John Benjamins.10.1075/cal.9.07capSuche in Google Scholar
Chao, Yuen-ren. 1968. A grammar of spoken Chinese. Berkeley: University of California Press.Suche in Google Scholar
Chappell, Hilary. 2013. Pan-Sinitic object markers: Morphology and syntax. In Guangshun Cao, Hilary Chappell, Redouane Djamouri & Thekla Wiebusch (eds.), Breaking down the barriers: Interdisciplinary studies in Chinese linguistics and beyond, 785–816. Taipei: Academia Sinica.Suche in Google Scholar
Chappell, Hilary & Alain Peyraube. 2011. Grammaticalization in Sinitic languages. In Bernd Heine & Heiko Narrog (eds.), The Oxford handbook of grammaticalization, 783–795. Oxford: Oxford University Press.10.1093/oxfordhb/9780199586783.013.0065Suche in Google Scholar
Chen, Ping. 2004. Identifiability and definiteness in Chinese. Linguistics 42(6). 1129–1184. https://doi.org/10.1515/ling.2004.42.6.1129.Suche in Google Scholar
Cheshire, Jenny. 1987. Syntactic variation, the linguistic variable, and sociolinguistic theory. Linguistics 25(2). 257–282. https://doi.org/10.1515/ling.1987.25.2.257.Suche in Google Scholar
Comrie, Bernard. 1989. Language universals and linguistic typology, 2nd edn. Chicago: University of Chicago Press.Suche in Google Scholar
De Smet, Hendrik. 2019. The motivated unmotivated: Variation, function and context. In Kristin Bech & Ruth Möhlig-Falke (eds.), Grammar – discourse – context: Grammar and usage in language variation and change, 305–332. Berlin: De Gruyter.10.1515/9783110682564-011Suche in Google Scholar
De Smet, Hendrik. 2020. What predicts productivity? Theory meets individuals. Cognitive Linguistics 31(2). 251–278. https://doi.org/10.1515/cog-2019-0026.Suche in Google Scholar
de Swart, Peter. 2007. Cross-linguistic variation in object marking. Nijmegen: Radboud University PhD dissertation.Suche in Google Scholar
de Swart, Peter & Helen de Hoop. 2018. Shifting animacy. Theoretical Linguistics 44(1–2). 1–23. https://doi.org/10.1515/tl-2018-0001.Suche in Google Scholar
Diao, Yanbin. 1993. 近代汉语把字句与将字句的区别 [The difference between the ba-sentence and the jiang-sentence in early modern Chinese]. Journal of Liaoning Normal University 10(1). 50–52.
Epps, Patience. 2008. Hup’s typological treasures. Linguistic Typology 12. 169–193. https://doi.org/10.1515/lity.2008.036.
Fang, Yu & Haitao Liu. 2021. Predicting syntactic choice in Mandarin Chinese: A corpus-based analysis of ba sentences and SVO sentences. Cognitive Linguistics 32(2). 219–250. https://doi.org/10.1515/cog-2020-0005.
Goldberg, Adele E. 1995. Constructions: A construction grammar approach to argument structure. Chicago: The University of Chicago Press.
Goldberg, Adele E. 2002. Surface generalizations: An alternative to alternations. Cognitive Linguistics 13(4). 327–356. https://doi.org/10.1515/cogl.2002.022.
Himmelmann, Nikolaus P. 2004. Lexicalization and grammaticalization: Opposite or orthogonal? In Walter Bisang, Nikolaus Himmelmann & Björn Wiemer (eds.), What makes grammaticalization: A look from its fringes and its components, 19–40. Berlin/New York: Mouton de Gruyter. https://doi.org/10.1515/9783110197440.1.21.
Hinrichs, Lars, Benedikt Szmrecsanyi & Axel Bohmann. 2015. Which-hunting and the standard English relative clause. Language 91(4). 806–836. https://doi.org/10.1353/lan.2015.0062.
Hopper, Paul J. 1991. On some principles of grammaticalization. In Elizabeth C. Traugott & Bernd Heine (eds.), Approaches to grammaticalization, vol. I, 17–35. Amsterdam: John Benjamins.
Hopper, Paul J. & Elizabeth Closs Traugott. 2003. Grammaticalization. Cambridge: Cambridge University Press. https://doi.org/10.1017/CBO9781139165525.
Hosmer, David W. & Stanley Lemeshow. 2000. Applied logistic regression. New York: Wiley. https://doi.org/10.1002/0471722146.
Huang, Cheng-Teh James. 1982. Logical relations in Chinese and the theory of grammar. Cambridge, MA: Massachusetts Institute of Technology PhD dissertation.
Huang, Cheng-Teh James. 1984. Phrase structure, lexical integrity, and Chinese compounds. Journal of the Chinese Language Teachers Association 19(2). 53–78.
Jing-Schmidt, Zhuo. 2005. Dramatized discourse: The Mandarin Chinese ba-construction. Amsterdam/Philadelphia: John Benjamins. https://doi.org/10.1075/sfsl.56.
Jing-Schmidt, Zhuo & Hongyin Tao. 2009. The Mandarin disposal constructions: Usage and development. Language and Linguistics 10(1). 29–58.
Kapatsinski, Vsevolod. 2009. Adversative conjunction choice in Russian (no, da, odnako): Semantic and syntactic influences on lexical selection. Language Variation and Change 21(2). 157–173. https://doi.org/10.1017/s0954394509990068.
Kempen, Gerard. 1977. Conceptualizing and formulating in sentence production. In Sheldon Rosenberg (ed.), Sentence production: Developments in research and theory, 259–274. Hillsdale, NJ: Erlbaum.
Kristiansen, Gitte & Dirk Geeraerts. 2013. Contexts and usage in cognitive sociolinguistics. Journal of Pragmatics 52. 1–4. https://doi.org/10.1016/j.pragma.2012.12.017.
Lavandera, Beatriz. 1978. Where does the sociolinguistic variable stop? Language in Society 7. 171–182. https://doi.org/10.1017/s0047404500005510.
Leclercq, Benoît & Cameron Morin. 2023. No equivalence: A new principle of no synonymy. Constructions 15. 1–16.
Levelt, Willem J. M. 1989. Speaking: From intention to articulation. Cambridge, MA: MIT Press. https://doi.org/10.7551/mitpress/6393.001.0001.
Levshina, Natalia. 2015. How to do linguistics with R: Data exploration and statistical analysis. Amsterdam: John Benjamins. https://doi.org/10.1075/z.195.
Levshina, Natalia, Dirk Geeraerts & Dirk Speelman. 2013. Towards a 3D-grammar: Interaction of linguistic and extralinguistic factors in the use of Dutch causative constructions. Journal of Pragmatics 52. 34–48. https://doi.org/10.1016/j.pragma.2012.12.013.
Li, Charles N. & Sandra A. Thompson. 1976. Development of the causative in Mandarin Chinese: Interaction of diachronic processes in syntax. In Masayoshi Shibatani (ed.), The grammar of causative constructions (Syntax and Semantics 6), 477–492. New York: Academic Press. https://doi.org/10.1163/9789004368842_020.
Li, Charles N. & Sandra A. Thompson. 1981. Mandarin Chinese: A functional reference grammar. Berkeley: University of California Press. https://doi.org/10.1525/9780520352858.
Li, Yi, Benedikt Szmrecsanyi & Weiwei Zhang. 2023. The theme-recipient alternation in Chinese: Tracking syntactic variation across seven centuries. Corpus Linguistics and Linguistic Theory 19(2). 207–235. https://doi.org/10.1515/cllt-2021-0048.
Light, Timothy. 1979. Word order and word order change in Mandarin Chinese. Journal of Chinese Linguistics 7. 149–180.
Liu, Feng-hsi. 2007. Word order variation and ba sentences in Chinese. Studies in Language 31(3). 649–682. https://doi.org/10.1075/sl.31.3.05liu.
Lü, Shuxiang. 1955. 汉语语法论文集 [Papers on Chinese grammar]. Beijing: Kexue Chubanshe [The Science Publishing House].
Lu, Huihui. 2006. 古代白话小说 “把/将” 字句语体适应性 [The special “ba/jiang sentences” and their style adaptability in ancient vernacular Chinese novels]. Xibei Daxue Xuebao [Journal of Northwest University] (Philosophy and Social Sciences Edition) 3. 170–175.
Petré, Peter & Lynn Anthonissen. 2020. Individuality in complex systems: A constructionist approach. Cognitive Linguistics 31(2). 185–212. https://doi.org/10.1515/cog-2019-0033.
Peyraube, Alain & Thekla Wiebusch. 2020. New insights on the historical evolution of differential object marking (DOM) in Chinese. In Janet Zhiqun Xing (ed.), A typological approach to grammaticalization and lexicalization, 101–130. Berlin: De Gruyter Mouton. https://doi.org/10.1515/9783110641288-005.
Röthlisberger, Melanie & Sali A. Tagliamonte. 2020. The social embedding of a syntactic alternation: Variable particle placement in Ontario English. Language Variation and Change 32. 317–348. https://doi.org/10.1017/s0954394520000174.
Sankoff, David. 1988. Sociolinguistics and syntactic variation. In Frederick J. Newmeyer (ed.), Linguistics: The Cambridge survey, vol. IV, 140–161. Cambridge: Cambridge University Press. https://doi.org/10.1017/CBO9780511620577.009.
Shain, Cory & Judith Tonhauser. 2010. The synchrony and diachrony of differential object marking in Paraguayan Guaraní. Language Variation and Change 22. 321–346. https://doi.org/10.1017/s0954394510000153.
Shen, Jiaxuan. 2002. 如何处置 “处置式”? – 论把字句的主观性 [Can the disposal construction be disposed of? – On the subjectivity of the ba-construction in Mandarin Chinese]. Zhongguo Yuwen [Studies of the Chinese Language] 290(5). 387–399.
Speelman, Dirk, Kris Heylen & Dirk Geeraerts. 2018. Introduction. In Dirk Speelman, Kris Heylen & Dirk Geeraerts (eds.), Mixed-effects regression models in linguistics, 1–10. Cham: Springer International Publishing AG. https://doi.org/10.1007/978-3-319-69830-4_1.
Sun, Chaofen. 1996. Word-order change and grammaticalization in the history of Chinese. Stanford: Stanford University Press.
Sun, Chaofen. 2015. The grammaticalization of the BA construction: Cause and effect in a case of specialization. In William S.-Y. Wang & Chaofen Sun (eds.), The Oxford handbook of Chinese linguistics. Oxford/New York: Oxford University Press.
Sun, Chaofen & Talmy Givon. 1985. On the so-called SOV word order in Mandarin Chinese: A quantified text study and its implications. Language 61. 329–351. https://doi.org/10.2307/414148.
Theijssen, Daphne, Louis Ten Bosch, Lou Boves, Bert Cranen & Hans van Halteren. 2013. Choosing alternatives: Using Bayesian networks and memory based learning to study the dative alternation. Corpus Linguistics and Linguistic Theory 9(2). 227–262. https://doi.org/10.1515/cllt-2013-0007.
Traugott, Elizabeth Closs. 2003. From subjectification to intersubjectification. In Raymond Hickey (ed.), Motives for language change, 124–142. Cambridge: Cambridge University Press. https://doi.org/10.1017/CBO9780511486937.009.
von Heusinger, Klaus & Georg A. Kaiser. 2003. Animacy, specificity, and definiteness in Spanish. In Klaus von Heusinger & Georg A. Kaiser (eds.), Proceedings of the workshop “semantic and syntactic aspects of specificity in Romance languages”, 41–65. Konstanz: Universität Konstanz.
Wang, Li. 1944. 中国语法理论 [The theory of Chinese grammar]. Beijing: Zhonghua Shuju [Zhonghua Book Company].
Wang, Li. 1947. 中国现代语法 [A grammar of modern Chinese]. Beijing: Zhonghua Shuju [Zhonghua Book Company].
Wang, Li. 1957. 汉语语法纲要 [Chinese grammar]. Shanghai: Xinzhishi Chubanshe.
Wang, Li. 2004 [1980]. 汉语史稿 [A draft history of the Chinese language]. Beijing: Zhonghua Book Company.
Wu, Fuxiang. 2003. 再论处置式的来源 [Further discussion on the origin of the disposal construction]. Yuyan Yanjiu [Language Research] 23(3). 1–14.
Wu, Liang. 2013. 把字句与将字句差异的多角度考察与分析 [Multi-angle investigation and analysis on differences between jiang-construction and ba-construction]. Nanjing Hangkong Hangtian Daxue Xuebao (shehui kexue ban) [Journal of Nanjing University of Aeronautics and Astronautics (Social Sciences Edition)] 15(3). 69–74.
Xing, Janet Zhiqun. 1994. Diachronic change of object markers in Mandarin Chinese. Language Variation and Change 6. 201–222. https://doi.org/10.1017/s0954394500001642.
Xun, Endong, Gaoqi Rao, Xiaoyue Xiao & Jiaojiao Zang. 2016. 大数据背景下 BCC 语料库的研制 [The construction of the BCC Corpus in the age of Big Data]. Yuliaoku Yuyanxue [Corpus Linguistics] 3(1). 93–118.
Zaenen, Annie, Jean Carletta, Gregory Garretson, Joan Bresnan, Andrew Koontz-Garboden, Tatiana Nikitina, Catherine O’Connor & Tom Wasow. 2004. Animacy encoding in English: Why and how. In Donna Byron & Bonnie Webber (eds.), Proceedings of the 2004 ACL workshop on discourse annotation, Barcelona, July 2004, 118–125. East Stroudsburg, PA: Association for Computational Linguistics. https://doi.org/10.3115/1608938.1608954.
Zhan, Weidong, Rui Guo & Yirong Chen. 2003. The CCL corpus of Chinese texts: 700 million Chinese characters, the 11th century BC–present. Available online at the website of Center for Chinese Linguistics (abbreviated as CCL) of Peking University: http://ccl.pku.edu.cn:8080/ccl_corpus.
Zhu, Yubin. 2016. 近代汉语 “把/将” 字句的竞争及成因 [On “ba/jiang” sentences in modern Chinese: competition and causes]. Yantai Daxue Xuebao (zhexue shehui kexue ban) [Journal of Yantai University (Philosophy and social science edition)] 29(6). 112–119.
© 2023 the author(s), published by De Gruyter, Berlin/Boston
This work is licensed under the Creative Commons Attribution 4.0 International License.