Abstract
This paper quantifies distances between diachronic, regional, and situational varieties of Mandarin as a function of how similarly they choose between syntactic variants in the Mandarin theme-recipient alternation (the yǔ/gěi dative alternation). We use a novel corpus-based method, Variation-Based Distance and Similarity Modeling (VADIS), which draws inspiration from work in comparative sociolinguistics and quantitative dialectometry. The analysis reveals that, while the probabilistic grammar is relatively stable across the investigated varieties, historical varieties exhibit a higher degree of heterogeneity than synchronic varieties. Despite the overall high similarity among the synchronic varieties, we identify substantial probabilistic differences between fictional writings of Modern Mainland Mandarin and all other synchronic varieties. Our findings thus support the hypothesis that the transition from Early Mandarin to Modern Mandarin over the past two centuries involved salient grammatical shifts, and they empirically demonstrate the interaction between genre variability and regional variability in Modern Mandarin.
Acknowledgments
We would like to thank Professor Jason Grafmiller for his help with the VADIS modeling, Dr. Lei Ye for his help with the Python scripts for data extraction, and two anonymous reviewers and the editors for their constructive feedback and suggestions. The usual disclaimers apply.
Research funding: Work on this paper was supported by a China Scholarship Council grant to the first author (grant no. CSC201906900122).
Supplementary Materials
Detailed descriptions of the corpora and the linguistic constraints, R code, and fuller exemplification of tests and results are available at https://osf.io/u496a/.
© 2024 Walter de Gruyter GmbH, Berlin/Boston
Articles in the same Issue
- Frontmatter
- Editorial
- Editorial 2024
- Phonetics & Phonology
- The role of recoverability in the implementation of non-phonemic glottalization in Hawaiian
- Epenthetic vowel quality crosslinguistically, with focus on Modern Hebrew
- Japanese speakers can infer specific sub-lexicons using phonotactic cues
- Articulatory phonetics in the market: combining public engagement with ultrasound data collection
- Investigating the acoustic fidelity of vowels across remote recording methods
- The role of coarticulatory tonal information in Cantonese spoken word recognition: an eye-tracking study
- Tracking phonological regularities: exploring the influence of learning mode and regularity locus in adult phonological learning
- Morphology & Syntax
- #AreHashtagsWords? Structure, position, and syntactic integration of hashtags in (English) tweets
- The meaning of morphomes: distributional semantics of Spanish stem alternations
- A refinement of the analysis of the resultative V-de construction in Mandarin Chinese
- L2 cognitive construal and morphosyntactic acquisition of pseudo-passive constructions
- Semantics & Pragmatics
- “All women are like that”: an overview of linguistic deindividualization and dehumanization of women in the incelosphere
- Counterfactual language, emotion, and perspective: a sentence completion study during the COVID-19 pandemic
- Constructing elderly patients’ agency through conversational storytelling
- Language Documentation & Typology
- Conative animal calls in Macha Oromo: function and form
- The syntax of African American English borrowings in the Louisiana Creole tense-mood-aspect system
- Syntactic pausing? Re-examining the associations
- Bibliographic bias and information-density sampling
- Historical & Comparative Linguistics
- Revisiting the hypothesis of ideophones as windows to language evolution
- Verifying the morpho-semantics of aspect via typological homogeneity
- Psycholinguistics & Neurolinguistics
- Sign recognition: the effect of parameters and features in sign mispronunciations
- Influence of translation on perceived metaphor features: quality, aptness, metaphoricity, and familiarity
- Effects of grammatical gender on gender inferences: Evidence from French hybrid nouns
- Processing reflexives in adjunct control: an exploration of attraction effects
- Language Acquisition & Language Learning
- How do L1 glosses affect EFL learners’ reading comprehension performance? An eye-tracking study
- Modeling L2 motivation change and its predictive effects on learning behaviors in the extramural digital context: a quantitative investigation in China
- Ongoing exposure to an ambient language continues to build implicit knowledge across the lifespan
- On the relationship between complexity of primary occupation and L2 varietal behavior in adult migrants in Austria
- The acquisition of speaking fundamental frequency (F0) features in Cantonese and English by simultaneous bilingual children
- Sociolinguistics & Anthropological Linguistics
- A computational approach to detecting the envelope of variation
- Attitudes toward code-switching among bilingual Jordanians: a comparative study
- “Let’s ride this out together”: unpacking multilingual top-down and bottom-up pandemic communication evidenced in Singapore’s coronavirus-related linguistic and semiotic landscape
- Across time, space, and genres: measuring probabilistic grammar distances between varieties of Mandarin
- Navigating linguistic ideologies and market dynamics within China’s English language teaching landscape
- Streetscapes and memories of real socialist anti-fascism in south-eastern Europe: between dystopianism and utopianism
- What can NLP do for linguistics? Towards using grammatical error analysis to document non-standard English features
- From sociolinguistic perception to strategic action in the study of social meaning
- Minority genders in quantitative survey research: a data-driven approach to clear, inclusive, and accurate gender questions
- Variation is the way to perfection: imperfect rhyming in Chinese hip hop
- Shifts in digital media usage before and after the pandemic by Rusyns in Ukraine
- Computational & Corpus Linguistics
- Revisiting the automatic prediction of lexical errors in Mandarin
- Finding continuers in Swedish Sign Language
- Conversational priming in repetitional responses as a mechanism in language change: evidence from agent-based modelling
- Construction grammar and procedural semantics for human-interpretable grounded language processing
- Through the compression glass: language complexity and the linguistic structure of compressed strings
- Could this be next for corpus linguistics? Methods of semi-automatic data annotation with contextualized word embeddings
- The Red Hen Audio Tagger
- Code-switching in computer-mediated communication by Gen Z Japanese Americans
- Supervised prediction of production patterns using machine learning algorithms
- Introducing Bed Word: a new automated speech recognition tool for sociolinguistic interview transcription
- Decoding French equivalents of the English present perfect: evidence from parallel corpora of parliamentary documents
- Enhancing automated essay scoring with GCNs and multi-level features for robust multidimensional assessments
- Sociolinguistic auto-coding has fairness problems too: measuring and mitigating bias
- The role of syntax in hashtag popularity
- Language practices of Chinese doctoral students studying abroad on social media: a translanguaging perspective
- Cognitive Linguistics
- Metaphor and gender: are words associated with source domains perceived in a gendered way?
- Crossmodal correspondence between lexical tones and visual motions: a forced-choice mapping task on Mandarin Chinese