Predicting syntactic choice in Mandarin Chinese: a corpus-based analysis of ba sentences and SVO sentences

  • Yu Fang and Haitao Liu
Published/Copyright: March 5, 2021

Abstract

This paper investigates the effects of 10 factors on the choice between alternative ba sentences and SVO sentences in Mandarin Chinese: the givenness, definiteness, animacy, and pronominality of NP2s, NP2 length, VP length, verb sense, syntactic parallelism, dependency distance, and surprisal. Using corpus data and mixed-effects logistic regression modeling, we find, on the one hand, that givenness, syntactic parallelism, and the log-transformed ratio of NP2 length and VP length are significant predictors of the choice between ba sentences and SVO sentences. A new NP2, a large length ratio, and a parallel construction predict an SVO sentence rather than a ba sentence. On the other hand, dependency distance and surprisal estimated by the trigram model are effective in predicting the choice between naturally occurring ba/SVO sentences and their converted alternatives. Naturally occurring sentences are more likely to have shorter dependency distances and smaller surprisal values than the converted sentences. The effects of these five factors on syntactic choice are congruent with the results of previous studies, which suggests that some determinants of syntactic choice are shared across languages.
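To make the two processing measures named in the abstract concrete, the following is a minimal sketch rather than the authors' pipeline. It assumes that dependency distance is the mean absolute difference between the linear positions of each word and its head, and it estimates trigram surprisal with add-one smoothing purely for brevity; the smoothing method, the function names, and the toy input format are all illustrative assumptions, not details taken from the study.

```python
import math
from collections import Counter


def mean_dependency_distance(heads):
    """Mean absolute distance between each word and its head.

    heads[i] is the 1-based position of the head of word i + 1;
    0 marks the root, which has no head and is skipped (an assumption).
    """
    distances = [abs(pos - head)
                 for pos, head in enumerate(heads, start=1)
                 if head != 0]
    return sum(distances) / len(distances) if distances else 0.0


def _pad(words):
    # Two start symbols give the first word a full trigram context.
    return ["<s>", "<s>"] + list(words) + ["</s>"]


def train_trigram_counts(sentences):
    """Count trigrams and their two-word contexts in tokenized sentences."""
    trigrams, contexts, vocab = Counter(), Counter(), set()
    for words in sentences:
        padded = _pad(words)
        vocab.update(padded)
        for i in range(2, len(padded)):
            context = (padded[i - 2], padded[i - 1])
            contexts[context] += 1
            trigrams[context + (padded[i],)] += 1
    return trigrams, contexts, len(vocab)


def sentence_surprisal(words, trigrams, contexts, vocab_size):
    """Total surprisal in bits: sum of -log2 P(w_i | w_{i-2}, w_{i-1}).

    Probabilities use add-one smoothing, a deliberate simplification.
    """
    padded = _pad(words)
    total = 0.0
    for i in range(2, len(padded)):
        context = (padded[i - 2], padded[i - 1])
        prob = (trigrams[context + (padded[i],)] + 1) / (contexts[context] + vocab_size)
        total -= math.log2(prob)
    return total


if __name__ == "__main__":
    # Hypothetical toy data: tokens are placeholders, not corpus examples.
    counts = train_trigram_counts([["a", "b"], ["a", "c"]])
    print(mean_dependency_distance([2, 0, 4, 2]))
    print(sentence_surprisal(["a", "b"], *counts))
    print(sentence_surprisal(["b", "a"], *counts))
```

Under these assumptions, an attested sentence and its converted alternative could each be scored with both functions, and the two scores compared to see which variant has the shorter mean dependency distance and the lower surprisal.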


Corresponding author: Haitao Liu, Department of Linguistics, Zhejiang University, Hangzhou, China; and Center for Linguistics and Applied Linguistics, Guangdong University of Foreign Studies, Guangzhou, China

Acknowledgments

We would like to thank the editors and the anonymous reviewers for their helpful comments and suggestions.

  1. Data availability statement: The data that support the findings of this study are openly available in [multiple_regression_modeling] at https://github.com/fangyu92.

Received: 2020-01-22
Accepted: 2021-01-23
Published Online: 2021-03-05
Published in Print: 2021-05-26

© 2021 Walter de Gruyter GmbH, Berlin/Boston

Downloaded on 3.10.2025 from https://www.degruyterbrill.com/document/doi/10.1515/cog-2020-0005/html