
A probability distribution of dependencies in interlanguage

  • Yuxin Hao, Xuelin Wang, Shuai Bin and Haitao Liu
Published/Copyright: January 2, 2023

Abstract

The diversity of syntactic units in second language has attracted much scholarly attention. Most existing studies of syntactic diversity have focused on only a small number of syntactic structures, and few consider the full range of syntactic dependencies present in a dataset. Based on a syntactically annotated interlanguage corpus that we constructed, this paper presents a quantitative study of dependencies in the interlanguage of English-speaking learners of Chinese across proficiency levels. We fit the frequency distributions of dependency type, word class (both as dependent and as governor), verb as a governor, and noun as a dependent with a modified right-truncated Zipf-Alekseev distribution and Zipf’s law. Our findings show that: (1) in terms of the mathematical model, the distribution of syntactic structures in interlanguage follows distributional regularities like those of natural languages; (2) most of the coefficients of determination (R²) were high, indicating that the investigated distributions in interlanguage fit the distributional laws found in natural languages, which also demonstrates that both interlanguages and natural languages consistently conform to the law of linguistic diversity and uniformity; (3) the parameters a and b of the dependency relation distribution reflect the developmental trend of L2 learners’ proficiency levels, demonstrating that these parameters have universal applicability in reflecting interlanguage proficiency.
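As an illustration of the kind of fitting described above, the following minimal sketch fits the simple power-law form of Zipf’s law, f(r) = c · r^(−a), to a rank-frequency list and reports a coefficient of determination. It is not the authors’ procedure: the log-log least-squares method, the function names `fit_zipf` and `r_squared`, and the synthetic frequencies are all assumptions made for this example, and the modified right-truncated Zipf-Alekseev distribution is not implemented here.

```python
import math


def fit_zipf(freqs):
    """Estimate (a, c) in f(r) = c * r**(-a) by least squares on log-log data.

    Assumes at least two strictly positive frequencies.
    """
    ranked = sorted(freqs, reverse=True)           # rank 1 = most frequent
    xs = [math.log(r) for r in range(1, len(ranked) + 1)]
    ys = [math.log(f) for f in ranked]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return -slope, math.exp(intercept)             # a is the negated slope


def r_squared(freqs, a, c):
    """Coefficient of determination of the fitted curve on the raw frequencies."""
    ranked = sorted(freqs, reverse=True)
    predicted = [c * r ** (-a) for r in range(1, len(ranked) + 1)]
    mean_f = sum(ranked) / len(ranked)
    ss_res = sum((f - p) ** 2 for f, p in zip(ranked, predicted))
    ss_tot = sum((f - mean_f) ** 2 for f in ranked)
    return 1.0 - ss_res / ss_tot


# Synthetic frequencies generated from an exact power law (a = 1.2, c = 1000).
# These are NOT corpus counts; they only check that the fit recovers the inputs.
toy_freqs = [1000.0 / r ** 1.2 for r in range(1, 11)]
a, c = fit_zipf(toy_freqs)
r2 = r_squared(toy_freqs, a, c)
print(f"a = {a:.3f}, c = {c:.1f}, R^2 = {r2:.4f}")
```

With real dependency-type counts the fit would not be exact, and R² would indicate how closely the observed rank-frequency distribution follows the power law. A regression on log-transformed data weights low-frequency ranks differently from a fit on raw frequencies, so results from dedicated fitting software may differ.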


Corresponding author: Haitao Liu, Department of Linguistics, Zhejiang University, 866 Yuhangtang Rd, Hangzhou 310058, China; and Center for Linguistics and Applied Linguistics, Guangdong University of Foreign Studies, Guangzhou, China


Research funding: This research is supported by the National Social Science Fund of China (Grant No. 21BYY113).

Published Online: 2023-01-02
Published in Print: 2023-03-28

© 2022 Walter de Gruyter GmbH, Berlin/Boston

Downloaded on 13.12.2025 from https://www.degruyterbrill.com/document/doi/10.1515/psicl-2022-2007/html