AI-generated, L2 learner, and native German writing: a comparative analysis of linguistic complexity

Shengzhou Sun; Yuan Li

doi:10.1515/glot-2025-2011

Published/Copyright: October 31, 2025

From the journal Glottotheory Volume 16 Issue 2

Abstract

The development of generative AI presents both opportunities and challenges for language teaching. Understanding the linguistic features of AI-generated texts is essential, as it supports users in engaging with AI critically and appropriately for writing tasks. While existing studies have predominantly focused on English writing, the present study examines German argumentative essays produced by ChatGPT, DeepSeek, L1 speakers, and L2 learners, with a focus on linguistic complexity. The results reveal that AI-generated essays generally exhibit higher linguistic complexity. Specifically, DeepSeek essays demonstrate greater lexical complexity, whereas ChatGPT essays are characterized by more complex syntax. In comparison to human-authored essays, AI-generated essays tend to be more formal, marked by frequent nominalizations and a more extensive use of conjunctions. The findings are further interpreted in light of prior research and the underlying mechanisms of generative AI. Based on these results, pedagogical implications for foreign language writing instruction are proposed.

Keywords: generative AI; linguistic complexity; German writing; comparative analysis

Corresponding author: Yuan Li, Institute of German Studies, Zhejiang University, 866 Yuhangtang Road, Xihu District, Hangzhou, 310058, P.R. China, E-mail: liyuan1972@zju.edu.cn

Published Online: 2025-10-31
