Quantitative relationship between distribution of sentence length and dependency distance in Spanish

  • Jinlu Liu, Nan Yang and Haitao Liu
Published/Copyright: April 18, 2025

Abstract

Sentence length, defined as the number of words in a sentence, has long been a central concern in linguistic research, and many studies have examined its distribution in specific languages. To further explore the characteristics and patterns of sentence length and their relationship with dependency distance in Spanish, we use the SUD syntactic treebank and conduct a quantitative analysis within the theoretical framework of dependency grammar. We find that the sentence length distribution of Spanish follows a positive negative binomial model and that there is no significant difference in sentence length distribution among different mean dependency distances (MDDs), but the distribution of the number of sentences across sentence length intervals follows a normal distribution. In Spanish, sentence length and MDD interact with each other: the longer the sentence, the greater the MDD, and vice versa, which is consistent with previous research findings. In addition, as sentence length increases in Spanish, short-distance dependencies decrease but remain within a certain range of fluctuation, which confirms once again that language is a complex, self-adaptive system driven by humans.
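
To make the two quantities in the abstract concrete, the following minimal Python sketch computes sentence length and MDD for a single sentence. It assumes the definition of MDD that is common in the dependency-distance literature (the mean absolute difference between the linear positions of each dependent and its head, with the root excluded); the abstract itself does not spell out the formula. The function names and the toy head list are hypothetical and are not taken from the SUD treebank.

```python
from statistics import mean


def sentence_length(heads):
    """Sentence length as the number of words (one head entry per word)."""
    return len(heads)


def mean_dependency_distance(heads):
    """Mean dependency distance of one sentence.

    heads[i] is the 1-based position of the head of word i + 1;
    0 marks the root, which has no head and is skipped.
    Assumed definition: mean of |dependent position - head position|.
    """
    distances = [
        abs(position - head)
        for position, head in enumerate(heads, start=1)
        if head != 0
    ]
    if not distances:
        raise ValueError("MDD is undefined for a sentence with no dependencies")
    return mean(distances)


# Hypothetical five-word sentence: word 5 is the root.
toy_heads = [2, 5, 4, 5, 0]
print(sentence_length(toy_heads))           # 5
print(mean_dependency_distance(toy_heads))  # 1.5
```

In this toy sentence the four dependencies have distances 1, 3, 1 and 1, so the MDD is 6 / 4 = 1.5. Applying the same two functions to every sentence of a treebank would give the paired values whose relationship the study examines.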


Corresponding author: Haitao Liu, College of Foreign Languages and Literature, Fudan University, No. 220 Handan Road, Shanghai, 200433, China, E-mail:

Funding source: Graduate Research Training Program of Zhejiang University of Finance and Economics

Award Identifier / Grant number: 23XJKT062

Acknowledgment

We would like to thank the editors and anonymous reviewers for their insightful and valuable comments on this paper.

  1. Research funding: This work is supported by the Postgraduate Training Program (PRTP) of Zhejiang University of Finance and Economics (23XJKT062).

Received: 2024-09-24
Accepted: 2025-03-21
Published Online: 2025-04-18

© 2025 Walter de Gruyter GmbH, Berlin/Boston

Downloaded on 16.1.2026 from https://www.degruyterbrill.com/document/doi/10.1515/lingvan-2024-0185/html