Abstract
In syntactic dependency trees, when arcs are drawn from syntactic heads to dependents, they rarely cross. Constraints on these crossing dependencies are critical for determining the syntactic properties of human language, because they define the position of natural language in formal language hierarchies. We study whether the apparent constraints on crossing syntactic dependencies in natural language might be explained by constraints on dependency lengths (the linear distance between heads and dependents). We compare real dependency trees from treebanks of 52 languages against baselines of random trees which are matched with the real trees in terms of their dependency lengths. We find that these baseline trees have many more crossing dependencies than real trees, indicating that a constraint on dependency lengths alone cannot explain the empirical rarity of crossing dependencies. However, we find evidence that a combined constraint on dependency length and the rate of crossing dependencies might be able to explain two of the most-studied formal restrictions on dependency trees: gap degree and well-nestedness.
Acknowledgments
We thank the three anonymous reviewers for helpful suggestions. This work was supported by a gift from the NVIDIA Corporation.
Supplementary Material
The online version of this article offers supplementary material (https://doi.org/10.1515/lingvan-2019-0070).
© 2021 Walter de Gruyter GmbH, Berlin/Boston
Articles in this issue
- Frontmatter
- Research Articles
- Efficiency in human languages: Corpus evidence for universal principles
- Do dependency lengths explain constraints on crossing dependencies?
- Optimization of morpheme length: a cross-linguistic assessment of Zipf’s and Menzerath’s laws
- Efficiency in discourse processing: Does morphosyntax adapt to accommodate new referents?
- Quantifying the efficiency of written language
- Communicative efficiency and differential case marking: a reverse-engineering approach
- Coding efficiency in nominal inflection: expectedness and type frequency effects