Abstract
Legal language has been studied extensively because of its profound and far-reaching impact on society. Its complexity has received particular attention because of the implications for legal practice and for public access to justice. Studies in this area seek to identify the factors that make legal language difficult and to explore possible methods of simplification. Despite extensive qualitative research, however, a comprehensive quantitative analysis of the linguistic features that distinguish legal language from everyday language is still lacking. The present study addresses this gap by quantitatively examining the key features of German legal language, using journalistic language as a benchmark for everyday discourse, and offers a rigorous, corpus-based perspective on the lexical, syntactic, and typological features of legal texts. Using data from the HDT-UD and DGT-UD treebanks, the study analyzes 18 quantitative indicators across three linguistic domains: lexicon, syntax, and word-order typology. The results indicate that, compared with journalistic language, legal language is characterized by a limited vocabulary, a higher frequency of long words and dependent clauses, greater syntactic and structural complexity, and a predominance of SV and OV word-order patterns. By comparing the legal and journalistic registers in detail, the study advances the understanding of legal language through objective, empirical analysis. The findings have practical implications for legal communication: greater attention to lexical simplification and syntactic clarity could improve accessibility and comprehension.
© 2025 Walter de Gruyter GmbH, Berlin/Boston