The RUEG Corpora: A Look at the Design, Construction, Infrastructure, and Reuse of Multilingual Research Data

Martin Klotz, Rahel Gajaneh Hartz, Annika Labrenz, Anke Lüdeling and Anna Shadrova
Published/Copyright: December 3, 2024

Abstract

The article presents the RUEG corpora. We begin by describing the basic principles and research questions that influenced and shaped the corpora, and we introduce the method of data generation. We then describe the fundamental components of the building process and its overall outcome. This outcome provides interfaces for researchers who wish to conduct their own research using the RUEG corpora. We discuss options for such research, which range from reusing existing annotations and metadata to building entirely new but comparable corpora.

Acknowledgments

The research results of this publication were funded by the Deutsche Forschungsgemeinschaft (DFG) under SFB 1412, 416591334 and FOR 2537, 313607803, GZ LU 856/16-1. We also thank the reviewers for their constructive assessment of our contribution.

Published online: 2024-12-03
Published in print: 2024-11-29

© 2024 Walter de Gruyter GmbH, Berlin/Boston

Downloaded on 25.1.2026 from https://www.degruyterbrill.com/document/doi/10.1515/zgl-2024-2026/html