The RUEG Corpora: A Look at the Design, Construction, Infrastructure, and Reuse of Multilingual Research Data

Martin Klotz, Rahel Gajaneh Hartz, Annika Labrenz, Anke Lüdeling, and Anna Shadrova
Published/Copyright: December 3, 2024

Abstract

This article presents the RUEG corpora. We begin by describing the basic principles and research questions that influenced and shaped the corpora, and we introduce the method of data generation. We then describe the fundamental components of the building process and its overall outcome. This outcome provides interfaces for new researchers who wish to conduct their own research using the RUEG corpora. We discuss options for such research and show that it can range from reusing existing annotations and metadata to building entirely new but comparable corpora.

Acknowledgments

The research presented in this publication was funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation): SFB 1412, 416591334, and FOR 2537, 313607803, GZ LU 856/16-1. We also thank the reviewers for their constructive review of our contribution.

Published online: 2024-12-03
Published in print: 2024-11-29

© 2024 Walter de Gruyter GmbH, Berlin/Boston

Downloaded on 25.1.2026 from https://www.degruyterbrill.com/document/doi/10.1515/zgl-2024-2026/pdf