Abstract
Hungarian shows variable front vowel harmony, particularly in suffixed back vowel + [ɛ] nouns. This study addresses two research questions: (1) To what extent does stem-level information (similarity across stems) predict suffix variation for back vowel + [ɛ] stems in Hungarian corpus data? (2) Do suffixes themselves predict suffix variation beyond the stem-level information? We draw on a dataset of 200 noun stems, 4,501 suffixed forms and 4 × 10⁶ tokens, based on the New Hungarian Webcorpus, and use a K-Nearest Neighbours learner and a hierarchical generalised linear model to address these questions. We find that the majority of back vowel + [ɛ] stems show variable vowel harmony, that this variation depends on stem similarity, and that similarity effects are amplified by vowel-initial suffixes. This points to a model of Hungarian vowel harmony in which stem- and suffix-level information are lexically specified.
Funding source: Nemzeti Kutatási, Fejlesztési és Innovációs Alap
Award Identifier / Grant number: OTKA-138188
Award Identifier / Grant number: OTKA-139271
Award Identifier / Grant number: NKFI-K139271
Award Identifier / Grant number: NKFI-FK138188
Award Identifier / Grant number: TKP2021-EGA-02
Funding source: Innovációs és Technológiai Minisztérium
Award Identifier / Grant number: TKP2023-NVA-2
Acknowledgments
We are grateful to our Editor, our Reviewers and Vásárhelyi Dani.
Conflicts of interest: The authors declare no conflict of interest.
Research funding: This research was supported by grants NKFI-K139271, NKFI-FK138188, TKP2021-EGA-02, OTKA-138188 and OTKA-139271 from the Hungarian National Research, Development and Innovation Fund. It was also supported by grant TKP2023-NVA-2 from the Innovációs és Technológiai Minisztérium.
© 2024 Walter de Gruyter GmbH, Berlin/Boston