Using hidden Markov models to find discrete targets in continuous sociophonetic data

Daniel Duncan
Published/Copyright: 27 July 2021

Abstract

Advances in sociophonetic research have resulted in features once sorted into discrete bins now being measured continuously. This has implied a shift in what sociolinguists view as the abstract representation of the sociolinguistic variable. When measured discretely, variation is variation in selection: one variant is selected for production, and factors influencing language variation and change influence the frequency at which variants are selected. When measured continuously, variation is variation in execution: speakers have a single target for production, which they approximate with varying success. This paper suggests that both approaches can and should be considered in sociophonetic analysis. To that end, I offer the use of hidden Markov models (HMMs) as a novel approach to find speakers’ multiple targets within continuous data. Using the LOT vowel among whites in Greater St. Louis as a case study, I compare 2-state and 1-state HMMs constructed at the individual speaker level. The production of ten of fifty-two speakers is shown to involve the regular use of distinct fronted and backed variants of the vowel. This finding illustrates HMMs’ capacity to allow us to consider variation as both variant selection and execution, making them a useful tool in the analysis of sociophonetic data.


Corresponding author: Daniel Duncan, School of English Literature, Language and Linguistics, Newcastle University, Percy Building, Newcastle upon Tyne NE1 7RU, UK

Funding source: NSF

Award Identifier / Grant number: BCS-1651102 DDRI

Acknowledgments

This work was previously presented at the 2019 Symposium on Representations, Usage and Social Embedding in Language Change, held at the University of Manchester. Thanks to the audience there, as well as two anonymous reviewers, for helpful comments.

Research funding: The data discussed here were collected as part of NSF grant BCS-1651102 DDRI.

Appendix: Example R code

In this study, I use the depmixS4 package (Visser and Speekenbrink 2010) to run hidden Markov models in R (R Core Team 2017). Here, I illustrate the code used to obtain models similar to those run in the study. The 1-state model generated by this code assumes that the data are normally distributed around a single mean, while the 2-state model assumes that the observations in each state are normally distributed around that state’s mean.

After installing the package, load it with library(depmixS4) before use.

Data should be loaded in one’s preferred format. If the original data file has multiple phones in it, create a new data frame composed of a single-phone subset of the original.
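A minimal sketch of the subsetting step, using base R only. The data frame, its column names (speaker, vowel, F2), and its values are hypothetical stand-ins rather than the study's data; in practice the data frame would be read from a file.

```r
# Toy stand-in for a vowel measurement file.
# All column names and values here are hypothetical.
vowels <- data.frame(
  speaker = c("s01", "s01", "s01", "s02", "s02", "s02"),
  vowel   = c("LOT", "TRAP", "LOT", "LOT", "THOUGHT", "LOT"),
  F2      = c(1250, 1800, 1310, 1120, 1005, 1180)
)

# Keep only the tokens of the phone under study.
lot <- subset(vowels, vowel == "LOT")
lot
```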

Because there is some randomness involved in an HMM, set the random seed to ensure consistency between runs.
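A minimal sketch of what fixing the seed does, using base R only. The seed value is arbitrary, and the random draws stand in for whatever random quantities a model fit would use.

```r
set.seed(2021)            # any fixed integer works; 2021 is arbitrary
first_draw <- rnorm(3)

set.seed(2021)            # resetting the seed replays the same random draws
second_draw <- rnorm(3)

identical(first_draw, second_draw)
```

Setting the seed once, before any models are fitted, should therefore make a whole script reproducible from run to run.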

HMMs will be created for individual speakers. For each speaker, make an individual-level subset of the data.
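A minimal sketch of the speaker-level subsetting, again in base R. The speaker labels, column names, and values are hypothetical. The first approach pulls out one speaker at a time; the second, which is only one possible design, builds a list of per-speaker data frames in a single call.

```r
# Toy single-phone data for two hypothetical speakers.
lot <- data.frame(
  speaker = rep(c("s01", "s02"), each = 4),
  F2      = c(1250, 1310, 1290, 1275, 1120, 1180, 1400, 1150)
)

# One speaker at a time.
s01 <- subset(lot, speaker == "s01")

# Or all speakers at once: a named list with one data frame per speaker.
by_speaker <- split(lot, lot$speaker)
names(by_speaker)
```

Because an HMM models transitions between successive observations, the rows within each subset are assumed here to be in the order in which the tokens were produced.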

Now make a 2-state HMM for each individual. ‘nstates’ determines the number of states the model assumes. While the model here simply assumes a normal distribution around the state mean, note that the formula can be adapted for more complex modeling if necessary.

In order to view the summary data, we fit the HMM to our data. Viewing the fitted model gives the model AIC, BIC, and log likelihood.
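A minimal sketch of specifying and fitting a 2-state model with depmixS4. The speaker data are simulated for illustration only (the column name F2, the state means, and the run lengths are assumptions, not values from the study), and the object names hmm2 and fit2 are hypothetical.

```r
library(depmixS4)

set.seed(1)

# Simulated F2 values (Hz) for one hypothetical speaker: 200 tokens that
# alternate in runs of 20 between a backed target (~1100) and a fronted
# target (~1400).
state <- rep(rep(c(1, 2), each = 20), times = 5)
s01 <- data.frame(F2 = rnorm(200, mean = c(1100, 1400)[state], sd = 40))

# Specify a 2-state HMM. The intercept-only formula F2 ~ 1 gives each
# state its own Gaussian mean and standard deviation.
hmm2 <- depmix(F2 ~ 1, data = s01, nstates = 2, family = gaussian())

# Estimate the parameters.
fit2 <- fit(hmm2, verbose = FALSE)

# Printing the fitted model reports log likelihood, AIC, and BIC;
# each can also be extracted on its own.
fit2
logLik(fit2)
AIC(fit2)
BIC(fit2)
```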

We now make a 1-state HMM and follow the same process.

In this example, the 2-state model is selected because it has the lower BIC. In this case, calling summary() on the fitted 2-state model displays the initial state probabilities, transition matrix, and response parameters.
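A minimal sketch of the model comparison, assuming simulated data for one hypothetical speaker. The helper fit_hmm() and the object names are illustrative and not part of depmixS4 or of the study's code; which model is selected depends on the data supplied.

```r
library(depmixS4)

set.seed(1)

# Simulated F2 values (Hz) for one hypothetical speaker, alternating in
# runs of 20 tokens between two targets.
state <- rep(rep(c(1, 2), each = 20), times = 5)
s01 <- data.frame(F2 = rnorm(200, mean = c(1100, 1400)[state], sd = 40))

# Hypothetical helper: fit a k-state Gaussian HMM to one speaker's F2 values.
fit_hmm <- function(df, k) {
  fit(depmix(F2 ~ 1, data = df, nstates = k, family = gaussian()),
      verbose = FALSE)
}

fit1 <- fit_hmm(s01, 1)
fit2 <- fit_hmm(s01, 2)

# Compare the two models by BIC; the lower value is preferred.
bic <- c(one_state = BIC(fit1), two_state = BIC(fit2))
bic

# Inspect the parameters of whichever model has the lower BIC.
selected <- if (bic[["two_state"]] < bic[["one_state"]]) fit2 else fit1
summary(selected)
```

Wrapping the fit in a helper makes it straightforward to repeat the comparison for every speaker, for example by applying it over a list of per-speaker data frames.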

References

Arthur, Rob & Greg Matthews. 2017. Baseball’s ‘hot hand’ is real. FiveThirtyEight. https://fivethirtyeight.com/features/baseballs-hot-hand-is-real/ (accessed 18 June 2020).

Baayen, R. Harald, Douglas J. Davidson & Douglas M. Bates. 2008. Mixed-effects modeling with crossed random effects for subjects and items. Journal of Memory and Language 59. 390–412. https://doi.org/10.1016/j.jml.2007.12.005.

Baranowski, Maciej. 2015. Sociophonetics. In Robert Bayley, Richard Cameron & Ceil Lucas (eds.), The Oxford handbook of sociolinguistics, 403–424. Oxford: Oxford University Press. https://doi.org/10.1093/oxfordhb/9780199744084.013.0020.

Becker, Kara. 2010. Regional dialect features on the Lower East Side of New York City: Sociophonetics, ethnicity, and identity. New York: New York University dissertation.

Bleaman, Isaac. 2020. Implicit standardization in a minority language community: Real-time syntactic change among Hasidic Yiddish writers. Frontiers in Artificial Intelligence 3. Article 35. https://doi.org/10.3389/frai.2020.00035.

Boersma, Paul & David Weenink. 2017. Praat: Doing phonetics by computer, Version 6.0.28. http://www.praat.org/.

Driscoll, Anna & Emma Lape. 2015. Reversal of the Northern Cities Shift in Syracuse, New York. University of Pennsylvania Working Papers in Linguistics 21(2). 41–47.

Duncan, Daniel. 2018. Language variation and change in the geographies of suburbs. New York: New York University dissertation.

Duncan, Daniel. 2019. The influence of suburban development and metropolitan fragmentation on language variation and change: Evidence from Greater St. Louis. Journal of Linguistic Geography 7(2). 82–97. https://doi.org/10.1017/jlg.2019.8.

Duncan, Daniel. under review. Merger reversal in St. Louis: Implementation and implications. Ms., Newcastle University.

Durian, David. 2007. Getting [S]tronger every day?: More on urbanization and the socio-geographic diffusion of (str) in Columbus, OH. University of Pennsylvania Working Papers in Linguistics 13(2). 65–79.

Friedman, Lauren. 2014. The St. Louis Corridor: Mixing, competing, and retreating dialects. University of Pennsylvania PhD Dissertation.

Fruehwald, Josef. 2016. The early influence of phonology on a phonetic change. Language 92(2). 376–410. https://doi.org/10.1353/lan.2016.0041.

Goldsmith, John & Aris Xanthos. 2008. Three models for learning phonological categories. Chicago: Department of Computer Science, University of Chicago. https://newtraell.cs.uchicago.edu/research/publications/techreports/TR-2008-08 (accessed 18 June 2020).

Goodheart, Jill C. 2004. I’m no hoosier: Evidence of the Northern Cities Shift in St. Louis, Missouri. Michigan State University MA Thesis.

Gordon, Matthew J. 2001. Small-town values and big-city vowels: A study of the Northern Cities Shift in Michigan (Publication of the American Dialects Society 84). Durham, NC: Duke University Press.

Gylfadottir, Duna. 2015. Streets of Philadelphia: An acoustic study of /str/-retraction in a naturalistic speech corpus. University of Pennsylvania Working Papers in Linguistics 21(2). 89–97.

Hay, Jennifer, Paul Warren & Katie Drager. 2006. Factors influencing speech perception in the context of a merger-in-progress. Journal of Phonetics 34. 458–484. https://doi.org/10.1016/j.wocn.2005.10.001.

Jaggers, Zachary S. 2018. Evidence and characterization of a glide-vowel distinction in American English. Laboratory Phonology 9(1). 1–27. Article 3. https://doi.org/10.5334/labphon.36.

Johnson, Daniel E. 2009. Getting off the GoldVarb standard: Introducing Rbrul for mixed-effects variable rule analysis. Language and Linguistics Compass 3(1). 359–383. https://doi.org/10.1111/j.1749-818x.2008.00108.x.

Kass, Robert E. & Adrian E. Raftery. 1995. Bayes factors. Journal of the American Statistical Association 90. 773–795. https://doi.org/10.1080/01621459.1995.10476572.

Labov, William. 1994. Principles of linguistic change volume 1: Internal factors. Oxford: Blackwell Publishers.

Labov, William, Mark Karen & Corey Miller. 1991. Near-mergers and the suspension of phonemic contrast. Language Variation and Change 3. 33–74. https://doi.org/10.1017/s0954394500000442.

Labov, William, Sharon Ash & Charles Boberg. 2006. The atlas of North American English. New York: Mouton de Gruyter. https://doi.org/10.1515/9783110167467.

Leach, Hannah. 2018. Sociophonetic variation in Stoke-on-Trent’s pottery industry. University of Sheffield PhD Dissertation.

Lobanov, Boris M. 1971. Classification of Russian vowels spoken by different speakers. Journal of the Acoustical Society of America 49. 606–608. https://doi.org/10.1121/1.1912396.

Love, Jessica & Abby Walker. 2012. Football versus football: Effect of topic on /r/ realization in American and English sports fans. Language and Speech 56(4). 443–460. https://doi.org/10.1177/0023830912453132.

MacKenzie, Laurel. 2020. Comparing constraints on contraction using Bayesian regression modeling. Frontiers in Artificial Intelligence 3. Article 58. https://doi.org/10.3389/frai.2020.00058.

Mayer, Connor. 2020. An algorithm for learning phonological classes from distributional similarity. Phonology 37. 91–131. https://doi.org/10.1017/s0952675720000056.

Nycz, Jennifer. 2013. New contrast acquisition: Methodological issues and theoretical implications. English Language and Linguistics 17(2). 325–357. https://doi.org/10.1017/s1360674313000051.

R Core Team. 2017. R: A language and environment for statistical computing. https://www.R-project.org.

Rosenfelder, Ingrid, Josef Fruehwald, Keelan Evanini, Scott Seyfarth, Kyle Gorman, Hilary Prichard & Jiahong Yuan. 2014. FAVE (Forced Alignment and Vowel Extraction) program suite. Version 1.2.2. https://doi.org/10.5281/zenodo.22281.

Rutter, Ben. 2011. Acoustic analysis of a sound change in progress: The consonant cluster /stɹ/ in English. Journal of the International Phonetic Association 41. 27–40. https://doi.org/10.1017/s0025100310000307.

Sankoff, David, Sali A. Tagliamonte & Eric Smith. 2005. Goldvarb X: A variable rule application for Macintosh and Windows. Toronto: Department of Linguistics, University of Toronto.

Sneller, Betsy. 2018. Mechanisms of phonological change. University of Pennsylvania PhD Dissertation.

Starner, Thad & Alex Pentland. 1995. Real-time American Sign Language recognition from video using hidden Markov models. MIT Media Laboratory Perceptual Computing Section Technical Report No. 375. https://www.cc.gatech.edu/~thad/p/031_10_SL/real-time-asl-recognition-from%20video-using-hmm-ISCV95.pdf (accessed 18 June 2020). https://doi.org/10.1109/ISCV.1995.477012.

Tamminga, Meredith. 2016. Persistence in phonological and morphological variation. Language Variation and Change 28. 335–356. https://doi.org/10.1017/s0954394516000119.

Tamminga, Meredith, Christopher Ahern & Aaron Ecay. 2016. Generalized additive mixed models for intraspeaker variation. Linguistics Vanguard 2(s1). 1–9. https://doi.org/10.1515/lingvan-2016-0030.

Turton, Danielle. 2017. Categorical or gradient? An ultrasound investigation of /l/-darkening and vocalization in varieties of English. Laboratory Phonology: Journal of the Association for Laboratory Phonology 8(1). 1–31. Article 13. https://doi.org/10.5334/labphon.35.

Villareal, Dan, Lynn Clark, Jennifer Hay & Kevin Watson. 2020. From categories to gradience: Auto-coding sociophonetic variation with random forests. Laboratory Phonology: Journal of the Association for Laboratory Phonology 11(1). 1–31. Article 6. https://doi.org/10.5334/labphon.216.

Visser, Ingmar & Maarten Speekenbrink. 2010. depmixS4: An R Package for Hidden Markov Models. Journal of Statistical Software 36(7). 1–21. https://doi.org/10.18637/jss.v036.i07.

Wagner, Suzanne E., Alexander Mason, Monica Nesbitt, Erin Pevan & Matt Savage. 2016. Reversal and re-organization of the Northern Cities Shift in Michigan. University of Pennsylvania Working Papers in Linguistics 22(2). 171–179.

Wilbanks, Eric. 2017. Social and structural constraints on a phonetically-motivated change in progress: (str) retraction in Raleigh, NC. University of Pennsylvania Working Papers in Linguistics 23(1). 301–310. https://doi.org/10.5070/P7121040720.

Received: 2020-06-22
Accepted: 2020-11-09
Published Online: 2021-07-27

© 2021 Walter de Gruyter GmbH, Berlin/Boston

Downloaded on 7 September 2025 from https://www.degruyterbrill.com/document/doi/10.1515/lingvan-2020-0057/html