Predictability of stop consonant phonetics across talkers: Between-category and within-category dependencies among cues for place and voice

Eleanor Chodroff; Colin Wilson

doi:10.1515/lingvan-2017-0047


Published/Copyright: September 13, 2018


From the journal Linguistics Vanguard Volume 4 Issue s2

Abstract

The present study investigates patterns of covariation among acoustic properties of stop consonants in a large multi-talker corpus of American English connected speech. Relations among talker means for different stops on the same dimension (between-category covariation) were considerably stronger than those for different dimensions of the same stop (within-category covariation). The existence of between-category covariation supports a uniformity principle that restricts the mapping from phonological features to phonetic targets in the sound system of each speaker. This principle was formalized with factor analysis, in which observed covariation derives from a lower-dimensional space of talker variation. Knowledge of between-category phonetic covariation could facilitate perceptual adaptation to novel talkers by providing a rational basis for generalizing idiosyncratic properties to several sounds on the basis of limited exposure.

Keywords: stop consonants; talker variability; phonetic covariation; factor analysis; predictability

Appendix

The analyses in Section 2 involved correlations of talker means; however, many previous studies have also examined correlations across individual tokens (e.g. Dmitrieva et al. 2015; Kirby and Ladd 2015, 2016; Clayards 2018). For comparison with these studies, token-by-token correlations between phonetic cues (spectral center of gravity, COG; voice onset time, VOT; and fundamental frequency, f0) were calculated for each stop category within and across talkers. Only stop consonants with non-outlier values for both cues were retained for these correlations: 71,852 stops for the COG-VOT analysis, 57,737 stops for the COG-f0 analysis, and 74,916 stops for the VOT-f0 analysis.

The first correlation analysis, reported in Table A1, was conducted across all tokens (see also Dmitrieva et al. 2015; Clayards 2018). These correlations largely resembled the correlations of talker means in magnitude (especially between COG and VOT for [b], [d], and [g]); while many of these correlations reached significance, they were nevertheless quite weak.

In the second analysis, correlations were limited to talkers with more than 20 tokens per stop category. The median number of talkers excluded from each analysis was four and the maximum was 55 talkers (between COG and f0 for [t^h]). Table A2 presents the median token-by-token correlation for each of the cue pairs and stop consonants, as well as the range across talkers. Consistent with findings in Kirby and Ladd (2016) for French and Italian intervocalic stops, the magnitude and direction of the by-speaker correlations varied substantially across talkers. Together, these findings indicate that, while there may exist weak relationships across talker means, the token-by-token relationships within talker-specific productions are highly variable.

Table A1: Token-by-token correlations for each cue pair and stop category aggregated over all talkers.

	COG-VOT	COG-f0 (female)	COG-f0 (male)	VOT-f0 (female)	VOT-f0 (male)
p^h	0.18*	−0.01	0.09*	−0.05*	−0.02
b	0.34*	0.00	−0.05*	−0.03	−0.01
t^h	0.09*	0.07*	0.12*	−0.11*	−0.06*
d	0.57*	−0.11*	−0.06*	−0.06*	−0.01
k^h	0.17*	0.10*	0.09*	−0.15*	−0.13*
g	0.52*	0.03	−0.01	−0.03	0.01

An asterisk reflects p < 0.001.
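
The pooled analysis summarized in Table A1 can be illustrated with a minimal Python sketch. It is not the authors' code: the token layout (a list of dicts with a `stop` key and one key per cue), the function names, and the use of `None` to mark an excluded outlier value are all assumptions made for illustration. The significance test is likewise an assumption; the sketch uses a Fisher z approximation to get a two-sided p-value, which may differ from the test behind the asterisks above.

```python
import math
from collections import defaultdict
from statistics import NormalDist


def pearson_r(xs, ys):
    """Pearson correlation of two equal-length sequences; None if undefined."""
    n = len(xs)
    if n < 2:
        return None
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None  # no variance in one cue, so r is undefined
    return sxy / math.sqrt(sxx * syy)


def approx_p_value(r, n):
    """Two-sided p-value from the Fisher z approximation (an assumption,
    not necessarily the test used in the article)."""
    if r is None or n < 4 or abs(r) >= 1:
        return None
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2 * (1 - NormalDist().cdf(abs(z)))


def pooled_correlations(tokens, cue_x, cue_y):
    """Correlate two cues across all tokens of each stop category,
    ignoring which talker produced each token."""
    pairs_by_stop = defaultdict(list)
    for tok in tokens:
        x, y = tok.get(cue_x), tok.get(cue_y)
        if x is None or y is None:
            continue  # keep only tokens with usable values for both cues
        pairs_by_stop[tok["stop"]].append((x, y))

    results = {}
    for stop, pairs in pairs_by_stop.items():
        xs, ys = zip(*pairs)
        r = pearson_r(xs, ys)
        results[stop] = {"r": r, "p_approx": approx_p_value(r, len(pairs)), "n": len(pairs)}
    return results
```

Calling `pooled_correlations(tokens, "COG", "VOT")` on such a token list returns one entry per stop category. Splitting the f0 analyses by talker sex, as in Table A1, would only require filtering the token list before the call.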

Table A2: For each stop category and cue pair separately, the median talker-specific token-by-token correlation (left column) and range of talker-specific token-by-token correlations (right column).

	COG-VOT		COG-f0		VOT-f0
	Median	Range	Median	Range	Median	Range
p^h	0.17	−0.41 to 0.62	0.09	−0.41 to 0.44	−0.04	−0.47 to 0.54
b	0.33	−0.14 to 0.77	−0.01	−0.53 to 0.64	0.00	−0.37 to 0.41
t^h	0.06	−0.46 to 0.59	0.10	−0.34 to 0.51	−0.16	−0.59 to 0.41
d	0.54	−0.18 to 0.75	−0.06	−0.50 to 0.45	−0.03	−0.35 to 0.42
k^h	0.16	−0.31 to 0.57	0.10	−0.34 to 0.52	−0.20	−0.61 to 0.39
g	0.54	−0.04 to 0.79	0.01	−0.51 to 0.39	0.00	−0.38 to 0.33
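
The talker-specific summary in Table A2 can be sketched in the same spirit. Again, this is an illustrative Python sketch rather than the authors' procedure: the token layout (dicts with `stop`, `talker`, and one key per cue), the function names, and the choice to count only tokens with usable values for both cues toward the 20-token threshold are assumptions.

```python
import math
from collections import defaultdict
from statistics import median


def pearson_r(xs, ys):
    """Pearson correlation of two equal-length sequences; None if undefined."""
    n = len(xs)
    if n < 2:
        return None
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None
    return sxy / math.sqrt(sxx * syy)


def talker_correlation_summary(tokens, cue_x, cue_y, min_tokens=20):
    """For each stop category, compute one correlation per talker and
    summarize the talker-specific values by their median and range."""
    pairs = defaultdict(list)  # (stop, talker) -> list of (x, y) cue values
    for tok in tokens:
        x, y = tok.get(cue_x), tok.get(cue_y)
        if x is None or y is None:
            continue
        pairs[(tok["stop"], tok["talker"])].append((x, y))

    rs_by_stop = defaultdict(list)
    for (stop, _talker), values in pairs.items():
        if len(values) <= min_tokens:
            continue  # keep only talkers with MORE than min_tokens tokens
        xs, ys = zip(*values)
        r = pearson_r(xs, ys)
        if r is not None:
            rs_by_stop[stop].append(r)

    return {
        stop: {"median": median(rs), "min": min(rs), "max": max(rs), "n_talkers": len(rs)}
        for stop, rs in rs_by_stop.items()
    }
```

A wide gap between `min` and `max` alongside a median near zero is the pattern the appendix describes as highly variable talker-specific relationships.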

References

Assmann, P. F., T. M. Nearey & S. Bharadwaj. 2008. Analysis of a vowel database. Canadian Acoustics 36(3). 148–149.

Boersma, P. & D. Weenink. 2016. Praat: Doing phonetics by computer [Computer program]. Version 6.0.19, retrieved from http://www.praat.org/.

Brandschain, L., D. Graff, C. Cieri, K. Walker & C. Caruso. 2010. The Mixer 6 corpus: Resources for cross-channel and text independent speaker recognition. In N. Calzolari, K. Choukri, B. Maegaard, J. Mariani, J. Odijk, S. Piperidis, M. Rosner & D. Tapias (eds.), Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC 2010), 2441–2444. Malta: European Language Resources Association (ELRA).

Brandschain, L., D. Graff & K. Walker. 2013. Mixer 6 Speech LDC2013S03. Hard Drive. Philadelphia: Linguistic Data Consortium.

Chang, C. B., Y. Yao, E. F. Haynes & R. Rhodes. 2011. Production of phonetic and phonological contrast by heritage speakers of Mandarin. The Journal of the Acoustical Society of America 129(6). 3964–3980. doi:10.1121/1.3569736

Chládková, K., V. J. Podlipský & A. Chionidou. 2017. Perceptual adaptation of vowels generalizes across the phonology and does not require local context. Journal of Experimental Psychology: Human Perception and Performance 43(2). 414–427. doi:10.1037/xhp0000333

Chodroff, E. 2017. Structured variation in obstruent production and perception. Baltimore, MD: Johns Hopkins University dissertation.

Chodroff, E., M. Maciejewski, J. Trmal, S. Khudanpur & J. J. Godfrey. 2016. New release of Mixer-6: Improved validity for phonetic study of speaker variation and identification. In N. Calzolari, K. Choukri, T. Declerck, M. Grobelnik, B. Maegaard, J. Mariani, A. Moreno, J. Odijk & S. Piperidis (eds.), Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016), 1323–1327. Portorož, Slovenia: European Language Resources Association (ELRA).

Chodroff, E. & C. Wilson. 2014. Burst spectrum as a cue for the stop voicing contrast in American English. The Journal of the Acoustical Society of America 136(5). 2762–2772. doi:10.1121/1.4896470

Chodroff, E. & C. Wilson. 2017. Structure in talker-specific phonetic realization: Covariation of stop consonant VOT in American English. Journal of Phonetics 61. 30–47. doi:10.1016/j.wocn.2017.01.001

Clayards, M. A. 2018. Individual talker and token covariation in production of multiple cues to stop voicing. Phonetica 75(1). 1–23. doi:10.1159/000448809

Clayards, M. A., M. K. Tanenhaus, R. N. Aslin & R. A. Jacobs. 2008. Perception of speech reflects optimal use of probabilistic speech cues. Cognition 108(3). 804–809. doi:10.1016/j.cognition.2008.04.004

Clopper, C. G. & J. C. Paolillo. 2006. North American English vowels: A factor-analytic perspective. Literary and Linguistic Computing 21(4). 445–462. doi:10.1093/llc/fql039

DiCanio, C. T., H. Nam, J. D. Amith, R. C. García & D. H. Whalen. 2015. Vowel variability in elicited versus spontaneous speech: Evidence from Mixtec. Journal of Phonetics 48. 45–59. doi:10.1016/j.wocn.2014.10.003

Dmitrieva, O., F. Llanos, A. A. Shultz & A. L. Francis. 2015. Phonological status, not voice onset time, determines the acoustic realization of onset f0 as a secondary voicing cue in Spanish and English. Journal of Phonetics 49. 77–95. doi:10.1016/j.wocn.2014.12.005

Efron, B. 1987. Better bootstrap confidence intervals. Journal of the American Statistical Association 82(397). 171–185. doi:10.21236/ADA150798

Evans, J. W. 1996. Straightforward statistics for the behavioral sciences. Pacific Grove, CA: Brooks/Cole Publishing.

Flege, J. E. 1991. Age of learning affects the authenticity of voice-onset time (VOT) in stop consonants produced in a second language. The Journal of the Acoustical Society of America 89(1). 395–411. doi:10.1121/1.400473

Flemming, E. S. 2007. Stop place contrasts before liquids. In J. Trouvain & W. Barry (eds.), Proceedings of the 16th International Congress of Phonetic Sciences, 233–236. Saarbrücken, Germany: Saarland University.

Forrest, K., G. Weismer, P. Milenkovic & R. N. Dougall. 1988. Statistical analysis of word-initial voiceless obstruents: Preliminary data. The Journal of the Acoustical Society of America 84(1). 115–123. doi:10.1121/1.396977

Foulkes, P. & G. Docherty. 2006. The social life of phonetics and phonology. Journal of Phonetics 34. 409–438. doi:10.1016/j.wocn.2005.08.002

Foulkes, P., G. Docherty & D. Watt. 2001. On the emergence of structured phonological variation. University of Pennsylvania Working Papers in Linguistics 7(3). 67–84.

Fruehwald, J. 2013. The phonological influence on phonetic change. Philadelphia, PA: University of Pennsylvania dissertation.

Fruehwald, J. 2017. The role of phonology in phonetic change. Annual Review of Linguistics 3. 25–42. doi:10.1146/annurev-linguistics-011516-034101

Grosjean, F. & J. L. Miller. 1994. Going in and out of languages: An example of bilingual flexibility. Psychological Science 5(4). 201–207. doi:10.1111/j.1467-9280.1994.tb00501.x

Guy, G. R. & F. Hinskens. 2016. Linguistic coherence: Systems, repertoires and speech communities. Lingua 172–173. 1–9. doi:10.1016/j.lingua.2016.01.001

Haggard, M. P., S. Ambler & M. Callow. 1970. Pitch as a voicing cue. The Journal of the Acoustical Society of America 47(2, Part 2). 613–617. doi:10.1121/1.1911936

Hanson, H. M. & K. N. Stevens. 2003. Models of aspirated stops in English. In M. Solé, D. Recasens & J. Romero (eds.), Proceedings of the 15th International Congress of Phonetic Sciences, 783–786. Barcelona, Spain: Universitat Autònoma de Barcelona.

Harshman, R., P. Ladefoged & L. M. Goldstein. 1977. Factor analysis of tongue shapes. The Journal of the Acoustical Society of America 62(3). 693–707. doi:10.1121/1.381581

Johnson, K. 1997. Speech perception without speaker normalization: An exemplar model. In K. Johnson & J. W. Mullennix (eds.), Talker variability in speech processing, 145–165. San Diego: Academic Press.

Joos, M. 1948. Acoustic phonetics. Language 24(2). 5–136. doi:10.2307/522229

Keshet, J., M. Sonderegger & T. Knowles. 2014. AutoVOT: A tool for automatic measurement of voice onset time using discriminative structured prediction [Computer program]. Version 0.91, retrieved August 2016 from https://github.com/mlml/autovot/.

Kirby, J. P. & D. R. Ladd. 2015. Stop voicing and f0 perturbations: Evidence from French and Italian. In The Scottish Consortium for ICPhS 2015 (ed.), Proceedings of the 18th International Congress of Phonetic Sciences, Paper number 0740. Glasgow, UK: University of Glasgow.

Kirby, J. P. & D. R. Ladd. 2016. Effects of obstruent voicing on vowel F0: Evidence from “true voicing” languages. The Journal of the Acoustical Society of America 140(4). 2400–2411. doi:10.1121/1.4962445

Kleinschmidt, D. F. & T. F. Jaeger. 2015. Robust speech perception: Recognizing the familiar, generalizing to the similar, and adapting to the novel. Psychological Review 122(2). 148–203. doi:10.1037/a0038695

Koenig, L. L. 2000. Laryngeal factors in voiceless consonant production in men, women, and 5-year-olds. Journal of Speech, Language, and Hearing Research 43(5). 1211–1228. doi:10.1044/jslhr.4305.1211

Koenig, L. L., C. H. Shadle, J. L. Preston & C. R. Mooshammer. 2013. Toward improved spectral measures of /s/: Results from adolescents. Journal of Speech, Language, and Hearing Research 56(4). 1175–1189. doi:10.1044/1092-4388(2012/12-0038)

Kuhn, R., P. Nguyen, J.-C. Junqua, L. Goldwasser, N. Niedzielski, S. Fincke, N. Field & M. Contolini. 1998. Eigenvoices for speaker adaptation. In R. H. Mannell & J. Robert-Ribes (eds.), Proceedings of the 5th International Conference on Spoken Language Processing, 1774–1777. Sydney, Australia: Australian Speech Science and Technology Association, Incorporated (ASSTA). doi:10.21437/ICSLP.1998-740

Labov, W. 1966. The social stratification of English in New York City, 2nd edn. New York: Cambridge University Press.

Ladefoged, P. & D. E. Broadbent. 1957. Information conveyed by vowels. The Journal of the Acoustical Society of America 29(1). 98–104. doi:10.1121/1.1908694

Leinonen, T. 2008. Factor analysis of vowel pronunciation in Swedish dialects. International Journal of Humanities and Arts Computing 2(1–2). 189–204. doi:10.3366/E175385480900038X

Lindblom, B. 1967. Vowel duration and a model of lip-mandible coordination. Speech Transmission Laboratory – Quarterly Progress and Status Reports 8(4). 1–29.

Lisker, L. & A. S. Abramson. 1964. A cross-language study of voicing in initial stops: Acoustical measurements. Word 20(3). 384–422. doi:10.1080/00437956.1964.11659830

MacLeod, A. & C. Stoel-Gammon. 2005. Are bilinguals different? What VOT tells us about simultaneous bilinguals. Journal of Multilingual Communication Disorders 3(2). 118–127. doi:10.1080/14769670500066313

Maddieson, I. 1997. Phonetic universals. In J. Laver & W. J. Hardcastle (eds.), Handbook of phonetic sciences, 619–639. Oxford: Blackwells Publishers.

Maye, J., R. N. Aslin & M. K. Tanenhaus. 2008. The weckud wetch of the wast: Lexical adaptation to a novel accent. Cognitive Science 32(3). 543–562. doi:10.1080/03640210802035357

McMurray, B. & A. Jongman. 2011. What information is necessary for speech categorization? Harnessing variability in the speech signal by integrating cues computed relative to expectations. Psychological Review 118(2). 219–246. doi:10.1037/a0022325

Nearey, T. M. 1978. Phonetic feature system for vowels. Edmonton, Alberta: University of Alberta dissertation.

Nearey, T. M. 1989. Static, dynamic, and relational properties in vowel perception. The Journal of the Acoustical Society of America 85(5). 2088–2113. doi:10.1121/1.397861

Nearey, T. M. & P. F. Assmann. 2007. Probabilistic “sliding template” models for indirect vowel normalization. In M.-J. Solé, P. S. Beddor & M. Ohala (eds.), Experimental approaches to phonology, 246–269. New York: Oxford University Press. doi:10.1093/oso/9780199296675.003.0016

Newman, R. S., S. A. Clouse & J. L. Burnham. 2001. The perceptual consequences of within-talker variability in fricative production. The Journal of the Acoustical Society of America 109(3). 1181–1196. doi:10.1121/1.1348009

Nielsen, K. Y. 2011. Specificity and abstractness of VOT imitation. Journal of Phonetics 39. 132–142. doi:10.1016/j.wocn.2010.12.007

Nielsen, K. Y. & C. Wilson. 2008. A hierarchical Bayesian model of multi-level phonetic imitation. In N. Abner & J. Bishop (eds.), Proceedings of the 27th West Coast Conference on Formal Linguistics, 335–343. Los Angeles: Cascadilla Proceedings Project.

Ohde, R. N. 1984. Fundamental frequency as an acoustic correlate of stop consonant voicing. The Journal of the Acoustical Society of America 75(1). 224–230. doi:10.1121/1.390399

Pols, L. C. W., H. R. C. Tromp & R. Plomp. 1973. Frequency analysis of Dutch vowels from 50 male speakers. The Journal of the Acoustical Society of America 53(4). 1093–1101. doi:10.1121/1.1913429

Rose, P. 2010. The effect of correlation on strength of evidence estimates in forensic voice comparison: uni- and multivariate likelihood ratio-based discrimination with Australian English vowel acoustics. International Journal of Biometrics 2(4). 316–329. doi:10.1504/IJBM.2010.035447

Shultz, A. A., A. L. Francis & F. Llanos. 2012. Differential cue weighting in perception and production of consonant voicing. The Journal of the Acoustical Society of America 132(2). EL95. doi:10.1121/1.4736711

Smiljanić, R. & A. R. Bradlow. 2008. Stability of temporal contrasts across speaking styles in English and Croatian. Journal of Phonetics 36(1). 91–113. doi:10.1016/j.wocn.2007.02.002

Solé, M.-J. 2007. Controlled and mechanical properties in speech. In M.-J. Solé, P. S. Beddor & M. Ohala (eds.), Experimental approaches to phonology, 302–321. Oxford: Oxford University Press. doi:10.1093/oso/9780199296675.003.0018

Sonderegger, M., M. Bane & P. Graff. 2017. The medium-term dynamics of accents on reality television. Language 93(3). 598–640. doi:10.1353/lan.2017.0038

Theodore, R. M. & J. L. Miller. 2010. Characteristics of listener sensitivity to talker-specific phonetic detail. The Journal of the Acoustical Society of America 128(4). 2090–2099. doi:10.1121/1.4782541

Theodore, R. M., J. L. Miller & D. DeSteno. 2009. Individual talker differences in voice-onset-time: Contextual influences. The Journal of the Acoustical Society of America 125(6). 3974–3982. doi:10.1121/1.3106131

Titze, I. R. 2011. Vocal fold mass is not a useful quantity for describing F0 in vocalization. Journal of Speech and Hearing Research 54(2). 520–522. doi:10.1044/1092-4388(2010/09-0284)

Toivonen, I., L. Blumenfeld, A. Gormley, L. Hoiting, N. Ramlakhan & A. Stone. 2015. Vowel height and duration. In U. Steindl, T. Borer, H. Fang, A. Garcia Pardo, P. Guekguezian, B. Hsu, C. O’Hara & I. C. Ouyang (eds.), Proceedings of the 32nd West Coast Conference on Formal Linguistics, 64–71. Somerville, MA: Cascadilla Proceedings Project.

van Nierop, D. J. P. J., L. C. W. Pols & R. Plomp. 1973. Frequency analysis of Dutch vowels from 25 female speakers. Acustica 29(2). 110–118.

Weismer, G. 1980. Control of the voicing distinction for intervocalic stops and fricatives: some data and theoretical considerations. Journal of Phonetics 8. 427–438. doi:10.1016/S0095-4470(19)31498-6

Whalen, D. H. & A. G. Levitt. 1995. The universality of intrinsic f0 of vowels. Journal of Phonetics 23. 349–366. doi:10.1016/S0095-4470(95)80165-0

Yuan, J. & M. Y. Liberman. 2008. Speaker identification on the SCOTUS corpus. Proceedings of Acoustics ’08. 5687–5790. Paris: Société Française d’Acoustique (SFA).

Zlatin, M. A. 1974. Voicing contrast: Perceptual and productive voice onset time characteristics of adults. The Journal of the Acoustical Society of America 56(3). 981–994. doi:10.1121/1.1903359

Zue, V. W. 1976. Acoustic characteristics of stop consonants: A controlled study. Cambridge, MA: Massachusetts Institute of Technology dissertation.

Supplementary Material

The online version of this article offers supplementary material (DOI: https://doi.org/10.1515/lingvan-2017-0047).

Received: 2017-10-15

Accepted: 2018-07-05

Published Online: 2018-09-13
