Abstract
The present study investigates patterns of covariation among acoustic properties of stop consonants in a large multi-talker corpus of American English connected speech. Relations among talker means for different stops on the same dimension (between-category covariation) were considerably stronger than those for different dimensions of the same stop (within-category covariation). The existence of between-category covariation supports a uniformity principle that restricts the mapping from phonological features to phonetic targets in the sound system of each speaker. This principle was formalized with factor analysis, in which observed covariation derives from a lower-dimensional space of talker variation. Knowledge of between-category phonetic covariation could facilitate perceptual adaptation to novel talkers by providing a rational basis for generalizing idiosyncratic properties to several sounds on the basis of limited exposure.
Appendix
The analyses in Section 2 involved correlations of talker means; however, many previous studies have also examined correlations across individual tokens (e.g. Dmitrieva et al. 2015; Kirby and Ladd 2015, 2016; Clayards 2018). For comparison with these studies, token-by-token correlations between phonetic cues were calculated for each stop category, both across and within talkers. Only stops with non-outlier values for both cues were retained. This left 71,852 stops for the COG-VOT analysis, 57,737 stops for the COG-f0 analysis, and 74,916 stops for the VOT-f0 analysis.
The first correlation analysis, reported in Table A1, was conducted across all tokens (see also Dmitrieva et al. 2015; Clayards 2018). These correlations largely resembled the correlations of talker means in magnitude (especially between COG and VOT for [b], [d], and [g]); although many of them reached significance, they were nevertheless quite weak.
In the second analysis, correlations were computed separately for each talker and limited to talkers with more than 20 tokens per stop category. The median number of talkers excluded from each analysis was four, and the maximum was 55 (between COG and f0 for [tʰ]). Table A2 presents the median token-by-token correlation for each cue pair and stop consonant, as well as the range across talkers. Consistent with the findings of Kirby and Ladd (2016) for French and Italian intervocalic stops, the magnitude and direction of the by-speaker correlations varied substantially across talkers. Together, these findings indicate that, while there may exist weak relationships across talker means, the token-by-token relationships within talker-specific productions are highly variable.
Table A1: Token-by-token correlations for each cue pair and stop category aggregated over all talkers.
| Stop | COG-VOT | COG-f0 (female) | COG-f0 (male) | VOT-f0 (female) | VOT-f0 (male) |
|---|---|---|---|---|---|
| pʰ | 0.18* | −0.01 | 0.09* | −0.05* | −0.02 |
| b | 0.34* | 0.00 | −0.05* | −0.03 | −0.01 |
| tʰ | 0.09* | 0.07* | 0.12* | −0.11* | −0.06* |
| d | 0.57* | −0.11* | −0.06* | −0.06* | −0.01 |
| kʰ | 0.17* | 0.10* | 0.09* | −0.15* | −0.13* |
| g | 0.52* | 0.03 | −0.01 | −0.03 | 0.01 |
An asterisk reflects p < 0.001.
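The pooled analysis summarized in Table A1 can be sketched roughly as follows. This is an illustration only, not the authors' code: the token record layout (`stop`, `cog`, `vot`), the function names, and the toy values are all assumptions, and outlier exclusion is reduced here to dropping tokens with a missing cue value. Significance testing is omitted.

```python
from collections import defaultdict
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation of two equal-length sequences.

    Assumes at least two points and nonzero variance in both sequences.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def pooled_correlations(tokens, cue_a, cue_b):
    """Correlate two cues across all tokens of each stop category,
    ignoring which talker produced each token."""
    by_stop = defaultdict(list)
    for tok in tokens:
        a, b = tok.get(cue_a), tok.get(cue_b)
        if a is not None and b is not None:  # keep tokens with both cues
            by_stop[tok["stop"]].append((a, b))
    return {
        stop: pearson_r([a for a, _ in pairs], [b for _, b in pairs])
        for stop, pairs in by_stop.items()
        if len(pairs) >= 2
    }


# Hypothetical toy tokens (values are made up for illustration).
toy_tokens = [
    {"stop": "d", "cog": 1.0, "vot": 2.0},
    {"stop": "d", "cog": 2.0, "vot": 4.0},
    {"stop": "d", "cog": 3.0, "vot": 6.0},
    {"stop": "d", "cog": 4.0, "vot": None},  # dropped: missing cue
    {"stop": "b", "cog": 1.0, "vot": 3.0},
    {"stop": "b", "cog": 2.0, "vot": 2.0},
    {"stop": "b", "cog": 3.0, "vot": 1.0},
]
print(pooled_correlations(toy_tokens, "cog", "vot"))
```

Because talker identity is ignored, a pooled correlation of this kind mixes between-talker and within-talker variation, which is one reason to also inspect the talker-specific correlations.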
Table A2: For each stop category and cue pair separately, the median talker-specific token-by-token correlation (Median columns) and range of talker-specific token-by-token correlations (Range columns).
| Stop | COG-VOT median | COG-VOT range | COG-f0 median | COG-f0 range | VOT-f0 median | VOT-f0 range |
|---|---|---|---|---|---|---|
| pʰ | 0.17 | −0.41 to 0.62 | 0.09 | −0.41 to 0.44 | −0.04 | −0.47 to 0.54 |
| b | 0.33 | −0.14 to 0.77 | −0.01 | −0.53 to 0.64 | 0.00 | −0.37 to 0.41 |
| tʰ | 0.06 | −0.46 to 0.59 | 0.10 | −0.34 to 0.51 | −0.16 | −0.59 to 0.41 |
| d | 0.54 | −0.18 to 0.75 | −0.06 | −0.50 to 0.45 | −0.03 | −0.35 to 0.42 |
| kʰ | 0.16 | −0.31 to 0.57 | 0.10 | −0.34 to 0.52 | −0.20 | −0.61 to 0.39 |
| g | 0.54 | −0.04 to 0.79 | 0.01 | −0.51 to 0.39 | 0.00 | −0.38 to 0.33 |
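The talker-specific summary in Table A2 could be computed along the following lines. This is a minimal sketch under stated assumptions rather than the authors' procedure: the record fields (`stop`, `talker`, `cog`, `vot`), the function names, and the `min_tokens` parameter are hypothetical, and the only detail taken from the text is that a talker is included for a stop category only with more than 20 tokens.

```python
from collections import defaultdict
from math import sqrt
from statistics import median


def pearson_r(pairs):
    """Pearson correlation of a list of (x, y) pairs.

    Assumes at least two pairs and nonzero variance in both variables.
    """
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    return sxy / sqrt(sxx * syy)


def talker_summary(tokens, cue_a, cue_b, min_tokens=20):
    """Median and range of talker-specific correlations per stop category.

    A talker contributes to a category only if they have MORE than
    `min_tokens` tokens with both cues available.
    """
    grouped = defaultdict(list)  # (stop, talker) -> [(a, b), ...]
    for tok in tokens:
        a, b = tok.get(cue_a), tok.get(cue_b)
        if a is not None and b is not None:
            grouped[(tok["stop"], tok["talker"])].append((a, b))

    per_stop = defaultdict(list)  # stop -> one r per included talker
    for (stop, _talker), pairs in grouped.items():
        if len(pairs) > min_tokens:
            per_stop[stop].append(pearson_r(pairs))

    return {
        stop: {
            "median": median(rs),
            "min": min(rs),
            "max": max(rs),
            "n_talkers": len(rs),
        }
        for stop, rs in per_stop.items()
    }
```

Calling `talker_summary(tokens, "cog", "vot")` on a list of token records returns one entry per stop category; talkers below the threshold are silently skipped, so `n_talkers` shows how many talkers each median rests on.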
References
Assmann, P. F., T. M. Nearey & S. Bharadwaj. 2008. Analysis of a vowel database. Canadian Acoustics 36(3). 148–149.
Boersma, P. & D. Weenink. 2016. Praat: Doing phonetics by computer [Computer program]. Version 6.0.19, retrieved from http://www.praat.org/.
Brandschain, L., D. Graff, C. Cieri, K. Walker & C. Caruso. 2010. The Mixer 6 corpus: Resources for cross-channel and text independent speaker recognition. In N. Calzolari, K. Choukri, B. Maegaard, J. Mariani, J. Odijk, S. Piperidis, M. Rosner & D. Tapias (eds.), Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC 2010), 2441–2444. Malta: European Language Resources Association (ELRA).
Brandschain, L., D. Graff & K. Walker. 2013. Mixer 6 Speech LDC2013S03. Hard Drive. Philadelphia: Linguistic Data Consortium.
Chang, C. B., Y. Yao, E. F. Haynes & R. Rhodes. 2011. Production of phonetic and phonological contrast by heritage speakers of Mandarin. The Journal of the Acoustical Society of America 129(6). 3964–3980. https://doi.org/10.1121/1.3569736
Chládková, K., V. J. Podlipský & A. Chionidou. 2017. Perceptual adaptation of vowels generalizes across the phonology and does not require local context. Journal of Experimental Psychology: Human Perception and Performance 43(2). 414–427. https://doi.org/10.1037/xhp0000333
Chodroff, E. 2017. Structured variation in obstruent production and perception. Baltimore, MD: Johns Hopkins University dissertation.
Chodroff, E., M. Maciejewski, J. Trmal, S. Khudanpur & J. J. Godfrey. 2016. New release of Mixer-6: Improved validity for phonetic study of speaker variation and identification. In N. Calzolari, K. Choukri, T. Declerck, M. Grobelnik, B. Maegaard, J. Mariani, A. Moreno, J. Odijk & S. Piperidis (eds.), Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016), 1323–1327. Portorož, Slovenia: European Language Resources Association (ELRA).
Chodroff, E. & C. Wilson. 2014. Burst spectrum as a cue for the stop voicing contrast in American English. The Journal of the Acoustical Society of America 136(5). 2762–2772. https://doi.org/10.1121/1.4896470
Chodroff, E. & C. Wilson. 2017. Structure in talker-specific phonetic realization: Covariation of stop consonant VOT in American English. Journal of Phonetics 61. 30–47. https://doi.org/10.1016/j.wocn.2017.01.001
Clayards, M. A. 2018. Individual talker and token covariation in production of multiple cues to stop voicing. Phonetica 75(1). 1–23. https://doi.org/10.1159/000448809
Clayards, M. A., M. K. Tanenhaus, R. N. Aslin & R. A. Jacobs. 2008. Perception of speech reflects optimal use of probabilistic speech cues. Cognition 108(3). 804–809. https://doi.org/10.1016/j.cognition.2008.04.004
Clopper, C. G. & J. C. Paolillo. 2006. North American English vowels: A factor-analytic perspective. Literary and Linguistic Computing 21(4). 445–462. https://doi.org/10.1093/llc/fql039
DiCanio, C. T., H. Nam, J. D. Amith, R. C. García & D. H. Whalen. 2015. Vowel variability in elicited versus spontaneous speech: Evidence from Mixtec. Journal of Phonetics 48. 45–59. https://doi.org/10.1016/j.wocn.2014.10.003
Dmitrieva, O., F. Llanos, A. A. Shultz & A. L. Francis. 2015. Phonological status, not voice onset time, determines the acoustic realization of onset f0 as a secondary voicing cue in Spanish and English. Journal of Phonetics 49. 77–95. https://doi.org/10.1016/j.wocn.2014.12.005
Efron, B. 1987. Better bootstrap confidence intervals. Journal of the American Statistical Association 82(397). 171–185. https://doi.org/10.21236/ADA150798
Evans, J. W. 1996. Straightforward statistics for the behavioral sciences. Pacific Grove, CA: Brooks/Cole Publishing.
Flege, J. E. 1991. Age of learning affects the authenticity of voice-onset time (VOT) in stop consonants produced in a second language. The Journal of the Acoustical Society of America 89(1). 395–411. https://doi.org/10.1121/1.400473
Flemming, E. S. 2007. Stop place contrasts before liquids. In J. Trouvain & W. Barry (eds.), Proceedings of the 16th International Congress of Phonetic Sciences, 233–236. Saarbrücken, Germany: Saarland University.
Forrest, K., G. Weismer, P. Milenkovic & R. N. Dougall. 1988. Statistical analysis of word-initial voiceless obstruents: Preliminary data. The Journal of the Acoustical Society of America 84(1). 115–123. https://doi.org/10.1121/1.396977
Foulkes, P. & G. Docherty. 2006. The social life of phonetics and phonology. Journal of Phonetics 34. 409–438. https://doi.org/10.1016/j.wocn.2005.08.002
Foulkes, P., G. Docherty & D. Watt. 2001. On the emergence of structured phonological variation. University of Pennsylvania Working Papers in Linguistics 7(3). 67–84.
Fruehwald, J. 2013. The phonological influence on phonetic change. Philadelphia, PA: University of Pennsylvania dissertation.
Fruehwald, J. 2017. The role of phonology in phonetic change. Annual Review of Linguistics 3. 25–42. https://doi.org/10.1146/annurev-linguistics-011516-034101
Grosjean, F. & J. L. Miller. 1994. Going in and out of languages: An example of bilingual flexibility. Psychological Science 5(4). 201–207. https://doi.org/10.1111/j.1467-9280.1994.tb00501.x
Guy, G. R. & F. Hinskens. 2016. Linguistic coherence: Systems, repertoires and speech communities. Lingua 172–173. 1–9. https://doi.org/10.1016/j.lingua.2016.01.001
Haggard, M. P., S. Ambler & M. Callow. 1970. Pitch as a voicing cue. The Journal of the Acoustical Society of America 47(2, Part 2). 613–617. https://doi.org/10.1121/1.1911936
Hanson, H. M. & K. N. Stevens. 2003. Models of aspirated stops in English. In M. Solé, D. Recasens & J. Romero (eds.), Proceedings of the 15th International Congress of Phonetic Sciences, 783–786. Barcelona, Spain: Universitat Autònoma de Barcelona.
Harshman, R., P. Ladefoged & L. M. Goldstein. 1977. Factor analysis of tongue shapes. The Journal of the Acoustical Society of America 62(3). 693–707. https://doi.org/10.1121/1.381581
Johnson, K. 1997. Speech perception without speaker normalization: An exemplar model. In K. Johnson & J. W. Mullennix (eds.), Talker variability in speech processing, 145–165. San Diego: Academic Press.
Joos, M. 1948. Acoustic phonetics. Language 24(2). 5–136. https://doi.org/10.2307/522229
Keshet, J., M. Sonderegger & T. Knowles. 2014. AutoVOT: A tool for automatic measurement of voice onset time using discriminative structured prediction [Computer program]. Version 0.91, retrieved August 2016 from https://github.com/mlml/autovot/.
Kirby, J. P. & D. R. Ladd. 2015. Stop voicing and f0 perturbations: Evidence from French and Italian. In The Scottish Consortium for ICPhS 2015 (ed.), Proceedings of the 18th International Congress of Phonetic Sciences, Paper number 0740. Glasgow, UK: University of Glasgow.
Kirby, J. P. & D. R. Ladd. 2016. Effects of obstruent voicing on vowel F0: Evidence from “true voicing” languages. The Journal of the Acoustical Society of America 140(4). 2400–2411. https://doi.org/10.1121/1.4962445
Kleinschmidt, D. F. & T. F. Jaeger. 2015. Robust speech perception: Recognizing the familiar, generalizing to the similar, and adapting to the novel. Psychological Review 122(2). 148–203. https://doi.org/10.1037/a0038695
Koenig, L. L. 2000. Laryngeal factors in voiceless consonant production in men, women, and 5-year-olds. Journal of Speech, Language, and Hearing Research 43(5). 1211–1228. https://doi.org/10.1044/jslhr.4305.1211
Koenig, L. L., C. H. Shadle, J. L. Preston & C. R. Mooshammer. 2013. Toward improved spectral measures of /s/: Results from adolescents. Journal of Speech, Language, and Hearing Research 56(4). 1175–1189. https://doi.org/10.1044/1092-4388(2012/12-0038)
Kuhn, R., P. Nguyen, J.-C. Junqua, L. Goldwasser, N. Niedzielski, S. Fincke, N. Field & M. Contolini. 1998. Eigenvoices for speaker adaptation. In R. H. Mannell & J. Robert-Ribes (eds.), Proceedings of the 5th International Conference on Spoken Language Processing, 1774–1777. Sydney, Australia: Australian Speech Science and Technology Association, Incorporated (ASSTA). https://doi.org/10.21437/ICSLP.1998-740
Labov, W. 1966. The social stratification of English in New York City, 2nd edn. New York: Cambridge University Press.
Ladefoged, P. & D. E. Broadbent. 1957. Information conveyed by vowels. The Journal of the Acoustical Society of America 29(1). 98–104. https://doi.org/10.1121/1.1908694
Leinonen, T. 2008. Factor analysis of vowel pronunciation in Swedish dialects. International Journal of Humanities and Arts Computing 2(1–2). 189–204. https://doi.org/10.3366/E175385480900038X
Lindblom, B. 1967. Vowel duration and a model of lip-mandible coordination. Speech Transmission Laboratory – Quarterly Progress and Status Reports 8(4). 1–29.
Lisker, L. & A. S. Abramson. 1964. A cross-language study of voicing in initial stops: Acoustical measurements. Word 20(3). 384–422. https://doi.org/10.1080/00437956.1964.11659830
MacLeod, A. & C. Stoel-Gammon. 2005. Are bilinguals different? What VOT tells us about simultaneous bilinguals. Journal of Multilingual Communication Disorders 3(2). 118–127. https://doi.org/10.1080/14769670500066313
Maddieson, I. 1997. Phonetic universals. In J. Laver & W. J. Hardcastle (eds.), Handbook of phonetic sciences, 619–639. Oxford: Blackwell Publishers.
Maye, J., R. N. Aslin & M. K. Tanenhaus. 2008. The weckud wetch of the wast: Lexical adaptation to a novel accent. Cognitive Science 32(3). 543–562. https://doi.org/10.1080/03640210802035357
McMurray, B. & A. Jongman. 2011. What information is necessary for speech categorization? Harnessing variability in the speech signal by integrating cues computed relative to expectations. Psychological Review 118(2). 219–246. https://doi.org/10.1037/a0022325
Nearey, T. M. 1978. Phonetic feature system for vowels. Edmonton, Alberta: University of Alberta dissertation.
Nearey, T. M. 1989. Static, dynamic, and relational properties in vowel perception. The Journal of the Acoustical Society of America 85(5). 2088–2113. https://doi.org/10.1121/1.397861
Nearey, T. M. & P. F. Assmann. 2007. Probabilistic “sliding template” models for indirect vowel normalization. In M.-J. Solé, P. S. Beddor & M. Ohala (eds.), Experimental approaches to phonology, 246–269. New York: Oxford University Press. https://doi.org/10.1093/oso/9780199296675.003.0016
Newman, R. S., S. A. Clouse & J. L. Burnham. 2001. The perceptual consequences of within-talker variability in fricative production. The Journal of the Acoustical Society of America 109(3). 1181–1196. https://doi.org/10.1121/1.1348009
Nielsen, K. Y. 2011. Specificity and abstractness of VOT imitation. Journal of Phonetics 39. 132–142. https://doi.org/10.1016/j.wocn.2010.12.007
Nielsen, K. Y. & C. Wilson. 2008. A hierarchical Bayesian model of multi-level phonetic imitation. In N. Abner & J. Bishop (eds.), Proceedings of the 27th West Coast Conference on Formal Linguistics, 335–343. Los Angeles: Cascadilla Proceedings Project.
Ohde, R. N. 1984. Fundamental frequency as an acoustic correlate of stop consonant voicing. The Journal of the Acoustical Society of America 75(1). 224–230. https://doi.org/10.1121/1.390399
Pols, L. C. W., H. R. C. Tromp & R. Plomp. 1973. Frequency analysis of Dutch vowels from 50 male speakers. The Journal of the Acoustical Society of America 53(4). 1093–1101. https://doi.org/10.1121/1.1913429
Rose, P. 2010. The effect of correlation on strength of evidence estimates in forensic voice comparison: uni- and multivariate likelihood ratio-based discrimination with Australian English vowel acoustics. International Journal of Biometrics 2(4). 316–329. https://doi.org/10.1504/IJBM.2010.035447
Shultz, A. A., A. L. Francis & F. Llanos. 2012. Differential cue weighting in perception and production of consonant voicing. The Journal of the Acoustical Society of America 132(2). EL95. https://doi.org/10.1121/1.4736711
Smiljanić, R. & A. R. Bradlow. 2008. Stability of temporal contrasts across speaking styles in English and Croatian. Journal of Phonetics 36(1). 91–113. https://doi.org/10.1016/j.wocn.2007.02.002
Solé, M.-J. 2007. Controlled and mechanical properties in speech. In M.-J. Solé, P. S. Beddor & M. Ohala (eds.), Experimental approaches to phonology, 302–321. Oxford: Oxford University Press. https://doi.org/10.1093/oso/9780199296675.003.0018
Sonderegger, M., M. Bane & P. Graff. 2017. The medium-term dynamics of accents on reality television. Language 93(3). 598–640. https://doi.org/10.1353/lan.2017.0038
Theodore, R. M. & J. L. Miller. 2010. Characteristics of listener sensitivity to talker-specific phonetic detail. The Journal of the Acoustical Society of America 128(4). 2090–2099. https://doi.org/10.1121/1.4782541
Theodore, R. M., J. L. Miller & D. DeSteno. 2009. Individual talker differences in voice-onset-time: Contextual influences. The Journal of the Acoustical Society of America 125(6). 3974–3982. https://doi.org/10.1121/1.3106131
Titze, I. R. 2011. Vocal fold mass is not a useful quantity for describing F0 in vocalization. Journal of Speech and Hearing Research 54(2). 520–522. https://doi.org/10.1044/1092-4388(2010/09-0284)
Toivonen, I., L. Blumenfeld, A. Gormley, L. Hoiting, N. Ramlakhan & A. Stone. 2015. Vowel height and duration. In U. Steindl, T. Borer, H. Fang, A. Garcia Pardo, P. Guekguezian, B. Hsu, C. O’Hara & I. C. Ouyang (eds.), Proceedings of the 32nd West Coast Conference on Formal Linguistics, 64–71. Somerville, MA: Cascadilla Proceedings Project.
van Nierop, D. J. P. J., L. C. W. Pols & R. Plomp. 1973. Frequency analysis of Dutch vowels from 25 female speakers. Acustica 29(2). 110–118.
Weismer, G. 1980. Control of the voicing distinction for intervocalic stops and fricatives: some data and theoretical considerations. Journal of Phonetics 8. 427–438. https://doi.org/10.1016/S0095-4470(19)31498-6
Whalen, D. H. & A. G. Levitt. 1995. The universality of intrinsic f0 of vowels. Journal of Phonetics 23. 349–366. https://doi.org/10.1016/S0095-4470(95)80165-0
Yuan, J. & M. Y. Liberman. 2008. Speaker identification on the SCOTUS corpus. Proceedings of Acoustics ’08. 5687–5790. Paris: Société Française d’Acoustique (SFA).
Zlatin, M. A. 1974. Voicing contrast: Perceptual and productive voice onset time characteristics of adults. The Journal of the Acoustical Society of America 56(3). 981–994. https://doi.org/10.1121/1.1903359
Zue, V. W. 1976. Acoustic characteristics of stop consonants: A controlled study. Cambridge, MA: Massachusetts Institute of Technology dissertation.
Supplementary Material
The online version of this article offers supplementary material (DOI: https://doi.org/10.1515/lingvan-2017-0047).
©2018 Walter de Gruyter GmbH, Berlin/Boston
Articles in the same Issue
- Predictability and phonology: past, present and future
- Predictability and perception for native and non-native listeners
- Mergers in Bardi: contextual probability and predictors of sound change
- Predictability of stop consonant phonetics across talkers: Between-category and within-category dependencies among cues for place and voice
- Assessing predictability effects in connected read speech
- The interdependence of frequency, predictability, and informativity in the segmental domain
- Loci and locality of informational effects on phonetic implementation
- Three steps forward for predictability. Consideration of methodological robustness, indexical and prosodic factors, and replication in the laboratory
- Distributional learning is error-driven: the role of surprise in the acquisition of phonetic categories
- Truncation in message-oriented phonology: a case study using Korean vocative truncation
- Durational contrast in gemination and informativity
- Practice makes perfect: the consequences of lexical proficiency for articulation
- Patterns of probabilistic segment deletion/reduction in English and Japanese
- The role of predictability in shaping phonological patterns