Individual Talker and Token Covariation in the Production of Multiple Cues to Stop Voicing
Meghan Clayards
Abstract
Background/Aims: Previous research has found that individual talkers differ consistently in how they produce segments, and that these differences affect how others perceive their speech. Speakers also produce multiple acoustic-phonetic cues to phonological contrasts. Less is known about how multiple cues covary within a phonetic category and across talkers. We examined differences between individual talkers across cues and, by examining within-category correlations, whether token-by-token variability results from intrinsic factors or from speaking style. Methods: We examined correlations among 3 cues to word-initial labial stop voicing in English: voice onset time (VOT), talker-relative onset fundamental frequency (f0), and talker-relative following vowel duration. Results: VOT for /b/ and /p/ productions and onset f0 for /b/ productions varied significantly by talker. Token-by-token within-category variation was largely limited to speaking rate effects. VOT and f0 were negatively correlated within category for /b/ productions after controlling for speaking rate and talker mean f0, but in the direction opposite to that expected for an intrinsic effect. Within-category talker means were correlated across VOT and vowel duration for /p/ productions. Some talkers produced more prototypical values than others, indicating systematic talker differences. Conclusion: Relationships between cues are mediated more by categories and talkers than by intrinsic physiological relationships. Talker differences reflect systematic speaking style differences.
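The within-category analysis described in the Methods can be illustrated with a minimal sketch. It is not the study's own statistical procedure: it swaps in a simple residual-based partial correlation, uses synthetic data, and treats vowel duration as a stand-in for speaking rate. The function names (`talker_relative`, `residualize`, `partial_corr`) and all numeric values are hypothetical and chosen only for illustration.

```python
import numpy as np

def talker_relative(values, talkers):
    """Subtract each talker's own mean from that talker's tokens."""
    values = np.asarray(values, dtype=float)
    talkers = np.asarray(talkers)
    out = np.empty_like(values)
    for t in np.unique(talkers):
        mask = talkers == t
        out[mask] = values[mask] - values[mask].mean()
    return out

def residualize(y, covariate):
    """Residuals of y after a least-squares fit on an intercept plus one covariate."""
    y = np.asarray(y, dtype=float)
    covariate = np.asarray(covariate, dtype=float)
    X = np.column_stack([np.ones_like(covariate), covariate])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta

def partial_corr(x, y, covariate):
    """Correlation of x and y after removing the linear effect of the covariate from both."""
    rx = residualize(x, covariate)
    ry = residualize(y, covariate)
    return float(np.corrcoef(rx, ry)[0, 1])

# Synthetic tokens from one category (e.g. /b/): 4 talkers, 50 tokens each.
# All values below are invented for illustration only.
rng = np.random.default_rng(0)
talkers = np.repeat(np.arange(4), 50)
talker_mean_f0 = np.array([110.0, 130.0, 200.0, 220.0])[talkers]  # Hz
vowel_dur = rng.normal(150.0, 25.0, size=talkers.size)            # ms, rate proxy (assumption)
vot = 10.0 + 0.03 * vowel_dur + rng.normal(0.0, 3.0, size=talkers.size)  # ms
f0 = talker_mean_f0 + rng.normal(0.0, 8.0, size=talkers.size)            # Hz

# Talker-relative f0 removes differences in talker mean f0.
rel_f0 = talker_relative(f0, talkers)

# Within-category VOT/f0 correlation after controlling for the rate proxy.
r = partial_corr(vot, rel_f0, vowel_dur)
print(f"partial correlation (VOT, talker-relative f0 | vowel duration): {r:.3f}")
```

Centering f0 within talker before correlating keeps between-talker pitch differences from masquerading as token-level covariation, and residualizing on the rate proxy does the same for speaking rate. A model with talker-level random effects would be the more complete treatment; this sketch only shows the logic of the control.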
© 2017 S. Karger AG, Basel