Choice and pronunciation of words: Individual differences within a homogeneous group of speakers
Iris Hanique, Mirjam Ernestus and Lou Boves
Abstract
This paper investigates whether individual speakers forming a homogeneous group differ in their choice and pronunciation of words when engaged in casual conversation, and if so, how they differ. More specifically, it examines whether the Balanced Winnow classifier is able to distinguish between the twenty speakers of the Ernestus Corpus of Spontaneous Dutch, who all have the same social background. To examine differences in choice and pronunciation of words, instead of characteristics of the speech signal itself, classification was based on lexical and pronunciation features extracted from hand-made orthographic and automatically generated broad phonetic transcriptions. The lexical features consisted of words and two-word combinations. The pronunciation features represented pronunciation variations at the word and phone level that are typical for casual speech. The best classifier achieved a performance of 79.9% and was based on the lexical features and on the pronunciation features representing single phones and triphones. The speakers must thus differ from each other in these features. Inspection of the relevant features indicated that, among other things, the words relevant for classification generally do not contain much semantic content, and that speakers differ not only from each other in the use of these words but also in their pronunciation.
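As a rough illustration of the approach summarized in the abstract, the sketch below extracts lexical features (single words and two-word combinations) from a tokenized utterance and trains a binary Balanced Winnow classifier on them. It is a minimal sketch under stated assumptions, not the authors' implementation: treating "two-word combinations" as adjacent word pairs, the names `lexical_features` and `BalancedWinnow`, and the values of `alpha`, `beta`, and `threshold` are all illustrative choices, and the toy Dutch utterances are invented. A twenty-speaker setup would additionally need something like one such classifier per speaker, which is also an assumption here.

```python
from collections import Counter


def lexical_features(tokens):
    """Count single words and adjacent two-word combinations.

    Assumption: "two-word combinations" is read as adjacent word pairs.
    Features are keyed by tuples so words and pairs cannot collide.
    """
    unigrams = Counter((w,) for w in tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    return unigrams + bigrams


class BalancedWinnow:
    """Binary Balanced Winnow over sparse, non-negative feature counts."""

    def __init__(self, alpha=1.5, beta=0.5, threshold=0.0):
        self.alpha = alpha          # promotion factor (> 1), illustrative value
        self.beta = beta            # demotion factor (0 < beta < 1), illustrative value
        self.threshold = threshold  # decision threshold, illustrative value
        self.pos = {}               # positive weights, implicitly 1.0
        self.neg = {}               # negative weights, implicitly 1.0

    def score(self, feats):
        # Effective weight of a feature is (positive weight - negative weight).
        return sum(
            (self.pos.get(f, 1.0) - self.neg.get(f, 1.0)) * v
            for f, v in feats.items()
        )

    def predict(self, feats):
        return self.score(feats) > self.threshold

    def update(self, feats, label):
        """Mistake-driven multiplicative update; label is True or False."""
        if self.predict(feats) == label:
            return  # no change on a correct prediction
        # Missed positive: promote pos, demote neg. False alarm: the reverse.
        up, down = (self.alpha, self.beta) if label else (self.beta, self.alpha)
        for f in feats:
            self.pos[f] = self.pos.get(f, 1.0) * up
            self.neg[f] = self.neg.get(f, 1.0) * down


# Toy usage with invented utterances for a target speaker and another speaker.
target = lexical_features("ja ja dat is zo".split())
other = lexical_features("nou dat weet ik niet".split())

clf = BalancedWinnow()
for _ in range(5):
    clf.update(target, True)
    clf.update(other, False)
```

Because the updates are multiplicative and only touch features that are active in a misclassified example, the weights that end up far from their starting value point to the words and word pairs that separate the speakers, which is the kind of inspection of relevant features the abstract refers to.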
About the authors
Iris Hanique obtained her PhD from Radboud University Nijmegen, where she studied how native Dutch listeners produce and understand reduced words. Currently, she is a consultant at a Dutch company solving planning puzzles in the metals industry.
Mirjam Ernestus is a full professor of Psycholinguistics at Radboud University Nijmegen. She is a specialist in the pronunciation variation typical of casual speech: she documents this variation and studies how both native and non-native speakers of a language produce and understand this variation.
Lou Boves is a full professor of Language and Speech Technology. He has published widely on speech and speaker recognition, and on text classification. This paper is an attempt to apply methods for text classification to speaker recognition.
©2015 by De Gruyter Mouton
Articles in the same Issue
- Frontmatter
- The ‘Learner Corpus Research, Cognitive Linguistics and Second Language Acquisition’ nexus: a SWOT analysis
- A multifactorial approach to linguistic structure in L2 spoken and written registers
- The use of phrasal verbs by French-speaking EFL learners. A constructional and collostructional corpus-based approach
- Combining experimental data and corpus data: Intermediate French-speaking learners and the English present
- Conceptual tools for the description and acquisition of the German posture verb sitzen
- Choice and pronunciation of words: Individual differences within a homogeneous group of speakers
- Generating data as a proxy for unavailable corpus data: the contextualized sentence completion task