Automatic suprasegmental parameter extraction in learner corpora
Emmanuel Ferragne
Abstract
In this chapter, we attempt to automatically compute suprasegmental – and, in particular, rhythmic – parameters that could be used to distinguish a group of French learners of English from a group of native speakers. As a preliminary step, an automatic segmentation algorithm is benchmarked against manual segmentation. This assessment leads us to reject the classic duration-based rhythm metrics and adopt alternative measurements involving pitch and intensity. Finally, we use an automatic classifier to check to what extent our metrics predict a reliable boundary between learner and native speech.
Chapters in this book
- Prelim pages i
- Table of contents v
Section 1. Introduction
- Introduction 3
- Learner corpora 9
Section 2. Compilation, annotation and exchangeability of learner corpus data
- Developing corpus interoperability for phonetic investigation of learner corpora 33
- Learner corpora and second language acquisition 65
- Competing target hypotheses in the Falko corpus 101
Section 3. Automatic approaches to the identification of learner language features in learner corpus data
- Using learner corpora for automatic error detection and correction 127
- Automatic suprasegmental parameter extraction in learner corpora 151
- Criterial feature extraction using parallel learner corpora and machine learning 169
Section 4. Analysis of learner corpus data
- Phonological acquisition in the French-English interlanguage 207
- Prosody in a contrastive learner corpus 227
- A corpus-based comparison of syntactic complexity in NNS and NS university students’ writing 249
- Analysing coherence in upper-intermediate learner writing 265
- Statistical tests for the analysis of learner corpus data 287
- Index 311