11 Automatically detecting directives with SPICE Ireland

  • Gerold Schneider

Abstract

The pragmatically annotated corpus of spoken Irish English, SPICE Ireland, makes it possible to explore and analyze directives in English or, as in our contribution, to train systems to detect them automatically. In this study, we evaluate the automatic classification and compare directives in Irish English with directives in British English, using lexical signals in the data sets. To do so, we apply and evaluate two machine learning approaches: document classification with logistic regression, and deep learning with fastText. Both approaches reach similar, satisfactory performance on the task of classifying previously unseen sentences as directive or non-directive: up to 90.5% accuracy and a Kappa of up to 74.2%. The reported features deliver a large inventory of indicators for speech acts, such as "please" indicating imperatives or "what" indicating interrogatives, but also more fine-grained indicators, such as "wait" and "you know". The results suggest that the Irish English data contains significantly more directives than the British English data, except in formal contexts, but this finding may be affected by the strong bias of our automatic classification. Our error analysis shows that implicit directives are missed more often, indicating that contextual, social, situational or prosodic knowledge is vital for a small number of instances. Nevertheless, our evaluations indicate that classification performance is similar on Irish and British data.
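
The two evaluation measures named in the abstract, accuracy and Kappa, can be illustrated with a minimal sketch. It assumes that Kappa refers to Cohen's kappa computed between gold labels and predicted labels, which the abstract does not state explicitly. The function names and the toy labels are invented for illustration; this is not the chapter's evaluation code or data.

```python
from collections import Counter


def accuracy(gold, pred):
    """Share of items where the predicted label equals the gold label."""
    if len(gold) != len(pred) or not gold:
        raise ValueError("gold and pred must be non-empty and of equal length")
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def cohen_kappa(gold, pred):
    """Cohen's kappa: agreement corrected for chance agreement.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement
    (here identical to accuracy) and p_e is the agreement expected if both
    label sequences were independent with their observed label frequencies.
    """
    n = len(gold)
    p_o = accuracy(gold, pred)
    gold_counts = Counter(gold)
    pred_counts = Counter(pred)
    labels = set(gold_counts) | set(pred_counts)
    p_e = sum(gold_counts[label] * pred_counts[label] for label in labels) / (n * n)
    if p_e == 1.0:
        # Both sequences use a single identical label: kappa is undefined.
        return float("nan")
    return (p_o - p_e) / (1 - p_e)


# Hypothetical toy labels (not from SPICE Ireland): 2 directives, 8 non-directives.
gold = ["dir", "dir"] + ["non"] * 8
# A classifier biased towards the majority class misses one directive.
pred = ["dir", "non"] + ["non"] * 8

acc = accuracy(gold, pred)
kappa = cohen_kappa(gold, pred)
print(f"accuracy = {acc:.3f}, kappa = {kappa:.3f}")
```

On skewed data such as this toy example, where most sentences are non-directive, kappa comes out clearly lower than accuracy because a classifier that favours the majority class already agrees with the gold labels often by chance. This is one reason to report both measures side by side.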

Downloaded on 4.2.2026 from https://www.degruyterbrill.com/document/doi/10.1515/9783110791457-011/html