Preliminaries to Finnish Word Prediction

P. Väyrynen; K. Noponen; T. Seppänen

doi:10.1515/glot-2008-0009

Abstract

Commercial word prediction software has thus far been available mainly for uninflected languages such as English. In the present study, we investigate word prediction in highly inflected languages, using Finnish as an example. Despite the language's high degree of case inflection, about one third of the word tokens in a Finnish text appear in their uninflected base form. As a result, simple prediction techniques such as word completion, originally developed for English, can be used to investigate the characteristics of word prediction in inflected languages. Our preliminary results show that roughly 45% of characters can be saved in Finnish word prediction overall, counting both uninflected and inflected tokens. The most interesting result of our prediction experiments, however, is the distribution of character savings across the most common cases and their cumulative effect on the total percentage of character savings that may be achievable in Finnish word prediction. The major conclusions of the study are that word prediction in a highly inflected language such as Finnish is feasible, provided that (i) the case form of a word in a given context of use can be predicted correctly, at least in some cases, and (ii) the cognitive load that the prediction system places on the user is not too high when prediction of the case form fails.
