
A multilingual gold standard for translation spotting of German compounds and their corresponding multiword units in English, French, Italian and Spanish

  • Simon Clematide, Stéphanie Lehner, Johannes Graën and Martin Volk
Publisher: John Benjamins Publishing Company

Abstract

This article describes a new word alignment gold standard for German nominal compounds and their multiword translation equivalents in English, French, Italian, and Spanish. The gold standard contains alignments for each of the ten language pairs, resulting in a total of 8,229 bidirectional alignments. It covers 362 occurrences of 137 different German compounds randomly selected from the corpus of European Parliament plenary sessions, sampled according to the criteria of frequency and morphological complexity. The standard serves for the evaluation and optimisation of automatic word alignments in the context of spotting translations of German compounds. The study also shows that in this text genre, around 80% of German noun types are morphological compounds, indicating potential multiword units in their parallel equivalents.

Downloaded on 8 September 2025 from https://www.degruyterbrill.com/document/doi/10.1075/cilt.341.06cle/html