
Aligning verb + noun collocations to improve a French-Romanian FSMT system

  • Amalia Todiraşcu and Mirabela Navlea
Published by John Benjamins Publishing Company

Abstract

We present several methods for integrating Verb + Noun collocations using linguistic information, aiming to improve the results of a French-Romanian factored statistical machine translation (FSMT) system. The system uses lemmatised, tagged and sentence-aligned legal parallel corpora. Verb + Noun collocations are frequent word associations, sometimes discontinuous, related by syntactic links and with a non-compositional sense (Gledhill, 2007). Our first method extracts collocations from monolingual corpora, using a hybrid approach which combines morphosyntactic properties and frequency criteria. The second method applies a bilingual collocation dictionary to identify collocations. Both of these methods transform collocations into single tokens before alignment. The third method applies a specific alignment algorithm for collocations. We evaluate the influence of these collocation alignment methods on the results of the lexical alignment and of the FSMT system.
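The step shared by the first two methods, turning a collocation into a single token before alignment, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `merge_collocations`, the `max_gap` window used to handle discontinuous collocations, the underscore joiner and the sample dictionary entry are all assumptions made for illustration. The input is assumed to be an already lemmatised sentence.

```python
from typing import List, Set, Tuple


def merge_collocations(
    lemmas: List[str],
    collocations: Set[Tuple[str, str]],
    max_gap: int = 3,      # hypothetical window for discontinuous collocations
    joiner: str = "_",     # hypothetical separator for the fused token
) -> List[str]:
    """Fuse each (verb, noun) pair found in `lemmas` into one token.

    The noun may follow the verb with up to `max_gap` intervening tokens
    (for example a determiner). Intervening tokens are kept in place,
    after the fused token; the noun itself is consumed.
    """
    out: List[str] = []
    used: Set[int] = set()  # indices of nouns already absorbed by a verb
    for i, verb in enumerate(lemmas):
        if i in used:
            continue
        fused = None
        # Look ahead for a noun forming a known collocation with this lemma.
        for j in range(i + 1, min(i + 2 + max_gap, len(lemmas))):
            if j not in used and (verb, lemmas[j]) in collocations:
                fused = verb + joiner + lemmas[j]
                used.add(j)
                break
        out.append(fused if fused is not None else verb)
    return out


# Illustrative dictionary entry (an assumption, not taken from the chapter).
sample_dictionary = {("prendre", "décision")}
sentence = ["le", "commission", "prendre", "un", "décision"]
print(merge_collocations(sentence, sample_dictionary))
```

With the sample entry, the verb and noun are fused even though a determiner separates them, which is the discontinuous case the abstract mentions. Setting `max_gap=0` restricts the sketch to adjacent pairs only.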

Downloaded on 7 September 2025 from https://www.degruyterbrill.com/document/doi/10.1075/cilt.341.04tod/html