Aligning verb + noun collocations to improve a French-Romanian FSMT system

  • Amalia Todiraşcu and Mirabela Navlea
View more publications by John Benjamins Publishing Company

Abstract

We present several methods for integrating Verb + Noun collocations using linguistic information, with the aim of improving the results of a French-Romanian factored statistical machine translation (FSMT) system. The system uses lemmatised, tagged and sentence-aligned legal parallel corpora. Verb + Noun collocations are frequent word associations, sometimes discontinuous, related by syntactic links and with a non-compositional sense (Gledhill, 2007). Our first method extracts collocations from monolingual corpora, using a hybrid approach which combines morphosyntactic properties and frequency criteria. The second method applies a bilingual collocation dictionary to identify collocations. Both of these methods transform collocations into single tokens before alignment. The third method applies a specific alignment algorithm for collocations. We evaluate the influence of these collocation alignment methods on the results of the lexical alignment and of the FSMT system.
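As an illustration of the step in which collocations become single tokens before alignment, the following is a minimal sketch, not the authors' implementation. It assumes a sentence is a list of (lemma, POS tag) pairs and that collocations are given as a set of (verb lemma, noun lemma) pairs. The function name `merge_collocations`, the tag names `VERB` and `NOUN`, the underscore joiner and the `max_gap` window for discontinuous collocations are all hypothetical choices made for this example.

```python
from typing import List, Set, Tuple

Token = Tuple[str, str]  # (lemma, POS tag); the tag set here is an assumption


def merge_collocations(
    sentence: List[Token],
    collocations: Set[Tuple[str, str]],
    max_gap: int = 2,
) -> List[Token]:
    """Fuse each known verb + noun collocation into one token.

    The noun may follow the verb with up to `max_gap` intervening tokens,
    which covers simple discontinuous cases. Intervening tokens are kept.
    """
    merged: List[Token] = []
    consumed = set()  # indices of nouns already absorbed into a verb token

    for i, (lemma, pos) in enumerate(sentence):
        if i in consumed:
            continue
        if pos == "VERB":
            # Look ahead inside the window for a noun forming a known pair.
            window_end = min(len(sentence), i + max_gap + 2)
            for j in range(i + 1, window_end):
                noun_lemma, noun_pos = sentence[j]
                if (
                    j not in consumed
                    and noun_pos == "NOUN"
                    and (lemma, noun_lemma) in collocations
                ):
                    merged.append((f"{lemma}_{noun_lemma}", "VERB"))
                    consumed.add(j)
                    break
            else:
                # No collocation found: keep the verb unchanged.
                merged.append((lemma, pos))
        else:
            merged.append((lemma, pos))
    return merged


# Hypothetical lemmatised French input for "la commission prend une décision".
example = [
    ("le", "DET"),
    ("commission", "NOUN"),
    ("prendre", "VERB"),
    ("un", "DET"),
    ("décision", "NOUN"),
]
known = {("prendre", "décision")}
result = merge_collocations(example, known)
```

In this sketch the fused token keeps the verb's position and tag, so a word aligner would treat `prendre_décision` as a single unit. Whether the chapter's system handles intervening words or tags in the same way is not stated in the abstract.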

Downloaded on 18.9.2025 from https://www.degruyterbrill.com/document/doi/10.1075/cilt.341.04tod/html