Preprocessing for English-to-Arabic Statistical Machine Translation
Rabih Zbib
Abstract
Most research in Arabic Statistical Machine Translation (SMT) has focused on translating from Arabic into English and other languages. Translation into Arabic has drawn very little attention so far, despite being an important and technically challenging task. This chapter describes the application of two preprocessing techniques to English-to-Arabic SMT: morphological segmentation and syntactic reordering. It shows how these techniques can be adapted to translation into Arabic, where they provide significant improvements to a phrase-based system.
Chapters in this book
- Prelim pages
- Table of contents
- Preface
- Introduction
- Linguistic resources for Arabic machine translation
- Using morphology to improve Example-Based Machine Translation
- Using semantic equivalents for Arabic-to-English
- Arabic preprocessing for Statistical Machine Translation
- Preprocessing for English-to-Arabic Statistical Machine Translation
- Lexical syntax for Arabic SMT
- Automatic rule induction in Arabic to English machine translation framework
- Index