Lexical syntax for Arabic SMT
Hany Hassan
Abstract
Current approaches to Phrase-based Statistical Machine Translation lack the capability to produce grammatical translations and to handle long-range reordering. In this chapter, we present our work on extending Phrase-based SMT with lexical syntactic descriptions that localize global syntactic information on the word without introducing redundant syntactic ambiguity. We present a novel model of Phrase-based SMT which integrates linguistic lexical descriptions (supertags) into the target language model and the target side of the translation model. Moreover, we introduce a novel Incremental Dependency-based Syntactic Language Model (IDLM), based on wide-coverage incremental CCG parsing, which we integrate into a direct translation SMT system. Our proposed approach is the first to integrate full dependency parsing into SMT systems at a very attractive computational cost, since it uses the linear decoders widely used in Phrase-based SMT systems. The experimental results show a good improvement over top-ranked state-of-the-art systems.
Chapters in this book
- Prelim pages i
- Table of contents v
- Preface vii
- Introduction 1
- Linguistic resources for Arabic machine translation 15
- Using morphology to improve Example-Based Machine Translation 23
- Using semantic equivalents for Arabic-to-English 49
- Arabic preprocessing for Statistical Machine Translation 73
- Preprocessing for English-to-Arabic Statistical Machine Translation 95
- Lexical syntax for Arabic SMT 109
- Automatic rule induction in Arabic to English machine translation framework 135
- Index 155