Chapter 5. Evaluating a bracketing protocol for multiword terms

Pilar León-Araúz; Melania Cabezas-García

Published by John Benjamins Publishing Company

This chapter is in the book Recent Advances in Multiword Units in Machine Translation and Translation Technology

Abstract

Multiword terms (MWTs) are frequently used to encapsulate and convey meaning in scientific and technical texts. However, they can also make these texts difficult to understand because the relations between constituents are not transparent. When MWTs have more than two constituents, a dependency analysis (bracketing) is often necessary to facilitate their interpretation. NLP research has proposed various models to automate bracketing, but none has been entirely satisfactory. This paper presents a protocol that combines various models and applies it to a set of three-constituent MWTs in order to: (i) rank rules by their disambiguation potential, based on their likelihood of retrieving results from any corpus and their ability to resolve bracketing; and (ii) ascertain the influence of corpus size and type on the results obtained.
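
To make the notion of bracketing concrete, here is a minimal sketch of one corpus-based rule. The abstract does not say which models the protocol combines, so the adjacency comparison below (group the first two constituents if they co-occur more often than the last two) is an assumption borrowed from the general NLP literature, not the authors' protocol. The function names and the toy corpus are hypothetical and serve only as illustration.

```python
from collections import Counter


def bigram_counts(tokens):
    """Count adjacent word pairs in a tokenized corpus."""
    return Counter(zip(tokens, tokens[1:]))


def bracket_adjacency(term, counts):
    """Bracket a three-constituent term with a simple adjacency rule.

    Compares the corpus frequency of (w1, w2) with that of (w2, w3).
    Returns a bracketed string, or None when the counts are tied
    (including the case where the corpus retrieves nothing).
    """
    w1, w2, w3 = term
    left = counts[(w1, w2)]   # evidence for [[w1 w2] w3]
    right = counts[(w2, w3)]  # evidence for [w1 [w2 w3]]
    if left > right:
        return f"[[{w1} {w2}] {w3}]"
    if right > left:
        return f"[{w1} [{w2} {w3}]]"
    return None


# Hypothetical toy corpus, invented for this example only.
toy_corpus = (
    "wind turbine blade design depends on the wind turbine ; "
    "each wind turbine has a blade"
).split()

counts = bigram_counts(toy_corpus)
result = bracket_adjacency(("wind", "turbine", "blade"), counts)
print(result)
```

A tie returns `None` rather than a guess, which mirrors the abstract's point that a rule is only useful if it both retrieves results from the corpus and actually resolves the bracketing.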

Chapters in this book

Prelim pages i
Table of contents v
Preface vii
Section 1. Computational treatment of multiword units
Chapter 1. Multi-word units in neural machine translation 2
Chapter 2. ReGap 18
Chapter 3. Evaluating the Italian-English machine translation quality of MWUs in the domain of archaeology 40
Chapter 4. Post-editing neural machine translation in specialised languages 57
Chapter 5. Evaluating a bracketing protocol for multiword terms 79
Section 2. Corpus-based and linguistic studies in phraseology
Chapter 6. Suggestions for a new model of functional phraseme categorization for applied purposes 104
Chapter 7. Verb collocations and their semantics in the specialized language of science 124
Chapter 8. Negative–positive adjective pairing in travel journalism in English, Italian, and Polish 141
Chapter 9. The middle construction and some machine translation issues 156
Chapter 10. Semantic annotation of named rivers and its application for the prediction of multiword-term bracketing 173
Chapter 11. Irony in American-English tweets 197
Chapter 12. A comprehensive Japanese MWE lexicon 218
Chapter 13. Ontology-based formalisation of Italian clitic verbal MWEs 243
Index 263

https://doi.org/10.1075/cilt.366.05leo
