John Benjamins Publishing Company

Chapter

Linguistic challenges for computationalists

© 2007 John Benjamins Publishing Company

Chapters in this book

Prelim pages i
Table of contents v
Editors' Foreword ix
Part I. Computation for linguistics
Linguistic challenges for computationalists 1
Part II. Information extraction & indexing
NLP: An information extraction perspective 17
Semantic indexing using minimum redundancy cut in ontologies 25
Indexing and querying linguistic metadata and document content 35
Term representation with Generalized Latent Semantic Analysis 45
Part III. Parsing
Multilingual dependency parsing: A pipeline approach 55
How does treebank annotation influence parsing? Or how not to compare apples and oranges 79
The SenSem project: Syntactico-semantic annotation of sentences in Spanish 89
Part IV. Anaphora & referring expressions
Generating referring expressions: Past, present and future 99
A data-driven approach to pronominal anaphora resolution for German 115
Part V. Classification
Efficient spam analysis for weblogs through URL segmentation 125
Document classification using semantic networks with an adaptive similarity measure 137
Text summarization for improved text classification 147
Exploiting linguistic cues to classify rhetorical relations 157
Part VI. Textual entailment & question answering
Tree edit distance for textual entailment 167
A genetic algorithm for optimising information retrieval with linguistic features in question answering 177
Lexico-syntactic subsumption for textual entailment 187
A knowledge-based approach to text-to-text similarity 197
Part VII. Ontologies
A simple WWW-based method for semantic word class acquisition 207
Automatic building of Wordnets 217
Part VIII. Machine translation
Lexical transfer selection using annotated parallel corpora 227
Multi-perspective evaluation of the FAME speech-to-speech translation system for Catalan, English and Spanish 237
Parallel corpora for medium density languages 247
Part IX. Corpora
The role of data in NLP: The case for dataset profiling 259
Even very frequent function words do not distribute homogeneously 267
Exploiting parallel texts to produce a multilingual sense tagged corpus for word sense disambiguation 277
Detecting dangerous coordination ambiguities using word distribution 287
List and addresses of contributors 297
Index of subjects and terms 303

Recent Advances in Natural Language Processing IV

This chapter is in the book Recent Advances in Natural Language Processing IV

https://doi.org/10.1075/cilt.292.03ner

© 2026 De Gruyter Brill

Downloaded on 1.4.2026 from https://www.degruyterbrill.com/document/doi/10.1075/cilt.292.03ner/html