Term extraction for automatic abstracting
Michael P. Oakes and Chris D. Paice
Abstract
This paper describes term extraction from full-length journal articles in the domain of crop husbandry, for the purpose of producing abstracts automatically. Candidate terms are first extracted from a fixed set of lexical environments. These environments are identified by a system of contextual templates, which assigns a semantic role indicator to each candidate term.
Candidate terms that can be lexically validated, that is, whose constituent words and structure conform to a simple grammar for their assigned role, receive an enhanced weight. The grammar for lexical validation was derived from a training corpus of 50 journal articles. Selected terms may then be used to generate a short indicative abstract, which conveys the subject matter of the paper.
We also describe a method for compiling a list of word sequences that indicate the statistical findings of an experiment, in particular the interrelationships between terms. When extracted and appended to an indicative abstract, such word sequences yield an informative abstract, which describes specific research findings in addition to the subject matter of the paper.
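The two-stage idea summarised above (contextual templates that assign a role, followed by lexical validation that raises the weight) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the role names, the regular expressions, the small word lists standing in for the role grammars, and the weights 1.0 and 2.0 are all assumptions chosen for illustration.

```python
import re

# Hypothetical contextual templates: each pattern names a lexical environment
# and the semantic role assigned to the term captured inside it.
TEMPLATES = [
    ("INFLUENCE", re.compile(r"effect of (?P<term>[\w ]+?) on ")),
    ("PROPERTY", re.compile(r"on the (?P<term>\w+) of ")),
    ("SPECIES", re.compile(r"(?:yield|growth) of (?P<term>[\w ]+?) (?:was|were)\b")),
]

# Hypothetical role grammars: a term is valid if its last word is a known head
# for the role and every preceding word is a known modifier for the role.
ROLE_GRAMMAR = {
    "INFLUENCE": {"heads": {"fertiliser", "irrigation", "herbicide"},
                  "modifiers": {"nitrogen", "foliar"}},
    "PROPERTY": {"heads": {"yield", "growth"}, "modifiers": set()},
    "SPECIES": {"heads": {"wheat", "barley"}, "modifiers": {"winter", "spring"}},
}

BASE_WEIGHT = 1.0       # assumed weight for any template match
VALIDATED_WEIGHT = 2.0  # assumed enhanced weight after lexical validation


def lexically_valid(role, term):
    """Check a candidate term against the toy grammar for its assigned role."""
    grammar = ROLE_GRAMMAR.get(role)
    words = term.split()
    if grammar is None or not words:
        return False
    *modifiers, head = words
    return head in grammar["heads"] and all(
        m in grammar["modifiers"] for m in modifiers
    )


def extract_terms(text):
    """Return {(role, term): accumulated weight} for all template matches."""
    weights = {}
    lowered = text.lower()
    for role, pattern in TEMPLATES:
        for match in pattern.finditer(lowered):
            term = match.group("term").strip()
            w = VALIDATED_WEIGHT if lexically_valid(role, term) else BASE_WEIGHT
            weights[(role, term)] = weights.get((role, term), 0.0) + w
    return weights


def best_terms(weights):
    """Pick the highest-weighted term per role (first seen wins on ties)."""
    best = {}
    for (role, term), w in weights.items():
        if role not in best or w > best[role][1]:
            best[role] = (term, w)
    return {role: term for role, (term, _) in best.items()}


SAMPLE = (
    "The effect of nitrogen fertiliser on the yield of winter wheat was studied. "
    "The effect of sowing date on the growth of spring barley was also measured."
)
```

Calling `extract_terms(SAMPLE)` gives "nitrogen fertiliser" the enhanced weight, while "sowing date" keeps the base weight because its head word is not in the toy grammar for its role. The terms returned by `best_terms` could then fill a fixed sentence frame to form a short indicative abstract; the frame itself would be a further assumption.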
Chapters in this book
- Prelim pages i
- Table of contents vi
- Introduction viii
- A graph-based approach to the automatic generation of multilingual keyword clusters 1
- The automatic construction of faceted terminological feedback for interactive document retrieval 29
- Automatic term detection 53
- Incremental extraction of domain-specific terms from online text resources 89
- Knowledge-based terminology management in medicine 111
- Searching for and identifying conceptual relationships via a corpus-based approach to a Terminological Knowledge Base (CTKB) 127
- Qualitative terminology extraction 149
- General considerations on bilingual terminology extraction 167
- Detection of synonymy links between terms 185
- Extracting useful terms from parenthetical expressions by combining simple rules and statistical measures 209
- Software tools to support the construction of bilingual terminology lexicons 225
- Determining semantic equivalence of terms in information retrieval 245
- Term extraction using a similarity-based approach 261
- Extracting knowledge-rich contexts for terminography 279
- Experimental evaluation of ranking and selection methods in term extraction 303
- Corpus-based extension of a terminological semantic lexicon 327
- Term extraction for automatic abstracting 353
- About the contributors 371
- Subject Index 377