Eigennamenerkennung zwischen morphologischer Analyse und Part-of-Speech Tagging: ein automatentheoriebasierter Ansatz
Jörg Didakowski
Abstract
Previous rule-based approaches to Named Entity Recognition (NER) for German operate on Part-of-Speech tagged text. We present a new approach in which NER is situated between morphological analysis and Part-of-Speech tagging, and in which the NER grammar is modeled entirely with weighted finite-state transducers (WFSTs). We show that NER strategies such as resolving proper noun/common noun or company name/family name ambiguities can be formulated as a best-path function of a WFST. The frequently used second-pass resolution of coreferential Named Entities can be formulated as a re-assignment of appropriate weights. A prototype NE recognition system built on WFSTs and large lexical resources was tested on a manually annotated corpus of 65,000 tokens. The results show that our system is comparable in recall and precision to existing rule-based approaches.
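The two ideas named in the abstract, ambiguity resolution as a best path and second-pass resolution as a re-assignment of weights, can be illustrated with a small sketch. It is not the author's system: the lattice, the tags "NE" and "NN", the weights, the bonus value, and the function names best_path and reweight are all assumptions made for illustration, and a plain acyclic lattice stands in for a full WFST. Lower weights are preferred, as in the tropical semiring.

```python
from collections import defaultdict

# Hypothetical lattice for "Herr Koch sagte Koch": each arc is
# (source_state, target_state, token, tag, weight). The weights are invented.
ARCS = [
    (0, 1, "Herr", "NN", 0.0),
    (1, 2, "Koch", "NE", 0.5),   # a title before the token favors the name reading
    (1, 2, "Koch", "NN", 1.0),
    (2, 3, "sagte", "V", 0.0),
    (3, 4, "Koch", "NN", 0.5),   # without context the common noun is cheaper
    (3, 4, "Koch", "NE", 1.0),
]

def best_path(arcs, start, final):
    """Return (total_weight, [(token, tag), ...]) for the cheapest path."""
    outgoing = defaultdict(list)
    for arc in arcs:
        outgoing[arc[0]].append(arc)
    best = {start: (0.0, [])}
    # Assumption: states are numbered in topological order (acyclic lattice).
    for state in range(start, final):
        if state not in best:
            continue
        cost, labels = best[state]
        for _, dst, token, tag, weight in outgoing[state]:
            candidate = (cost + weight, labels + [(token, tag)])
            if dst not in best or candidate[0] < best[dst][0]:
                best[dst] = candidate
    return best[final]

def reweight(arcs, known_entities, bonus=1.0):
    """Second pass: make the NE reading cheaper for tokens already found as NEs."""
    return [
        (src, dst, token, tag,
         weight - bonus if tag == "NE" and token in known_entities else weight)
        for src, dst, token, tag, weight in arcs
    ]

# First pass: collect tokens whose best reading is a named entity.
_, first_pass = best_path(ARCS, 0, 4)
known = {token for token, tag in first_pass if tag == "NE"}

# Second pass: re-assign weights and search for the best path again.
_, second_pass = best_path(reweight(ARCS, known), 0, 4)
print(first_pass)
print(second_pass)
```

In this toy lattice the first pass reads the second "Koch" as a common noun, while the second pass, after the re-assignment of weights, reads both occurrences as named entities.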
© Walter de Gruyter
Articles in the same Issue
- Introduction
- Eigennamenerkennung zwischen morphologischer Analyse und Part-of-Speech Tagging: ein automatentheoriebasierter Ansatz
- Robuste Auszeichnung Grammatischer Funktionen
- Annotation for and Robust Parsing of Discourse Structure on Unrestricted Texts
- Automatic Semantic Analysis for NLP Applications
- Repräsentation und Verknüpfung allgemeinsprachlicher und terminologischer Wortnetze in OWL
- Suggesting Error Corrections of Path Expressions and Categories for Tree-Mapping Grammars
- Produktivität und Idiomatizität von Präposition-Substantiv-Sequenzen
- Abstracting Suffixes: A Morphophonemic Approach to Polish Morphological Analysis
- Rezensionen
- Rezensionsexemplare