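Named Entity Recognition between Morphological Analysis and Part-of-Speech Tagging: An Automata-Theoretic Approach (Eigennamenerkennung zwischen morphologischer Analyse und Part-of-Speech Tagging: ein automatentheoriebasierter Ansatz)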

  • Jörg Didakowski, Alexander Geyken and Thomas Hanneforth
Published/Copyright: December 4, 2007

Abstract

Previous rule-based approaches to Named Entity Recognition (NER) for German operate on Part-of-Speech tagged text. We present a new approach in which NER is situated between morphological analysis and Part-of-Speech tagging, and in which the NER grammar is modeled entirely with weighted finite-state transducers (WFSTs). We show that NER strategies such as the resolution of proper noun/common noun or company name/family name ambiguities can be formulated as a best-path function of a WFST. The frequently used second-pass resolution of coreferential named entities can be formulated as a re-assignment of appropriate weights. A prototype NE recognition system built on WFSTs and large lexical resources was tested on a manually annotated corpus of 65,000 tokens. The results show that our system is comparable in recall and precision to existing rule-based approaches.
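
The two ideas in the abstract, ambiguity resolution as a best-path search and second-pass coreference handling as re-weighting, can be illustrated with a minimal Python sketch. It is not the authors' system and uses no WFST toolkit: it assumes a plain token lattice searched with a Viterbi-style shortest path, where weights add along a path and the lowest total wins, as in a tropical semiring. The tag names (`TITLE`, `NN`, `FAMILY`, `VERB`), the weights, the `CONTEXT_COST` table, and the example sentences are all hypothetical values chosen for illustration.

```python
from typing import Dict, List, Tuple

Readings = Dict[str, float]            # tag -> weight (lower is better)
Lattice = List[Tuple[str, Readings]]   # one (token, readings) entry per token

# Hypothetical context penalty: a common-noun reading directly after a title
# is made expensive, so the family-name reading wins in that context.
CONTEXT_COST = {("TITLE", "NN"): 3.0}


def best_path(lattice: Lattice) -> Tuple[float, List[str]]:
    """Return (total weight, tag sequence) of the cheapest path through the lattice."""
    # best[tag] holds the cheapest path that ends in `tag` at the current position.
    best = {"<s>": (0.0, [])}
    for _token, readings in lattice:
        step = {}
        for tag, weight in readings.items():
            candidates = [
                (cost + CONTEXT_COST.get((prev, tag), 0.0) + weight, path + [tag])
                for prev, (cost, path) in best.items()
            ]
            step[tag] = min(candidates, key=lambda c: c[0])
        best = step
    return min(best.values(), key=lambda c: c[0])


def reweight(lattice: Lattice, known_entities: Dict[str, str]) -> Lattice:
    """Second pass: set the weight of an already established NE reading to 0."""
    return [
        (token, {tag: (0.0 if known_entities.get(token) == tag else w)
                 for tag, w in readings.items()})
        for token, readings in lattice
    ]


# "Müller" is ambiguous between common noun (miller) and family name.
sentence_1 = [("Herr", {"TITLE": 0.0}), ("Müller", {"NN": 1.0, "FAMILY": 2.0})]
sentence_2 = [("Müller", {"NN": 1.0, "FAMILY": 2.0}), ("lacht", {"VERB": 0.0})]

cost_1, tags_1 = best_path(sentence_1)

# Collect entities recognized in the first pass over sentence 1.
known = {tok: tag for (tok, _), tag in zip(sentence_1, tags_1) if tag == "FAMILY"}

first_pass = best_path(sentence_2)[1]                     # no supporting context
second_pass = best_path(reweight(sentence_2, known))[1]   # after re-weighting
```

In this toy setup the title context selects the family-name reading in the first sentence, the isolated mention in the second sentence initially falls back to the cheaper common-noun reading, and the re-weighted second pass switches it to the family-name reading. A real WFST implementation would express the lattice, the context constraints, and the re-weighting as transducer compositions rather than as Python dictionaries.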

Received: 2007-02-02
Revised: 2007-06-12
Published Online: 2007-12-04
Published in Print: 2007-11-20

© Walter de Gruyter

Downloaded on 23.9.2025 from https://www.degruyterbrill.com/document/doi/10.1515/ZFS.2007.016/html