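Eigennamenerkennung zwischen morphologischer Analyse und Part-of-Speech Tagging: ein automatentheoriebasierter Ansatz (Named entity recognition between morphological analysis and part-of-speech tagging: an automata-theoretic approach)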

Jörg Didakowski; Alexander Geyken; Thomas Hanneforth

doi:10.1515/ZFS.2007.016


Published/Copyright: December 4, 2007


From the journal Zeitschrift für Sprachwissenschaft Volume 26 Issue 2

Abstract

Previous rule-based approaches to Named Entity Recognition (NER) for German operate on Part-of-Speech tagged text. We present a new approach in which NER is situated between morphological analysis and Part-of-Speech tagging, and in which the NER grammar is modeled entirely with weighted finite-state transducers (WFSTs). We show that NER strategies such as the resolution of proper noun/common noun or company-name/family-name ambiguities can be formulated as a best-path function of a WFST. The frequently used second-pass resolution of coreferential Named Entities can be formulated as a re-assignment of appropriate weights. A prototype NE recognition system built on WFSTs and large lexical resources was tested on a manually annotated corpus of 65,000 tokens. The results show that our system is comparable in recall and precision to existing rule-based approaches.
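The two ideas in the abstract, disambiguation as a best path and the second pass as a re-assignment of weights, can be illustrated with a toy sketch. This is not the authors' system: it replaces a real WFST with a plain acyclic lattice of weighted arcs, and every label, weight, and function name (`best_path`, `reweight`, the tags `NN`, `SURNAME`, `PERSON`, the `bonus` value) is a hypothetical choice made for illustration. Weights are treated as costs in the tropical semiring, so they add along a path and the lowest total wins.

```python
from collections import defaultdict

# An arc is (source_state, target_state, label, weight).
# Lower weight = preferred reading (tropical semiring: add along a path, take the minimum).


def best_path(arcs, start, final):
    """Return (total_weight, labels) of the cheapest path through an acyclic lattice.

    Assumes states are integers in topological order, i.e. every arc goes
    from a lower-numbered state to a higher-numbered one.
    """
    outgoing = defaultdict(list)
    for src, dst, label, weight in arcs:
        outgoing[src].append((dst, label, weight))

    best = {start: (0.0, [])}
    for state in range(start, final):
        if state not in best:
            continue  # state is not reachable from the start state
        cost, labels = best[state]
        for dst, label, weight in outgoing[state]:
            candidate = (cost + weight, labels + [label])
            if dst not in best or candidate[0] < best[dst][0]:
                best[dst] = candidate
    return best[final]


def reweight(arcs, seen_names, bonus=1.5):
    """Second pass: make name readings cheaper for surface forms already recognized."""
    adjusted = []
    for src, dst, label, weight in arcs:
        surface, _, tag = label.rpartition("/")
        if tag == "SURNAME" and surface in seen_names:
            weight -= bonus
        adjusted.append((src, dst, label, weight))
    return adjusted


# Hypothetical lattice for "Herr Müller": a two-token PERSON arc competes
# with the token-by-token readings.
first_lattice = [
    (0, 1, "Herr/TITLE", 0.0),
    (1, 2, "Müller/NN", 1.0),
    (1, 2, "Müller/SURNAME", 2.0),
    (0, 2, "Herr Müller/PERSON", 0.5),
]

# Hypothetical lattice for a later, isolated "Müller".
later_lattice = [
    (0, 1, "Müller/NN", 1.0),
    (0, 1, "Müller/SURNAME", 2.0),
]

print(best_path(first_lattice, 0, 2))                         # PERSON reading wins
print(best_path(later_lattice, 0, 1))                         # common-noun reading wins
print(best_path(reweight(later_lattice, {"Müller"}), 0, 1))   # surname reading wins
```

In this sketch the isolated "Müller" is read as a common noun on the first pass; once the name has been seen in "Herr Müller", lowering the weight of its surname arc flips the best path without changing the lattice itself.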

Keywords: Named Entity Recognition; weighted finite state transducers; large lexical resources

Received: 2007-02-02

Revised: 2007-06-12

Published Online: 2007-12-04

Published in Print: 2007-11-20


https://doi.org/10.1515/ZFS.2007.016
