Construction of a WordNet-based multilingual lexical ontology for Galician
Xavier Gómez Guinovart
Abstract
This study describes the methodology used to develop a WordNet lexicon for the Galician language, and its applications to language processing in the fields of terminology acquisition and ontology learning and management. First, we review the Princeton WordNet lexical model, its multilingual adaptation in the EuroWordNet framework, and its implementation in building the Galician WordNet. Second, we discuss the approach and the resources used in the design of Termonet, a tool for checking and verifying, against technical corpora, the specialized lexicons embedded in WordNet. This tool identifies the WordNet synsets that belong to a terminological domain from the semantic relations between the nodes of the lexical network, and validates the terms by means of a semantically disambiguated specialized corpus. Third, we analyze the construction of a new semantic categorization of WordNet based on epinonyms and generated automatically by exploring the relations from a terminological perspective. A WordNet epinonym is a noun synset in the semantic network that represents the category of a semantic domain; other synsets are assigned to it automatically by algorithms that evaluate their proximity from a terminological point of view through the cognitive processing of the lexical-semantic relations. Last, we present some applications of the RDF Galician WordNet in the Semantic Web by means of federated queries over lexical and ontological resources available as Linked Open Data (LOD), such as DBpedia, BabelNet, Wiktionary and YAGO.
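The idea of identifying the synsets of a domain from the semantic relations between nodes can be illustrated with a small sketch. It is not Termonet's actual algorithm, which the abstract does not specify; it only assumes that a domain is seeded by one synset and expanded breadth-first along hyponymy. The synset identifiers, the `HYPONYMS` table and the `domain_synsets` function are all hypothetical.

```python
from collections import deque

# Hypothetical toy fragment of a WordNet-like network: synset -> direct hyponyms.
HYPONYMS = {
    "disease.n.01": ["infection.n.01", "cancer.n.01"],
    "infection.n.01": ["flu.n.01"],
    "cancer.n.01": [],
    "flu.n.01": [],
    "tool.n.01": ["hammer.n.01"],
    "hammer.n.01": [],
}


def domain_synsets(seed, relations, max_depth=None):
    """Return {synset: distance} for synsets reachable from `seed`.

    The traversal is breadth-first over `relations`; `max_depth`
    (assumed parameter) optionally limits how far from the seed we go.
    """
    distance = {seed: 0}
    queue = deque([seed])
    while queue:
        current = queue.popleft()
        depth = distance[current]
        if max_depth is not None and depth >= max_depth:
            continue  # do not expand beyond the depth limit
        for child in relations.get(current, []):
            if child not in distance:
                distance[child] = depth + 1
                queue.append(child)
    return distance


if __name__ == "__main__":
    # Candidate members of a "disease" domain, with their distance from the seed.
    print(domain_synsets("disease.n.01", HYPONYMS))
```

In a setup like this, the distance from the seed could serve as a crude proximity measure, and the candidate synsets would still need validation against a specialized corpus, as the abstract describes.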
Chapters in this book
- Frontmatter I
- Contents V
- Studies on multilingual lexicography: an introduction 1
Section 1: Multilingual electronic lexicography in a new society
- Towards a new definition of multilingual lexicography in the era of internet 9
- Metalexicographic models for multilingual online dictionaries in emerging e-societies 29
- A dangerous cocktail: databases, information techniques and lack of vision 47
Section 2: Multilingual electronic dictionaries
- Multilingual Electronic Dictionary of Motion Verbs (DICEMTO): overall structure and the case of andar 67
- From the Linguaturismo glossary to the Dictionary of Food and Nutrition: proposal for a new electronic multilingual lexicography 93
- INTELITERM: In search of efficient terminology lookup tools for translators 113
- PORTLEX as a multilingual and cross-lingual online dictionary 135
- Corpus-based multilingual lexicographic resources for translators: an overview 159
- Construction of a WordNet-based multilingual lexical ontology for Galician 179
- Designing and compiling a terminological and multilingual dictionary for language teaching and learning: key issues and some reflections 197
- Multilingual LSP dictionary. Lexicographic conception of a dictionary of football language 213