21. A computational lexicography approach to phraseologisms
Cornelia Tschichold
Abstract
The cycle of lexicographic and linguistic work involved in compiling a computational phraseological database is divided into three phases and described in relation to the specific challenges multi-word expressions (MWEs) pose for a lexical database. Data collection is a process that is far from complete for the MWEs found in English, with the variability of some phrases making identification of all occurrences in large corpora a major challenge. Formalization of the form and variability of MWEs is an interrelated process which can improve tools for data collection and other applications. Increased use of the phraseological lexical database in NLP applications can ultimately lead to further insights into the nature of MWEs and to improvements in the database. Due to the volume of lexicographic data on MWEs that still needs to be collected, analysed and formalized, and the cyclical nature of the work, the resulting lexical database should be reusable in as many applications as possible. WordManager-PhraseManager, the lexical resource described in the second part of the chapter, can capture the variability of MWEs in a way that allows for maximum reusability of lexical data.
Chapters in this book
- Prelim pages i
- Table of contents v
- List of contributors xi
- Acknowledgements xiii
- Preface xv
- Introduction: The many faces of phraseology xix
Part I. Phraseology: theory, typology and terminology
- 1. Phraseology and linguistic theory: A brief survey 3
- 2. Disentangling the phraseological web 27
- 3. A unified approach to semantic frames and collocational patterns 51
- 4. Processing of idioms and idiom modifications: A view from cognitive linguistics 67
- 5. A very complex criterion of fixedness: Non-compositionality 81
- 6. Reassessing the canon: 'Fixed' phrases in general reference corpora 95
Part II. Corpus-based analyses of phraseological units
- 7. Adjective + Noun sequences in attributive or NP-final positions: Observations on lexicalization 111
- 8. Phrasal similes in the BNC 127
- 9. Foot and Mouth: The phrasal patterns of two frequent nouns 143
- 10. The Good Lord and his works: A corpus-driven study of collocational resonance 159
- 11. Fixed expressions, extenders and metonymy in the speech of people with Alzheimer's disease 175
Part III. Phraseology across languages and cultures
- 12. Cross-linguistic phraseological studies: An overview 191
- 13. Figurative phraseology and culture 207
- 14. Critical observations on the culture-boundness of phraseology 229
- 15. Phraseology in a European framework: A cross-linguistic and cross-cultural research project on widespread idioms 243
- 16. Free and bound prepositions in a contrastive perspective. The case of with and avec 259
- 17. Contrastive idiom analysis: The case of Japanese and English idioms of anger 275
- 18. Automatic extraction of translation equivalents of phrasal and light verbs in English and Russian 293
Part IV. Phraseology in lexicography and natural language processing
- 19. Dictionaries and collocation 313
- 20. Computational phraseology: An overview 337
- 21. A computational lexicography approach to phraseologisms 361
- 22. Extracting specialized collocations using lexical functions 377
- 23. Combined statistical and grammatical criteria for the retrieval of phraseological units in an electronic corpus 391
Envoi
- The phrase, the whole phrase and nothing but the phrase 407
- Author index 411
- Subject index 417