Evaluation of keyness metrics: performance and reliability

  • Lukas Sönning
Published/Copyright: 27 April 2023

Abstract

The methodological debates surrounding keyword analysis have given rise to a wide range of keyness metrics. The present paper delineates four dimensions of keyness, which distinguish between frequency- and dispersion-related perspectives. Existing measures are then organized according to these dimensions and evaluated with regard to their performance on a specific keyword analysis task: the identification of key verbs in academic writing. To this end, the rankings produced by 32 different metrics are evaluated against an established academic word list. The reliability of the measures is also assessed to determine whether they produce stable rankings across repeated studies on the same pair of text varieties. We observe notable differences among metrics with regard to these criteria. Our findings provide further support for the superiority of the Wilcoxon rank sum test and text-dispersion–based measures, and allow us to identify, within each dimension of keyness, metrics that may be given preference in applied work.
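
To make the distinction between frequency- and dispersion-related perspectives concrete, the following is a minimal Python sketch, not the procedure used in the paper. It contrasts one frequency-based score (a log ratio of relative frequencies) with one text-dispersion-based score (the difference in the proportion of texts containing a word) and then checks each ranking against a word list. The toy corpora, the candidate and gold lists, the smoothing constant `k`, and the use of precision among the top-ranked items as the evaluation criterion are all assumptions made for illustration.

```python
from math import log2

# Toy corpora, invented for illustration: each corpus is a list of texts,
# and each text is a list of tokens.
target = [
    ["we", "analyse", "the", "data"],
    ["results", "indicate", "that", "we", "analyse"],
    ["they", "indicate", "a", "trend"],
]
reference = [
    ["we", "got", "the", "data"],
    ["they", "said", "that", "we", "got", "it"],
    ["we", "analyse", "nothing"],
]

def count(word, corpus):
    """Total occurrences of `word`, pooled over all texts."""
    return sum(text.count(word) for text in corpus)

def n_tokens(corpus):
    """Total number of tokens in the corpus."""
    return sum(len(text) for text in corpus)

def log_ratio(word, tgt, ref, k=0.5):
    """Binary log of the ratio of relative frequencies.

    The constant k is added to both counts so that a zero count in the
    reference corpus does not cause a division by zero (an assumption
    of this sketch).
    """
    rel_t = (count(word, tgt) + k) / n_tokens(tgt)
    rel_r = (count(word, ref) + k) / n_tokens(ref)
    return log2(rel_t / rel_r)

def text_dispersion(word, corpus):
    """Proportion of texts that contain `word` at least once."""
    return sum(word in text for text in corpus) / len(corpus)

def dispersion_difference(word, tgt, ref):
    """Difference in text dispersion between target and reference."""
    return text_dispersion(word, tgt) - text_dispersion(word, ref)

def rank(candidates, metric):
    """Order candidates from most to least key according to `metric`."""
    return sorted(candidates,
                  key=lambda w: metric(w, target, reference),
                  reverse=True)

def precision_at_k(ranking, gold, k):
    """Share of the top-k ranked items that appear in the gold list."""
    return sum(w in gold for w in ranking[:k]) / k

candidates = ["we", "analyse", "indicate"]   # hypothetical candidate items
gold = {"analyse", "indicate"}               # hypothetical gold-standard list

for metric in (log_ratio, dispersion_difference):
    ranking = rank(candidates, metric)
    print(metric.__name__, ranking, precision_at_k(ranking, gold, 2))
```

On data this small the two scores rank the candidates identically. With realistic corpora they can diverge, because a word concentrated in a few texts may have a high pooled frequency but a low text dispersion.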


Corresponding author: Lukas Sönning, English Linguistics, University of Bamberg, Bamberg, Germany, E-mail:

Acknowledgements

I would like to thank the five anonymous reviewers for their constructive and helpful comments on earlier versions of this paper.

Received: 2022-09-23
Accepted: 2023-04-05
Published Online: 2023-04-27
Published in Print: 2024-05-27

© 2023 Walter de Gruyter GmbH, Berlin/Boston

Downloaded on 20 September 2025 from https://www.degruyterbrill.com/document/doi/10.1515/cllt-2022-0116/html