
Lexico-syntactic features of science and science popularization texts: an information-theoretic view

Siwen Guo and Maocheng Liang
Published/Copyright: June 18, 2025
From the journal Text & Talk

Abstract

Science and science popularization texts that disseminate partly overlapping scientific knowledge show distinct writing strategies. We examine how they differ in lexis, word classes, and dependency relations using Kullback-Leibler divergence, taking “60-Second Science” texts and source research article abstracts as an illustration. The findings reveal that “60-Second Science” is more interactive, as shown by its use of quotation marks, question marks, “they”, and “you”, while abstracts are more informative, with hyphens linking words from different categories. At the level of word classes, pronouns and adverbs are typical of “60-Second Science”, whereas abstracts use nouns and adjectives significantly more. Dependency relations show that nominal subjects and adverbial modifiers are prominent in “60-Second Science”, in contrast to adjectival and prepositional modifiers in abstracts. The study presents a clear comparison of the linguistic features of science and science popularization texts. The comparison can be applied in scientific English writing and teaching for different scientific communicative events, and may further promote the dissemination of scientific knowledge.
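As a rough illustration of how Kullback-Leibler divergence can single out the items that are typical of one text type relative to another, the sketch below computes per-item contributions to the divergence between two token lists. It is not the authors' pipeline: the token lists are invented toy data, the function names `relative_freqs` and `kld_contributions` are hypothetical, and the additive smoothing (`alpha`) is an assumption made here so that items absent from one corpus do not produce a division by zero. The same computation would apply to word-class tags or dependency labels in place of words.

```python
import math
from collections import Counter


def relative_freqs(tokens, vocab, alpha=1.0):
    """Additively smoothed relative frequencies over a shared vocabulary.

    The smoothing constant alpha is an assumption of this sketch.
    """
    counts = Counter(tokens)
    total = len(tokens) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


def kld_contributions(tokens_a, tokens_b, alpha=1.0):
    """Per-item terms p(w) * log2(p(w) / q(w)) of D_KL(A || B), in bits."""
    vocab = set(tokens_a) | set(tokens_b)
    p = relative_freqs(tokens_a, vocab, alpha)
    q = relative_freqs(tokens_b, vocab, alpha)
    return {w: p[w] * math.log2(p[w] / q[w]) for w in vocab}


# Invented toy data standing in for the two text types.
popular = "you might wonder why they say it works so well".split()
abstract = ("the proposed method yields significant performance "
            "gains in the evaluated samples").split()

contrib = kld_contributions(popular, abstract)

# Summing the per-item terms gives the overall divergence.
total_kld = sum(contrib.values())

# Items with the largest positive terms are the most typical of the
# first corpus relative to the second.
top_items = sorted(contrib.items(), key=lambda kv: kv[1], reverse=True)[:3]

print(f"D_KL(popular || abstract) = {total_kld:.3f} bits")
print(top_items)
```

Because the measure is asymmetric, swapping the two arguments gives the items typical of the second corpus instead, so a comparison of two text types would normally be run in both directions.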


Corresponding author: Maocheng Liang, School of Foreign Languages, Beihang University, No. 37 Xueyuan Road, Haidian District, Beijing 100191, China, E-mail:

Funding source: National Social Science Fund of China

Award Identifier / Grant number: 19BYY082

About the authors

Siwen Guo

Siwen Guo is a PhD candidate at the School of Foreign Languages, Beihang University. Her research interests are corpus linguistics, science popularization discourse, and English for academic purposes.

Maocheng Liang

Maocheng Liang is a professor and PhD supervisor at the School of Foreign Languages, Beihang University. His research interests are natural language processing and corpus linguistics.

Acknowledgments

We would like to thank Professor Srikant Sarangi for his meticulous review and insightful feedback. We are also grateful to the journal editor and three anonymous reviewers for their valuable comments and constructive suggestions.

  1. Research funding: This work was supported by the National Social Science Fund of China under grant 19BYY082.

Appendix

The examples are catalogued along with their corresponding websites, arranged in the sequence of their mention.

Received: 2024-01-31
Accepted: 2025-06-04
Published Online: 2025-06-18

© 2025 Walter de Gruyter GmbH, Berlin/Boston

Downloaded on 10.9.2025 from https://www.degruyterbrill.com/document/doi/10.1515/text-2024-0023/html