
Investigating genre distinctions through discourse distance and discourse network

  • Kun Sun, Rong Wang and Wenxin Xiong
Published/Copyright: 25 February 2021

Abstract

The notion of genre has been widely explored using quantitative methods from both lexical and syntactic perspectives. Discourse structure, which is mostly concerned with the interrelation of discourse units, can play a crucial role in genre analysis, yet few quantitative studies have explored genre distinctions from this perspective. Here, we use two English discourse corpora (RST-DT and GUM) to investigate discourse structure from a novel viewpoint. The RST-DT is divided into four small subcorpora distinguished according to genre, and a second corpus (GUM), which contains seven genres, is used for cross-verification. Each RST (rhetorical structure theory) tree is converted into a dependency representation using information from the RST annotations, and discourse distance is then calculated through a process similar to that used to calculate syntactic dependency distance. Moreover, the dependency representations derived from the two corpora are readily convertible into network data. We then examine the different genres in the two corpora by combining discourse distance and discourse network. The two methods complement each other in revealing the distinctiveness of the various genres. Accordingly, we propose an effective quantitative method for assessing genre differences using discourse distance and discourse network. This quantitative study can help us better understand the nature of genre.
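As a rough illustration of the two measures named in the abstract, the following minimal Python sketch computes a mean discourse distance and a simple network view from discourse dependency links. It rests on assumptions that are not stated in the abstract: the toy edge list, the function names, the definition of distance as the absolute difference between the linear positions of head and dependent discourse units, and the choice of discourse units as network nodes are all hypothetical and may differ from the authors' actual procedure.

```python
from collections import Counter

# Hypothetical toy input: (head_unit, dependent_unit, relation) triples.
# Units are numbered by their linear position in the text.
edges = [
    (1, 2, "elaboration"),
    (1, 4, "contrast"),
    (4, 3, "attribution"),
    (4, 5, "elaboration"),
]


def mean_discourse_distance(edges):
    """Mean absolute positional gap between head and dependent units.

    Assumption: this mirrors the usual definition of mean syntactic
    dependency distance, applied to discourse units instead of words.
    """
    return sum(abs(head - dep) for head, dep, _ in edges) / len(edges)


def node_degrees(edges):
    """Treat the dependency links as an undirected network and count
    how many links touch each unit (its degree).

    Assumption: nodes are discourse units; other node definitions
    (for example relation labels) are equally possible.
    """
    degrees = Counter()
    for head, dep, _ in edges:
        degrees[head] += 1
        degrees[dep] += 1
    return degrees


print(mean_discourse_distance(edges))
print(dict(node_degrees(edges)))
```

In this toy text, unit 4 has the highest degree, which is the kind of structural contrast a network view could surface alongside the distance measure.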


Corresponding author: Kun Sun, Department of Linguistics, University of Tübingen, Tübingen, Germany, E-mail:

Funding source: European Research Council

Award Identifier / Grant number: 742545

Funding source: Important Humanities and Social Science Research Project of Zhejiang Higher Education

Award Identifier / Grant number: 2018QN071

Funding source: Beijing Municipal Natural Science Foundation

Award Identifier / Grant number: 16YYB018

Acknowledgments

We would like to thank the three anonymous reviewers (particularly the first reviewer) for their insightful and constructive comments on the paper. We also express our sincere gratitude to the Editor-in-Chief for her great help and generosity in improving this paper. The first author thanks his little son for his cooperation during this difficult time.

  1. Research funding: This work was supported by the ERC (European Research Council) Advanced Grant (No. 742545). The second and third authors were funded by the “Important Humanities and Social Science Research Project of Zhejiang Higher Education” (Fund No. 2018QN071) and the “Beijing Municipal Natural Science Foundation” (Fund No. 16YYB018), respectively.


Supplementary Material

The online version of this article offers supplementary material (https://doi.org/10.1515/cllt-2020-0064).


Received: 2020-02-12
Accepted: 2021-02-05
Published Online: 2021-02-25

© 2021 Walter de Gruyter GmbH, Berlin/Boston
