Home Linguistics & Semiotics A Multidimensional Model of Syntactic Dependency Trees for Authorship Attribution
Chapter
Licensed
Unlicensed Requires Authentication

A Multidimensional Model of Syntactic Dependency Trees for Authorship Attribution

  • Alexander Mehler , Wahed Hemati , Tolga Uslu and Andy Lücking
Become an author with De Gruyter Brill

Abstract

In this chapter we introduce a multidimensional model of syntactic dependency trees. Our ultimate goal is to generate fingerprints of such trees to predict the author of the underlying sentences. The chapter makes a first attempt to create such fingerprints for sentence categorization via the detour of text categorization. We show that at text level, aggregated dependency structures actually provide information about authorship. At the same time, we show that this does not hold for topic detection. We evaluate our model using a quarter of a million sentences collected in two corpora: the first is sampled from literary texts, the second from Wikipedia articles. As a second finding of our approach, we show that quantitative models of dependency structure do not yet allow for detecting syntactic alignment in written communication. We conclude that this is mainly due to effects of lexical alignment on syntactic alignment.

Abstract

In this chapter we introduce a multidimensional model of syntactic dependency trees. Our ultimate goal is to generate fingerprints of such trees to predict the author of the underlying sentences. The chapter makes a first attempt to create such fingerprints for sentence categorization via the detour of text categorization. We show that at text level, aggregated dependency structures actually provide information about authorship. At the same time, we show that this does not hold for topic detection. We evaluate our model using a quarter of a million sentences collected in two corpora: the first is sampled from literary texts, the second from Wikipedia articles. As a second finding of our approach, we show that quantitative models of dependency structure do not yet allow for detecting syntactic alignment in written communication. We conclude that this is mainly due to effects of lexical alignment on syntactic alignment.

Downloaded on 2.10.2025 from https://www.degruyterbrill.com/document/doi/10.1515/9783110573565-016/html
Scroll to top button