A Multidimensional Model of Syntactic Dependency Trees for Authorship Attribution
Alexander Mehler
Abstract
In this chapter we introduce a multidimensional model of syntactic dependency trees. Our ultimate goal is to generate fingerprints of such trees to predict the author of the underlying sentences. The chapter makes a first attempt to create such fingerprints for sentence categorization via the detour of text categorization. We show that at text level, aggregated dependency structures actually provide information about authorship. At the same time, we show that this does not hold for topic detection. We evaluate our model using a quarter of a million sentences collected in two corpora: the first is sampled from literary texts, the second from Wikipedia articles. As a second finding of our approach, we show that quantitative models of dependency structure do not yet allow for detecting syntactic alignment in written communication. We conclude that this is mainly due to effects of lexical alignment on syntactic alignment.
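The abstract does not list the dimensions of the model, so the following is only a minimal sketch of the general idea of turning dependency trees into numeric fingerprints and aggregating them at text level. The features chosen here (sentence length, tree depth, tree width, number of leaves, mean dependency distance), the head-array encoding, and the function names `tree_features` and `text_fingerprint` are assumptions made for illustration, not the chapter's actual feature set or implementation.

```python
from collections import Counter
from statistics import mean


def tree_features(heads):
    """Compute simple structural features of one dependency tree.

    `heads[i]` is the 1-based position of the head of token i+1;
    0 marks the root. The input is assumed to be a well-formed tree
    (exactly one root, no cycles). This encoding is an assumption.
    """
    n = len(heads)

    def depth(token):
        # Number of arcs between a token (1-based) and the root.
        d = 0
        while heads[token - 1] != 0:
            token = heads[token - 1]
            d += 1
        return d

    depths = [depth(t) for t in range(1, n + 1)]
    level_sizes = Counter(depths)          # tokens per tree level
    has_dependent = set(heads) - {0}       # tokens that govern something
    distances = [abs(t - h) for t, h in enumerate(heads, start=1) if h != 0]

    return {
        "length": n,
        "depth": max(depths),
        "width": max(level_sizes.values()),
        "leaves": n - len(has_dependent),
        "mean_dependency_distance": mean(distances) if distances else 0.0,
    }


def text_fingerprint(trees):
    """Aggregate sentence-level features into one text-level vector
    by averaging each feature over all sentences (an illustrative choice)."""
    feats = [tree_features(t) for t in trees]
    return {key: mean(f[key] for f in feats) for key in feats[0]}


# Two toy sentences: "The dog barks" and "She reads books".
toy_text = [[2, 3, 0], [2, 0, 2]]
fingerprint = text_fingerprint(toy_text)
```

A text-level vector of this kind could then be passed to any standard classifier with author labels; which classifier and which features actually work is exactly what the chapter evaluates.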
Chapters in this book
- Frontmatter I
- Preface V
- Contents XI
- Dependency, Corpora and Cognition 1
- Interrelations among Dependency Tree Widths, Heights and Sentence Lengths 31
- Quantitative Analysis of Syntactic Dependency in Czech 53
- Dissortativity in a Bipartite Network of Dependency Relations and Communicative Functions 71
- Empirical Analyses of Valency Structures 93
- Regular Dynamic Patterns of Verbal Valency Ellipsis in Modern Spoken Chinese 101
- Negentropy of Dependency Types and Parts of Speech in the Clause 119
- Dynamic Valency and Dependency Distance 145
- Minimization and Probability Distribution of Dependency Distance in the Process of Second Language Acquisition 167
- Influences of Dependency Distance on the Syntactic Development of Deaf and Hard-of-hearing Students 191
- Positional Aspects of Dependency Distance 213
- Dependency Distance and Direction of English Relative Clauses 239
- Differences between English Subject Postmodifiers and Object Post-modifiers: From the Perspective of Dependency Distance 261
- How Do Universal Dependencies Distinguish Language Groups? 277
- A Quantitative Analysis on a Literary Genre Essay’s Syntactic Features 295
- A Multidimensional Model of Syntactic Dependency Trees for Authorship Attribution 315
- Subject Index 349
- Author Index 357
- List of Contributors 365