John Benjamins Publishing Company
Chapter 11. Blogging around the world
Abstract
The borderless nature of blogging raises the question of whether the traditional, regionally defined varieties of English continue to hold true (see Crystal 2011). To establish how similar the language published online without external intervention is around the world, this chapter examines repetitive patterns, or 3-grams, found in blogs in the 583-million-word GloWbE corpus (Davies 2013). The data show two types of repetitive word sequences: universal, or those that are frequent in all or most of the nineteen geographic locations represented in the corpus, and localised, or those unique to specific regions. We explore multiple ways of approaching the regional distribution of universal and localised 3-grams, such as statistical similarity measures (Jaccard coefficient and hierarchical clustering) and network visualisations. The study addresses three related research issues: (1) the ratio of 3-grams in blogs from various World Englishes, which will shed light on the degree of formulaicity in Web Englishes around the world; (2) the overlaps between various locations in terms of preferred sequences, which may point to local or global standardization hubs on the level of sentence and text construction; and (3) the status of model-providing varieties for internet communication, especially American English, in view of the most frequent 3-grams from other locations (cf. Mair 2013).
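As a rough illustration of the similarity measure named in the abstract, the sketch below extracts 3-grams from short texts and compares locations with the Jaccard coefficient (shared 3-grams divided by all distinct 3-grams across the two sets). It is a minimal sketch under stated assumptions, not the chapter's actual procedure: the whitespace tokenisation, the cut-off `n`, the function names, and the toy sentences are all invented for illustration and are not drawn from GloWbE.

```python
from collections import Counter
from itertools import combinations


def trigrams(tokens):
    """Return all contiguous 3-word sequences in a list of tokens."""
    return [tuple(tokens[i:i + 3]) for i in range(len(tokens) - 2)]


def top_trigrams(text, n):
    """Set of the n most frequent 3-grams in a text.

    Assumption: naive lowercasing and whitespace tokenisation.
    """
    counts = Counter(trigrams(text.lower().split()))
    return {gram for gram, _ in counts.most_common(n)}


def jaccard(a, b):
    """Jaccard coefficient |A & B| / |A | B|; 0.0 when both sets are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical toy samples, one per location label (not corpus data).
samples = {
    "US": "i do not know what to do and i do not know why",
    "GB": "i do not know what to say and i do not know why",
    "IN": "kindly do the needful and revert back to me at the earliest",
}

# n is set high enough here to keep every 3-gram of these tiny texts.
profiles = {loc: top_trigrams(text, 100) for loc, text in samples.items()}

# Pairwise similarity between all locations.
similarity = {
    (a, b): jaccard(profiles[a], profiles[b])
    for a, b in combinations(sorted(profiles), 2)
}

for pair, score in similarity.items():
    print(pair, round(score, 2))
```

A matrix of such pairwise scores (or the corresponding distances, 1 minus the score) is the kind of input that hierarchical clustering or a network visualisation could then take.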
Chapters in this book
- Prelim pages
- Table of contents
- Acknowledgements
- Chapter 1. Present applications and future directions in pattern-driven approaches to corpus linguistics
Part I. Methodological explorations
- Chapter 2. From lexical bundles to surprisal and language models
- Chapter 3. Fine-tuning lexical bundles
- Chapter 4. Lexical obsolescence and loss in English: 1700–2000
Part II. Patterns in utilitarian texts
- Chapter 5. Constance and variability
- Chapter 6. Between corpus-based and corpus-driven approaches to textual recurrence
- Chapter 7. Lexical bundles in Early Modern and Present-day English Acts of Parliament
Part III. Patterns in online texts
- Chapter 8. Lexical bundles in Wikipedia articles and related texts
- Chapter 9. Join us for this
- Chapter 10. I don’t want to and don’t get me wrong
- Chapter 11. Blogging around the world
- Index