Identifying aboutgrams in engineering texts
Martin Warren
Abstract
This paper uses a new computer-mediated methodology, concgramming, to identify the aboutness of a text. Concgrams are the raw products of the concgramming process and consist of up to five co-occurring words irrespective of whether constituency variation (i.e. AB, A*B where * represents an intervening word) and/or positional variation (i.e. AB, BA) is present. Two engineering research articles are concgrammed to identify the most frequently occurring two-word lexical concgrams. The most frequent two-word lexical concgrams for each text are examined to determine whether the words simply co-occur or are meaningfully associated. Once this has been done, a provisional list of “aboutgrams” is drawn up which is tentatively taken to represent the aboutness of each text. These lists are then referred to a specialised corpus of engineering texts and then a general reference corpus. Those aboutgrams on the lists which are consistently more frequent in the text than in the two corpora are then put forward as representing the aboutness of the text. In the study, the lists of aboutgrams are compared with single word frequency lists to evaluate the advantages to be gained from determining aboutness by means of phraseology rather than key words. The conclusion is that aboutgrams are a better means for uncovering the aboutness of the texts.
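The counting step described in the abstract can be illustrated with a minimal sketch. It is not the ConcGram software used in the study: the function names, the small `FUNCTION_WORDS` list, and the `span` window are assumptions made here for illustration only. The sketch counts two-word lexical concgrams regardless of constituency variation (AB, A*B) and positional variation (AB, BA), then keeps the pairs that are relatively more frequent in the text than in every reference corpus. The manual step of checking whether co-occurring words are meaningfully associated is not modelled.

```python
from collections import Counter

# Hypothetical, deliberately tiny list of grammatical words (an assumption);
# a real study would use a much fuller list.
FUNCTION_WORDS = {"the", "a", "an", "of", "in", "is", "and", "to", "on", "by", "for"}


def two_word_concgrams(tokens, span=4):
    """Count unordered pairs of lexical words co-occurring within `span` tokens.

    An unordered pair (frozenset) folds positional variation (AB, BA) together;
    allowing a gap of up to `span` tokens folds constituency variation
    (AB, A*B) together.
    """
    counts = Counter()
    for i, origin in enumerate(tokens):
        if origin in FUNCTION_WORDS:
            continue
        # Look forward only, so each co-occurrence is counted once.
        for other in tokens[i + 1 : i + 1 + span]:
            if other in FUNCTION_WORDS or other == origin:
                continue
            counts[frozenset((origin, other))] += 1
    return counts


def more_frequent_than_references(text_tokens, reference_token_lists, span=4, top_n=10):
    """Keep the top pairs whose per-token rate in the text exceeds every reference."""
    text_counts = two_word_concgrams(text_tokens, span)
    references = [
        (two_word_concgrams(ref, span), len(ref)) for ref in reference_token_lists
    ]
    kept = []
    for pair, count in text_counts.most_common(top_n):
        text_rate = count / len(text_tokens)
        if all(
            text_rate > (ref_counts[pair] / n if n else 0.0)
            for ref_counts, n in references
        ):
            kept.append(pair)
    return kept


# Usage: a toy "text" and two toy "reference corpora", already tokenized.
text = ["wind", "load", "the", "load", "of", "wind"]
specialised = ["wind", "blows", "the", "load", "falls"]
general = ["the", "cat", "sat", "on", "the", "mat"]
candidates = more_frequent_than_references(text, [specialised, general], span=2)
```

Normalising by token count is one simple way to compare a single text with much larger corpora; the study itself may use a different comparison.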
Chapters in this book
- Prelim pages i
- Table of contents v
- Perspectives on keywords and keyness 1
Section I. Exploring keyness
- Three concepts of keywords 21
- Problems in investigating keyness, or clearing the undergrowth and marking out trails… 43
- Closed-class keywords and corpus-driven discourse analysis 59
- Hyperlinks 79
- Web Semantics vs the Semantic Web? 93
Section II. Keyness in specialised discourse
- Identifying aboutgrams in engineering texts 113
- Keywords and phrases in political speeches 127
- Key words and key phrases in a corpus of travel writing 147
- History v. marketing 169
- Metaphorical keyness in specialised corpora 185
Section III. Critical and educational perspectives
- A contrastive analysis of keywords in newspaper articles on the “Kyoto Protocol” 207
- Keywords in Korean national consciousness 219
- General spoken language and school language 235
- Index 249