Series
Studies in Corpus Linguistics
Book
Volume 121 in this series
The present volume is intended as a reference book on Wikipedia corpus studies, from corpus construction to exploration and analysis. Wikipedia is a complex object, difficult to manipulate for linguists and corpus researchers. In addition to the encyclopedic articles consulted by millions of users, it contains vast spaces of written discussions, aka talk pages, where Wikipedia authors negotiate the collaborative editing of articles, make evaluations, or discuss related topics. The proposed volume covers Wikipedia articles, their revision histories, and discussions, with a focus on discussions, which have not been studied extensively so far and have also been neglected in previous corpus building efforts. Wikipedia discussions are instances of computer-mediated communication (CMC), thus constituting a completely different, interaction-oriented linguistic genre. Sophisticated tools and methods of linguistic annotation and corpus exploration are needed to exploit the huge and valuable corpus resources that can be constructed from the Wikipedia discussions. The present volume aims at encouraging and facilitating Wikipedia corpus studies, providing standards, recommendations, and innovative methods to build and explore Wikipedia corpora, and presenting corpus studies that make the most of the peculiarities of Wikipedia.
Book
Volume 120 in this series
Discover the intricate dynamics of L2 prosody with this pioneering study, which examines how advanced learners from Czech, German, and Spanish backgrounds engage with British and American English intonation. By employing a multidimensional approach - spanning phonetic, phonological, discourse-pragmatic, and sociolinguistic perspectives - this book provides a comprehensive overview of L2 prosodic features, highlighting patterns of intonational phrasing, f0 range, and the use of tones and uptalk. Building on foundational works by Pierrehumbert, Mennen, and Gut, this work bridges significant gaps in the field by comparing different L1 and L2 varieties, integrating diverse linguistic variables, and proposing a multifactorial model of L2 prosody. Relevant for linguists, language educators, and researchers in SLA, the findings offer valuable insights for reducing foreign accents and enhancing intelligibility, making it an essential resource for improving language teaching methodologies and learner outcomes. Dive into this essential guide and elevate your understanding of L2 prosody and its impact on effective communication.
Book
Volume 119 in this series
This volume illustrates new trends in corpus linguistics and shows how corpus approaches can be used to investigate new datasets and emerging areas in linguistics and related fields. It addresses innovative research questions, for example how prosodic analyses can increase the accuracy of syntactic segmentation, how tolerant English language teachers are about language variation, or how natural language can be translated into corpus query language. The thematic scope encompasses four types of ‘boundary crossings’. These include the incorporation of innovative scientific methods, specifically new statistical techniques, acoustic analysis and stylistic investigations. Additionally, temporal boundaries are crossed through the use of new methods and corpora to study diachronic data. New methodologies are also explored through the analysis of prosody, variety-specific approaches, and teacher attitudes. Finally, corpus users can cross boundaries by employing a more user-friendly corpus query language.
Book
Volume 118 in this series
This book contributes to the discussion of challenges faced in different areas of corpus linguistics, namely the compilation, annotation, and analysis of linguistic corpora. In a field of growing corpus sizes and expanding possibilities of gathering data, some old issues persist, while at the same time new problems have emerged. As the compilation and study of language corpora become increasingly sophisticated and complex, continuous attention to ways of dealing with the data in question and to challenges in text selection and interpretation is needed. The contributions to this volume address problems relating to a variety of areas in corpus linguistic study, including corpus annotation, data variability, learner language, social media texts, and database utilization. The authors provide critical overviews and research-based analyses, discuss the nature of some of the common pitfalls, and offer solutions to existing problems.
Book
Volume 117 in this series
This book provides a comprehensive description of the situational and linguistic characteristics of undergraduate student writing, considering both assignment type and discipline. Drawing on a corpus of more than 900 undergraduate student assignments from four disciplinary groups (Arts and Humanities, Social Sciences, Physical Sciences, and Life Sciences), the book combines corpus-based analyses of linguistic features with analyses of communicative purposes and text characteristics. Variation in University Writing takes a new approach to register variation by grouping assignments by their communicative purpose (to argue, to explain, to compare, to describe, to narrate a personal event, to give a procedural recount, to give personal advice, and to propose), rather than register categories. A multidimensional analysis provides a detailed description of the linguistic patterns of undergraduate writing. The findings presented in this book will be of interest to teachers of writing, instructors of English for Academic Purposes (EAP), and researchers of university writing.
Book
Volume 116 in this series
This book provides a systematic, empirical account of the language typically presented in English as a Foreign Language (EFL) textbooks, based on a large corpus of EFL textbooks used in secondary schools. A modified version of the Multi-Dimensional Analysis (MDA) framework serves to examine linguistic variation both within textbooks and compared to corpora representing ‘real-life’ English as used outside the EFL classroom. The results highlight the characteristics of Textbook English that define it as a distinct variety of English. In light of the study's pedagogical implications, this book proposes a range of corpus-based approaches to improve the naturalness of textbook texts. It also contributes to advancing quantitative corpus linguistics methodology: its detailed online supplements aim for methodological transparency and reproducibility in line with the principles of Open Science. This book will be of interest to linguistics and language education students and researchers, as well as EFL teachers, textbook authors and editors, and those involved in curriculum development and teacher training.
Book
Volume 115 in this series
This book is an attempt to revisit the main specifically corpus-linguistic statistics/measures the field has been relying on for decades: frequency, dispersion, association, and keyness. The book first discusses the purpose of these measures and how they have been computed. Then, the book makes three main proposals: First, that many measures of dispersion, association, and keyness are too confounded with frequency, and that frequency can be 'taken out of them' to obtain conceptually cleaner and more interpretable measures. Second, that many existing measures can be replaced by the simple information-theoretic measure of the Kullback-Leibler divergence and that it, too, can have frequency 'removed' from it. Third, that corpus linguistics should abandon the tradition of trying to describe its findings with a single number and adopt a tupleization approach instead, where we use several separate dimensions of information for description and interpretation. The book is written in an informal, hands-on style and comes with its own R package featuring functions, example data, and several thousand lines of code exemplifying all applications.
Book
Volume 114 in this series
This book examines developments in the use of the present perfect and the preterite in Late Modern and contemporary English, with a focus on American and British English. Drawing on neo-Gricean pragmatics, it proposes a novel and principled analysis of the verb forms’ context-independent meanings and context-dependent inferences. State-of-the-art corpus linguistic methods are used to track their functional changes over two and a half centuries. The book presents new evidence of grammatical change and offers a compelling, contact-based account of regional variation. It brings together the insights of various fields, including formal semantics, historical linguistics, linguistic typology, and variationist sociolinguistics.
Book
Volume 113 in this series
Cross-linguistic research is a fruitful field of language inquiry that has benefited enormously from the use of corpora. As sources of linguistic data of various kinds and as tools for language processing, corpora have shaped the development of cross-linguistic research, enabling both language description and practical applications. This volume contains twelve studies that emphasize the usefulness and usability of parallel corpora in accurately exploring the structure and use of seven under-researched languages and language varieties. The first part emphasizes the role of corpus-based descriptive analyses at the lexicogrammatical and discursive levels, as a first step on the way towards concrete applications like translation or language teaching. The second part focuses on the role of parallel-corpus-based language processing techniques and applications that facilitate professional communication. This book will be of interest to scholars in contrastive linguistics, translation studies, discourse analysis, language teaching, and natural language processing.
Book
Volume 112 in this series
This book represents a detailed discussion and corpus analysis of Theme in English and German originals and translations. The empirical results are based on thousands of clauses from four different registers, cover a variety of linguistic aspects including multiple Themes, marked Themes, participant roles, agency, and identifiability, and are tested statistically using regression analyses. The book sheds light on one of the most elusive concepts of the systemic functional linguistics framework, Theme, by comparing it with different approaches, related concepts, and realizations in different languages and by examining empirically different Theme models, contrastive differences, and translation effects. Given that Theme in English and German is realized formally by being the first clause constituent and is thus, effectively, a syntactic phenomenon, this monograph is relevant not only for functional linguists, but also for anyone interested in English and German word order differences and their effects on translations.
Book
Volume 111 in this series
As the legislative bodies of democratic nations, parliaments play a fundamental role in society. Consequently, the linguistic practices observed in parliamentary discourse are of importance to everyone. This volume brings together leading researchers in areas of corpus linguistics, big data, parliamentary discourse, and historical linguistics in a truly interdisciplinary exploration at the vanguard of big data and corpus methods with the aim of investigating the intersection between linguistic and social change. Making use of both quantitative and qualitative methods, the studies included in this volume range from a focus on explicitly linguistic phenomena to topics that contribute to our understanding of language and society more generally. It breaks new ground in its critical reflection on the conceptual and methodological challenges of using large corpora of parliamentary discourse to study both the specialised language of parliamentary speech and the societies that the parliaments in question represent and govern.
Book
Volume 110 in this series
Corpus Dialectology combines the fields of corpus linguistics and dialectological mapping. It concerns documentation of linguistic variation and mapping of linguistic spaces and boundaries, while ascribing renewed importance to the methodology and the material itself, especially data processing and statistical analysis. This approach considers phenomena that have received little attention to date, such as migration, language contact, mobility and educational level, as well as the differentiation between rural and urban spaces. Transparently described and intersubjectively comprehensible encodings permit the enhancement of dialectometry in the context of Digital Humanities and further development of linguistic theories of variation and change, as well as different levels of structure (phonology, morphosyntax, semantics). This book contains nine chapters on ongoing corpus dialectological research projects. They discuss current issues of data collection, for example the validity of crowdsourced data, explore challenges and possibilities of data analysis and offer theoretical reflections on virtual Romance geolinguistics.
Book
Volume 109 in this series
Corpora and Rhetorically Informed Text Analysis explores applications of rhetorically informed approaches to corpus research. Bringing together contributions from scholars in a variety of fields, it takes up questions of how theories and traditions in rhetorical analysis can be integrated with corpus techniques in order to enrich our understanding of language use, variation, and history. The studies included in this volume shed light on areas as diverse as student academic writing, political discourse, and the digital humanities. These studies all make use of a dictionary-based tagger called DocuScope, which recognizes tens-of-millions of words and phrases and slots them into categories based on their rhetorical functions. While DocuScope provides a through-line that both links the studies’ various analytical procedures and primes their rhetorical insights, the volume is about more than the explanatory power of a single tool. It demonstrates how rhetorically informed approaches can complement more established corpus methodologies, underscoring their combined potential.
Book
Volume 108 in this series
This collected volume showcases cutting-edge research in the rapidly developing area of sign language corpus linguistics in various sign language contexts across the globe. Each chapter provides a detailed account of particular national corpora and methodological considerations in their construction. Part 1 focuses on corpus-based linguistic findings, covering aspects of morphology, syntax, multilingualism, and regional and diachronic variation. Part 2 explores innovative solutions to challenges in building and annotating sign language corpora, touching on the construction of comparable sign language corpora, collaboration challenges at the national level, phonological arrangement of digital lexicons, and (semi-)automatic annotation. This unique volume documenting the growth in breadth and depth within the discipline of sign language corpus linguistics is a key resource for researchers, teachers, and postgraduate students in the field of sign language linguistics, and will also provide valuable insights for other researchers interested in corpus linguistics, Construction Grammar, and gesture studies.
Book
Volume 107 in this series
This volume examines rhetorical conventions employed in mechanical engineering research to understand the knowledge-making principles of the discipline, as well as their expression within the research article. In particular, the study analyses the organisational patterns of mechanical engineering research articles using Swales’s conceptualisation of moves and steps. In addition, the research identifies the phraseology associated with specific moves and steps. The study draws on a corpus of 120 mechanical engineering research articles, equally distributed across two sub-disciplines (mechanical systems and thermal-fluids engineering), three research traditions (experimental, theoretical and mixed methods), and two publication periods (2002–2006 and 2012–2016). It adopts an integrated methodology, intertwining various approaches and perspectives including corpus linguistics, move analysis, discourse analysis and interviews to address two main strands of research enquiry: (i) What are the properties of the rhetorical structures in terms of range, frequency, and length for each section of mechanical engineering research articles? (ii) What effect does sub-discipline, research tradition and publication date have on the rhetorical structure of research articles?
Book
Volume 106 in this series
This book explores how language is used to create characters in fictional television series. To do so, it draws on multiple case studies from the United States and Australia. Brought together in this book for the first time, these case studies constitute more than the sum of their parts. They highlight different aspects of televisual characterisation and showcase the use of different data, methods, and approaches in its analysis. Uniquely, the book takes a mixed-method approach and will thus not only appeal to corpus linguists but also researchers in sociolinguistics, stylistics, and pragmatics. All corpus linguistic techniques are clearly introduced and explained, and the book is thus accessible to experienced researchers as well as novice researchers and students. It will be essential reading in linguistics, literature, stylistics, and media/television studies. Winner of the Screenwriting Research Network 2023 Best Monograph award!
Book
Volume 105 in this series
This volume presents a snapshot of the current state of the art of research in English corpus linguistics. It contains selected papers from the 40th ICAME conference in 2019 and features contributions from experts in synchronic, diachronic, and contrastive linguistics, as well as in sociolinguistics, phonetics, discourse analysis, and learner language. The volume showcases the particular strengths of research in the ICAME tradition. The papers in this volume offer new insights from the reanalysis of new data types, methodological refinements and advancements of quantitative analysis, and from taking new perspectives on ongoing debates in their respective fields.
Book
Volume 104 in this series
This volume illustrates the high potential of learner corpus investigations for research into the CAF triad by presenting eleven original learner corpus-based studies which are set within solid theoretical frameworks, examine learner corpora with state-of-the-art analytical techniques and yield highly interesting findings. The volume’s major strength lies in the range of issues it undertakes and in its interdisciplinary thematic novelty. The chapters collectively address all three dimensions of L2 performance related to different linguistic subsystems (i.e. lexical, phraseological and grammatical complexity and accuracy, along with fluency) as well as the interactions among these constructs. The studies are based on data drawn from carefully compiled learner corpora which are analysed with the help of diverse corpus-based methods. The theoretical discussions and the empirical results shall contribute to the advancement of the fields of SLA and writing and speech research and shall inspire further investigations in the area of the CAF triad.
Book
Volume 103 in this series
As the first collective volume to focus exclusively on corpus-based approaches to register variation, this book provides an exhaustive account of the range and depth of possibilities that the domain of register variation in English has to offer. It illustrates register variation analysis in different theoretical frameworks, such as Probabilistic Grammar, Systemic Functional Linguistics, and Information Theory, and proposes a new framework within the Text Linguistic Approach: the continuous-situational analytical framework. Several of the contributions apply Multi-Dimensional Analysis to corpus data in order to unveil register (dis)similarities, while others rely on logistic regression models and periodization techniques based on Kullback-Leibler divergence. The volume includes both inter-register and intra-register variation analysis of a wide spectrum of varieties, speakers and periods: British and American English, learner varieties, L2 varieties, and also contains diachronic studies covering early and late Modern English. This broad scope should be a source of inspiration for anyone interested in historical and ongoing register variation in a vast range of varieties of English worldwide.
Book
Volume 102 in this series
In over 30 years of data-driven learning (DDL) research, there has been a growing sophistication in the ways we collect, analyse, and put corpus data to use. This volume takes a three-fold perspective on DDL. It first looks at DDL and its role in informing language learning theory and how it might shed light on the language development process; secondly it addresses how DDL can help us characterise learner language and inform teaching accordingly, and thirdly it showcases practical applications for the use of DDL in classrooms. The contributors to this volume examine a variety of instructional settings and languages across the world. They reflect on theoretical, methodological and classroom implications using both novel and established language learning theories, natural language processing (NLP), longitudinal research designs, and a variety of language learning targets. The present volume is an invitation from some of the leading researchers in DDL to reflect on the research avenues that will define the field in the coming years.
Book
Volume 101 in this series
This volume comprises a collection of contrastive studies on language and time. Languages represented include Czech, French, German, Mandarin, Norwegian and Swedish, all of which are contrasted with English. While the amount of published research on temporal relations in general is considerable, less work has been carried out on comparing how we talk about time in various languages and how languages change over time. Several methodological challenges are addressed and solutions proposed, such as how to deal with poor quality historical data and how to identify n-grams in typologically different languages for purposes of comparison. The results of the various studies show how multilingual corpora can increase our knowledge of language-specific features as well as linguistic, typological and cultural differences and similarities across languages.
Book
Volume 100 in this series
This book takes an integrated approach to the fields of Corpus Linguistics, Construction Grammar, and World Englishes through a thorough constructional and corpus-based examination of the patterning of the versatile high-frequency verb make in British English and New Englishes. It contributes to Construction Grammar theory by adopting a verb-based, rather than construction-based, perspective on argument structure. This allows the probing of the interface between verb-independent generalizations and item-specificity from an underexplored angle that offers new insights into the shape of the constructicon. From a variationist perspective, it seeks to (i) identify features of New Englishes and gauge whether these features exhibit traces of conventionalization, and (ii) assess whether the degree of institutionalization of the New Englishes correlates with linguistic behavior, both from a social and cognitive perspective, thereby contributing to the budding effort to integrate the cognitive and social dimensions into the modeling of linguistic variation in World Englishes.
Book
Volume 99 in this series
Situated at the interface between corpus linguistics and Systemic Functional Linguistics, this volume focuses on conjunctive markers expressing contrast in English and French. The frequency and placement patterns of the markers are analysed using large corpora of texts from two written registers: newspaper editorials and research articles. The corpus study revisits the long-standing but largely unsubstantiated claim that French requires more explicit markers of cohesive conjunction than English and shows that the opposite is in fact the case. Novel insights into the placement preferences of English and French conjunctive markers are provided by a new approach to theme and rheme that attaches more importance to the rheme than previous studies. The study demonstrates the significant benefits of a combined corpus and Systemic Functional Linguistics approach to the cross-linguistic analysis of cohesion.
Book
Volume 98 in this series
From Twitter to Reddit, Facebook, and WhatsApp – social media is a part of modern everyday life. Studying the language used on social media platforms presents great opportunities as well as challenges to corpus linguists. The contributions in Corpus Approaches to Social Media address technical, ethical, and methodological issues by showcasing in-depth social media studies as conducted by corpus scholars. The chapters are based on a variety of social media platforms and include corpus perspectives on the language of online communities, linguistic variation in short media texts, and the role of images in computer-mediated communication. The detailed accounts of the methodological aspects of working with social media corpora are a particularly strong point of the collection. The volume features research applying traditional corpus linguistic methods to social media data as well as novel and innovative research methods for the analysis of multimodal material and atypical corpus texts.
Book
Volume 97 in this series
This volume provides a diachronic and synchronic overview of linguistic variability and change in involved, speech-related and spoken texts in English. While previous works on the topic have focused on more limited time periods, this book covers data from the 16th century up to the present day. The studies offer new insights into historical and present-day corpus pragmatics by identifying and exploring features of orality in a variety of registers. For readers who are new to the field, the range of approaches will provide a helpful overview; for readers who are already familiar with the field, the volume will shed light on the complexity of factors such as register, sociolinguistic variability and language attitude, thus making it a useful resource and stepping stone for further exploration. The volume celebrates the groundbreaking contributions of Professor Merja Kytö in making accessible speech-related corpus material and leading the way in its exploration.
Book
Volume 96 in this series
This book showcases eleven studies dealing with corpora and the changing society. The theme of the volume reflects the fact that changes in society lead to changes in language and vice versa. Focusing on the English language, be it from Old English to the present, or a shorter time span in the immediate past, the contributors in this volume use a variety of corpus methods to address the two patterns of change. The cross-fertilization of cultural studies and corpus linguistics, we hope, is beneficial for both parties, as corpus linguistics offers a vast array of materials and methods to investigate cultural and societal change, while cultural studies provide the theoretical background on which to build our research. The studies included in the present volume illustrate the potential avenues and the merits of combining changing language and changing societies.
Book
Volume 95 in this series
This volume showcases some of the latest research on academic writing by leading and up-and-coming corpus linguists. The studies included in the volume are based on a wide range of corpora spanning first and second language academic writing at different levels of writing expertise, containing texts from a variety of academic disciplines (and sub-disciplines) and of different academic registers. Particularly novel aspects of the collection are the inclusion of research that combines rhetorical moves with multi-dimensional analysis, studies that cover both fixed and variable phraseological items (lexical bundles, phrase-frames, constructions), and work that is based on corpora of English as an academic lingua franca. Going beyond merely summarizing their findings, the authors also discuss what their research means for academic writing practice and pedagogical settings. The volume will be of interest to researchers, students, and teachers who would like to expand their knowledge of how academic writing functions and what it looks like in a variety of contexts.
Book
Volume 94 in this series
What is the best way to analyze spontaneous spoken language? In their search for the basic units of spoken language the authors of this volume opt for a corpus-driven approach. They share a strong conviction that prosodic structure is essential for the study of spoken discourse and each bring their own theoretical and practical experience to the table. In the first part of the book they segment spoken material from a range of different languages (Russian, Hebrew, Central Pomo (an indigenous language from California), French, Japanese, Italian, and Brazilian Portuguese). In the second part of the book each author analyzes the same two spoken English samples, but looking at them from different perspectives, using different methods of analysis as reflected in their respective analyses in Part I. This approach allows for common tendencies of segmentation to emerge, both prosodic and segmental.
Book
Volume 93 in this series
This book explores the affordances of disciplinary corpora for the teaching and learning of the language of dentistry, within the field of English for Specific Academic Purposes (ESAP). We extract disciplinary register features and vocabulary from three key genres of the dentistry discipline (published experimental research articles, case reports, and novice/professional research reports within the Dental Public Health domain), before integrating these features into ESAP pedagogy in the form of corpus-based ESAP materials that promote student-led direct engagement with disciplinary corpora – an approach known as 'data-driven learning'. This book is a timely and relevant addition to the field of corpus linguistics and ESAP, and is especially targeted at ESAP professionals who are required to teach disciplinary discourses but who may struggle to know what to teach as non-experts of the target discipline.
Book
Volume 92 in this series
While native corpora and corpus linguistic tools and methods have been used and applied for quite some time in the development of learning and teaching materials, learner corpora are only just beginning to impact the field of language teaching, testing and assessment. This volume helps to close this still existing gap and highlights the great potential of learner corpus research for language pedagogy by presenting a selection of 11 original studies on learner corpora, conducted by established experts as well as by excellent young researchers. The papers included in the volume present new corpora and methods; studies on written as well as spoken learner corpora and on using data-driven learning scenarios in the classroom.
All papers include sections on practical and concrete language-pedagogical applications. This volume will be of significant interest to researchers working in corpus linguistics, learner corpus research, second language acquisition and English for Academic and Specific Purposes, as well as to language teachers and materials developers.
Book
Volume 91 in this series
This volume provides a comprehensive overview of the research carried out over the past thirty years in the vast field of legal discourse. The focus is on how such research has been influenced and shaped by developments in corpus linguistics and register analysis, and by the emergence from the mid-1990s of historical pragmatics as a branch of pragmatics concerned with the scrutiny of historical texts in their context of writing. The five chapters in Part I (together with the introductory chapter) offer a wide spectrum of the latest approaches to the synchronic analysis of cross-genre and cross-linguistic variation in legal discourse. Part II addresses diachronic variation, illustrating how a diversity of methods, such as multi-dimensional analysis, move analysis, collocation analysis, and Darwinian models of language evolution can uncover new understandings of diachronic linguistic phenomena.
Recipient of the 2021 Book Award from the Spanish Association for Applied Linguistics (AESLA).
Book
Volume 90 in this series
This volume assesses the state of the art of parallel corpus research as a whole, reporting on advances in both recent developments of parallel corpora – with some particular references to comparable corpora as well – and in ways of exploiting them for a variety of purposes. The first part of the book is devoted to new roles that parallel corpora can and should assume in translation studies and in contrastive linguistics, to the usefulness and usability of parallel corpora, and to advances in parallel corpus alignment, annotation and retrieval. There follows an up-to-date presentation of a number of parallel corpus projects currently being carried out in Europe, some of them multimodal, with certain chapters illustrating case studies developed on the basis of the corpora at hand. In most of these chapters, attention is paid to specific technical issues of corpus building. The third part of the book reflects on specific applications and on the creation of bilingual resources from parallel corpora. This volume will be welcomed by scholars, postgraduate and PhD students in the fields of contrastive linguistics, translation studies, lexicography, language teaching and learning, machine translation, and natural language processing.
Book
Volume 89 in this series
This monograph describes the development of Rhapsodie, a 33,000-word syntactic and prosodic treebank of spoken French created with the aim of modeling the interface between prosody, syntax and discourse in spoken French. Theoretical foundations and methodological choices are presented and discussed, and compared with other contemporary approaches. Why is a data-driven instead of a corpus-based approach necessary when one wants to model and analyze discourse without neglecting the features typical of everyday speech, in order to capture not only what we say but also how we say it? How can one show that verbal exchange operates as a collaborative enterprise and how can the specific syntactic and prosodic markers of this collaboration be merged? The description proposed in this collective book is of interest for specialists of spoken French studies, and also for scholars who would like to extend Rhapsodie-like annotation schemes to other languages.
Book
Volume 88 in this series
Corpus linguistics has become one of the most widely used methodologies across the different linguistic subdisciplines; especially the study of world-wide varieties of English uses corpus-based investigations as one of the chief methodologies. This volume comprises descriptions of the many new corpus initiatives both within and outside Africa that aim to compile various corpora of African Englishes. Moreover, it contains cutting-edge corpus-based research on African Englishes and the use of corpora in pedagogic contexts within African institutions. This volume thus serves both as a practical introduction to corpus compilation (Part I of the book), corpus-based research (Part II) and the application of corpora in language teaching (Part III), and is intended both for those researchers not yet familiar with corpus linguistics and as a reference work for all international researchers investigating the linguistic properties of African Englishes.
Book
Volume 87 in this series
With an ever-growing body of corpus linguistic tools, resources and applications, it becomes increasingly important to reflect critically on the underlying assumptions that corpus linguistics is based on. Focusing on meaning and methods, this book tackles fundamental concepts and approaches that define the discourse of the field. Internationally renowned contributors address topics that range from the history of corpus linguistics to contrastive perspectives between languages, to interpreting patterns in corpora as evidence of both mainstream discourses and individual voices within them. This collection not only adds to our understanding of the fundamentals of corpus linguistics, it also brings innovative meanings to the corpus linguistics discourse. It has been edited in honour of Wolfgang Teubert, who for decades has been a significant voice in this discourse.
Book
Volume 86 in this series
Focusing on the multi-faceted topic of Eurolects, this volume brings together knowledge and methodologies from various disciplines, including sociolinguistics, legal linguistics, corpus linguistics, and translation studies. The legislative varieties of eleven EU official and working languages (Dutch, English, Finnish, French, German, Greek, Italian, Latvian, Maltese, Polish, Spanish) are analyzed using corpus methodologies in order to investigate the variational dynamics and translation-induced patterns of the different languages. The underlying assumption is that, within the sociolinguistic continua of the EU languages, it is possible to single out specific legislative varieties (Eurolects) that originate at a supra-national level. This research hypothesis is strongly supported by the empirical findings derived from detailed corpus analyses of each language. This work represents the first systematic and comprehensive linguistic research conducted on a wide range of EU languages using the same protocol and applying corpus methodologies to the extensive Eurolect Observatory Multilingual Corpus.
Book
Volume 85 in this series
This volume provides a state-of-the-art overview of the intersecting fields of corpus linguistics, historical linguistics, and genre-based studies of language usage. Papers in this collection are devoted to presenting relevant methods pertinent to corpus-based studies of the connection between genre and language change, linguistic changes that occur in particular genres, and specific diachronic phenomena that are influenced by genre factors to greater and lesser degrees. Data are drawn from a number of languages, and the scope of the studies presented here is both short- and long-term, covering cases of recent change as well as more long-term alterations.
Book
Volume 84 in this series
This book introduces a methodology and research tool (DART) that make it possible to carry out advanced corpus pragmatics research using dialogue corpora enriched with pragmatics-relevant annotations. It first explores the general use of spoken corpora for pragmatics research, as well as issues revolving around their representation and annotation, and then goes on to describe the resources required for such an annotation process. Based on data from three different corpora, ranging from highly constrained, task-oriented, ones (SPAADIA Trainline & Trains 93) to unconstrained dialogues (Switchboard), it next presents an in-depth discussion and illustration of the potential contributions of syntax, semantics, and semantico-pragmatics towards pragmatic force. This is followed by a description of the largely automatic annotation process itself, and finally an analysis of how a set of more than 110 potential speech acts defined in DART contributes towards establishing the specific communicative characteristics of the three corpora.
Book
Volume 83 in this series
This monograph deals with variable tag questions. These are utterances with a variable interrogative tag, like It's peculiar writing, isn't it, and the semi-variable tag innit, such as Nice, innit. The aim is to provide a corpus-based, comprehensive semantic-pragmatic typology of British English tag questions. Compared to existing descriptions, the proposed typology is novel in three ways. Firstly, whereas almost all existing typologies are single-layered classifications, the functions of tag questions are categorized into two parallel dimensions of interpersonal meaning: the speech function and the stance layer. Secondly, semantic generalizations are proposed for clusters of grammatical, intonational and conversational properties. Thirdly, the bottom-up description is based on a sizeable amount of authentic, spontaneous conversations, which are analysed both qualitatively and quantitatively.
Book
Volume 82 in this series
The use of corpora has conventionally been envisioned as being either corpus-based or corpus-driven. While the formal definition of the latter term has been widely accepted since it was established by Tognini-Bonelli (2001), it is often applied to studies that do not, in fact, fulfil the fundamental requirement of a theory-neutral starting point. This volume proposes the term pattern-driven as a more precise alternative. The chapters illustrate a variety of methods that fall under this broad methodology, such as the extraction of lexical bundles, POS-grams and semantic frames, and demonstrate how these approaches can uncover new understandings of both synchronic and diachronic linguistic phenomena.
Book
Volume 81 in this series
This book is a research monograph divided into two parts. The first part describes the methods used to build the first sizeable corpus of informal conversational data collected from bilingual speakers of Welsh and English: Siarad. The second part describes the linguistic analysis of data from this corpus (available at bangortalk.org.uk). The information in Part One will be useful as a ‘how to’ manual on building a bilingual spoken corpus, including methods of data collection, transcription, glossing and analysis. The findings reported in Part Two throw new light on the debate regarding code-switching vs. borrowing, the application of the Matrix Language Framework (MLF) to the grammar of Welsh-English code-switching, the extralinguistic factors influencing variation in quantity of code-switching, and the extent to which the grammar of Welsh is changing in contact with English. Additional findings by other researchers using the corpus are also reported, and possible future directions are discussed.
Book
Volume 80 in this series
Language Acquisition in CLIL and Non-CLIL Settings builds a bridge between Second Language Acquisition and Learner Corpus Research (LCR) methodologies to take the evaluation of Content and Language Integrated Learning (CLIL) to a new level. The study innovates in two main ways. First, it is based on a highly diversified L2 database which includes learner corpus data as well as experimental data from the same learners. These linguistic components of the database are complemented with extensive information on learner variables, including cognitive and affective factors, which are rarely studied in LCR. Second, the study relies on multifactorial statistical analyses to assess the effectiveness of CLIL itself as well as the impact of the selectivity inherent in the CLIL system, which has frequently been ignored. The linguistic focus of the study is the English passive, which is investigated in CLIL and non-CLIL teaching materials, and subsequently related to learner output.
Book
Volume 79 in this series
Published in 2005, Michael Hoey’s Lexical Priming – A new theory of words and language introduced a completely new theory of language based on how words are used in the real world. In the ten years since, the theory has gained traction in the field of corpus linguistics. This volume brings together some of the most important contributions to the theory, in areas such as language teaching and learning, discourse analysis, stylistics as well as the design of language learning software. Crucially, this book introduces aspects of the language that have so far been given less focus in lexical priming, such as spoken language, figurative language, forced primings, priming as predictor of genre, and historical primings. The volume also focuses on applying the lexical priming theory to languages other than English, including Mandarin Chinese and Finnish.
Book
Volume 78 in this series
The aim of this book is to present a comprehensive picture of the current state of Spanish learner corpus research (SLCR), which makes it unique, since no other monograph has focused on collecting research dealing with learner corpora of any language other than English. In addition to an introductory appraisal of current SLCR, as well as a wake-up call reminding us that learner corpus design still needs to be improved, this volume features a selection of original studies ranging from general issues concerning learner corpora compilation to more specific aspects such as phonetic, lexical, grammatical and pragmatic features of the interlanguage of learners of Spanish, as reflected in corpus data. This volume will undoubtedly be of significant interest to researchers involved in corpus linguistics, second language acquisition research, as well as to professionals in the field of Spanish as a second language, including teachers, and creators and publishers of teaching materials.
Book
Volume 77 in this series
This book examines delexical verb + noun collocations such as make a decision, give rise to and take care of in Swedish and Chinese learner English. Using a methodological framework that combines learner corpus research with a contrastive perspective, the study is one of the very few in the field to incorporate corpora of the learner’s L1 to investigate the effects of L1 influence. The book provides a highly detailed and multi-faceted analysis of delexical verb + noun collocations in terms of frequency of occurrence, lexical preferences and morphosyntactic patterns. Quantitative and qualitative results on overuse, underuse and errors are presented with linguistically and pedagogically relevant interpretations that include cultural and discourse aspects. More importantly, the book throws light on how L2 learners may alternate between the open-choice principle and the idiom principle as well as the extent and nature of L1 influence on their collocational use.
Book
Volume 76 in this series
Discourse Reflexivity in Linear Unit Grammar: The case of IMDb message boards represents a significant landmark. Not only is it the first in-depth corpus-based study to be based on Linear Unit Grammar, it is also the first study to present a unified model of both Linear Unit Grammar and Linear Unit Discourse Analysis. To illustrate this model, the book focuses on the role of discourse reflexivity in the linear structure of online message board discourse from the Internet Movie Database (IMDb) webpage. It is shown that discourse reflexivity plays a central role in the linear structure and antagonism characteristic of this type of discourse. This book will particularly appeal to those who have an interest in carrying forward the innovations in the description of grammar, lexis and discourse proposed by John Sinclair in his lifetime as well as to those with a specific interest in discourse reflexivity and computer-mediated communication.
Book
Volume 75 in this series
Emotive Interjections in British English: A corpus-based study on variation in acquisition, function and usage constitutes the first in-depth corpus-based study on the use of emotive interjections in Present Day British English. In a novel approach, it systematically distinguishes between child and adult speakers, providing new insights into how they use Ow!, Ouch!, Ugh!, Yuck!, Whoops!, Whoopsadaisy! and Wow! in everyday spoken language. It studies in detail their acquisition by children and pinpoints changes and developments in their use throughout early childhood. The study highlights particularities displayed by child and adult speakers in general and identifies crucial differences regarding how adults use emotive interjections depending on whether they are interacting with children or other adults. This book thus offers an exhaustive overview on the functions of emotive interjections based on thorough empirical research and will appeal to linguists concerned with pragmatics, child language acquisition, the expression of emotion and interjections.
Book
Volume 74 in this series
A full-length study of monocollocable words, i.e. words whose usage is severely restricted to one or a few combinations only (such as English ado in without much/further ado), that brings together corpus-based data from the four languages along with studies analysing, along both general and language-specific lines, monocollocable words in terms of their frequency, lexical as well as morphosyntactic behaviour, and various facets of their peripheral status. Each of the four languages covered, namely English, Italian, German and Czech, also offers a short introduction of the respective languages written in English, Italian, German and Czech. A rare contribution to our knowledge of an as yet little studied field, the book will attract the attention of, and stimulate a new interest in, all who are ready to acknowledge that collocation is a core phenomenon of language – lexicologists, lexicographers with a focus on phraseology, language typologists, linguists with a contrastive and historical agenda, and language teachers alike.
Book
Volume 73 in this series
Corpus linguistics has had a revolutionary impact on grammar and discourse research. Not only has it opened up entirely new theoretical perspectives and methodological possibilities for both fields, but it has also to a considerable extent erased the boundaries that have traditionally been drawn between them. This book showcases a variety of current corpus-based approaches to the study of grammar and discourse, and makes a case for seeing grammar and discourse as fundamentally inter-related phenomena. The book features contributions from leading experts in cognitive linguistics, construction grammar, critical discourse studies, genre and register analysis, phraseology, language learning and teaching, languages for specific purposes, second language acquisition, sociolinguistics, systemic functional linguistics and text linguistics. An essential reference point for future research, Corpora, Grammar and Discourse has been edited in honour of Susan Hunston, whose own work has consistently pushed at the boundaries of corpus-based research on grammar and discourse for over three decades.
Book
Volume 72 in this series
The Discourse of Nurse-Patient Interactions: Contrasting the communicative styles of U.S. and international nurses is the first book to quantitatively examine a wide range of linguistic features in a corpus of interactions between nurses and standardized patients. The main goal of this book is to compare the discourse of U.S. (L1 English speaking) and international (L2 English speaking) nurses. The research design relies on a mixed method approach, including both quantitative and qualitative discourse analysis of lexico-grammatical, interactional, prosodic, fluency, and non-verbal features; assessments of interactional effectiveness; and qualitative interviews with nurses. The book offers a detailed description of the situational characteristics of the interactions and compares the discourse of nurses and patients in order to contextualize differences in the communicative styles of the two nurse groups. The results provide new insight into the way that sociocultural and linguistic aspects of nurse discourse contribute to the delivery of patient-centered care.
Book
Volume 71 in this series
Linguistic Variation in Research Articles investigates the linguistic characteristics of academic research articles, going beyond a traditional analysis of the generically-defined research article to take into account varied realizations of research articles within and across disciplines. It combines corpus-based analyses of 70+ linguistic features with analyses of the situational, or non-linguistic, characteristics of the Academic Journal Registers Corpus: 270 research articles from 6 diverse disciplines (philosophy, history, political science, applied linguistics, biology, physics) and representing three sub-registers (theoretical, quantitative, and qualitative research). Comprehensive analyses include a lexical/grammatical survey, an exploration of structural complexity, and a Multi-Dimensional analysis, all interpreted relative to the situational analysis of the corpus. The finding that linguistic variation in research articles does not occur along a single parameter like discipline is discussed relative to our understanding of disciplinary practices, the multidimensional nature of variation in research articles, and resulting methodological considerations for corpus studies of disciplinary writing.
Book
Volume 70 in this series
The aim of this volume is to highlight the benefits and potential of using learner corpora for the testing and assessment of L2 proficiency in both speaking and writing, reflecting the growing importance of learner corpora in applied linguistics and second language acquisition research. Identifying several desiderata for future research and practice, the volume presents a selection of original studies, covering a variety of different languages. It features studies that present very thoroughly compiled new corpus resources which are tailor-made and ready for analysis in language testing and assessment (LTA), new tools for the automatic assessment of proficiency levels, and new methods of (self-)assessment with the help of learner corpora. Other studies suggest innovative research methodologies for operationalizing proficiency through learner corpus data. The volume is of particular interest to researchers in (applied) corpus linguistics, learner corpus research, language testing and assessment, as well as to materials developers and language teachers.
Book
Volume 69 in this series
In recent years, corpora have found their way into language instruction, albeit often indirectly, through their role in syllabus and course design and in the production of teaching materials and other resources. An alternative and more innovative use is for teachers and students alike to explore corpus data directly as part of the learning process. This volume addresses this latter application of corpora by providing research insights firmly based in the classroom context and reporting on several state-of-the-art projects around the world where learners have direct access to corpus resources and tools and utilize them to improve their control of the language systems and skills or their professional expertise as translators. Its aim is to present recent advances in data-driven learning, addressing issues involving different types of corpora, for different learner profiles, in different ways for different purposes, and using a variety of different research methodologies and perspectives.
Book
Volume 68 in this series
This volume presents new findings based on the analysis of spoken corpora in thirteen different Afro-Asiatic languages – a unique endeavor in the domain of lesser-described languages. It will be of interest to corpus linguists, general linguists, typologists, and linguists specializing in Afro-Asiatic languages. In addition to the rarity of corpus studies based on endangered and lesser-described languages, the volume is remarkable due to its focus on the role of prosody in interaction with several other phenomena, including code-switching and borrowing. Phonology, syntax, and information structure are explored, and the issue of the elaboration of strategies for the typological comparison of corpora is addressed in several papers. The volume also contains a presentation of software development conducted within the scope of the CorpAfroAs project and based upon the widely used ELAN. The sound-indexed, and morphosyntactically-annotated corpora, with their OLAC metadata and several other deliverables can be accessed and searched at http://dx.doi.org/10.1075/scl.68.website.
Book
Volume 67 in this series
The contributions to this volume apply and extend the techniques of corpus linguistics and diachronic linguistics to the challenge of describing and explaining grammatical change in varieties of English world-wide. The book is divided into two parts, with ten chapters on ‘Inner Circle’ varieties such as Australian, Canadian, and Irish English, and eight on ‘Outer Circle’ varieties such as Philippine, Indian, and Nigerian English. Contributors examine a range of topics including the progressive aspect, modal auxiliaries, do-support, verb morphology, and quotatives, using a wide variety of corpus resources. Overarching research questions addressed include the following: Do diachronic tendencies observed in a particular variety converge with, diverge from, or run in parallel with, those in the parent variety? What are the possible causes of changes observed (e.g. English teaching traditions, Americanisation, internal changes in registers)? This book will appeal to linguists, particularly those interested in grammatical description, corpus linguistics and World Englishes.
Book
Volume 66 in this series
This volume comprises nine contributions that were written by up-and-coming corpus-based researchers with varied areas of expertise, who were all disciples of Douglas Biber sometime in the past two decades. These papers cover a wide variety of linguistic analyses and describe the principles of the Flagstaff school: a careful procedure for language corpora collection with special consideration for corpus size, representativeness, sampling and systematic analysis; the use of computer programming abilities that allow the posing of corpus-based research questions never asked before; and a strong emphasis on the combination of quantitative methods based on sound and innovative statistical procedures complemented with comprehensive qualitative functional analyses of the language. This volume has been edited in honor of Douglas Biber, a pioneer of the American school of corpus-based research.
Book
Volume 65 in this series
This book presents an investigation of lexical bundles in native and non-native scientific writing in English, whose aim is to produce a frequency-derived, statistically- and qualitatively-refined list of the most pedagogically useful lexical bundles in scientific prose: one that can be sorted and filtered by frequency, key word, structure and function, and includes contextual information such as variations, authentic examples and usage notes. The first part of the volume discusses the creation of this list based on a multimillion-word corpus of biomedical research writing and reveals the structure and functions of lexical bundles and their role in effective scientific communication. A comparative analysis of a non-native corpus highlights non-native scientists’ difficulties in employing lexical bundles. The second part of the volume explores pedagogical applications and provides a series of teaching activities that illustrate how EAP teachers or materials designers can use the list of lexical bundles in their practice.
Book
Volume 64 in this series
This book focuses on binomials (word pairs such as heart and soul, rich and poor, or if and when), and in particular on the degree of reversibility that English binomials demonstrate. Detailed and innovative corpus linguistic analyses investigate the correlates of the degree of reversibility, linguistic constraints that influence the ordering and reversibility of binomials and the diachronic development of reversibility. In addition, judgment data are analyzed for their convergence and divergence with corpus data regarding degrees of reversibility. The book thus establishes reversibility as a complex characteristic of the binomial construction, at the same time throwing light on general questions in phraseology, lexicalization, language structure and language processing.
Book
Volume 63 in this series
The studies in this volume approach English grammatical patterns in novel ways by interrogating corpora, focusing on patterns in the verb phrase (tense, aspect and modality), the noun phrase (intensification and focus marking), complementation structures and clause combining. Some studies interrogate historical corpora to reconstruct the diachronic development of patterns such as light verb constructions, verb-particle combinations, the be a-verbing progressive and absolute constructions. Other studies analyse synchronic datasets to typify the functions in discourse of, amongst others, tag questions and it-clefts, or to elucidate some long-standing problems in the syntactic analysis of verbal or adjectival complementation patterns, thanks to the empirical detail only corpora can provide. The volume documents the practices that have been developed to guarantee optimal representativeness of corpus data, to formulate definitions of patterns that can be operationalized in extractions, and to build dimensions of variation such as text type and register into rich grammatical descriptions.
Book
Volume 62 in this series
The English language is changing every day and it is us – the individual speakers and writers – that drive those changes in small ways by choosing to use certain strings of words over others. This book discusses and describes some of the choices made by speakers from South Korea by examining the similarities and differences between two Korean communities: one in England and one in South Korea. The book has two overall aims. Firstly, it is intended to begin a discussion about phraseology and Lexical Priming and how these theoretical concepts relate and play out in the context of a New English. Secondly, it provides a model of how a language variety can be explored by detailed analysis of short strings. It delves into a range of areas from World Englishes to phraseology and formulaic language and would be suitable for students, teachers and researchers in all these areas.
Book
Volume 61 in this series
The authors of this book share a common interest in the following topics: the importance of corpora compilation for the empirical study of human language; the importance of pragmatic categories such as emotion, attitude, illocution and information structure in linguistic theory; and a passionate belief in the central role of prosody for the analysis of speech. Four distinct sections (spoken corpora compilation; spoken corpora annotation; prosody; and syntax and information structure) give the book the structure in which the authors present innovative methodologies that focus on the compilation of third generation spoken corpora; multilevel spoken corpora annotation and its functions; and additionally a debate is initiated about the reference unit in the study of spoken language via information structure. The book is accompanied by a web site with a rich array of audio/video files. The web site can be found at the following address: DOI: 10.1075/scl.61.media
Book
Volume 60 in this series
Approximately a quarter of a century ago, the Multi-Dimensional (MD) approach—one of the most powerful (and controversial) methods in Corpus Linguistics—saw its first book-length treatment. In its eleven chapters, this volume presents all new contributions covering a wide range of written and spoken registers, such as movies, music, magazine texts, student writing, social media, letters to the editor, and reports, in different languages (English, Spanish, Portuguese) and contexts (engineering, journalism, the classroom, the entertainment industry, the Internet, etc.). The book also includes a personal account of the development of the method by its creator, Doug Biber, an introduction to MD statistics, as well as an application of MD analysis to corpus design. The book should be essential reading to anyone with an interest in how texts, genres, and registers are used in society, what their lexis and grammar look like, and how they are interrelated.
Book
Volume 59 in this series
This book is a critical appraisal of recent developments in corpus linguistics for the analysis of written and spoken learner data. The twelve papers cover an introductory critical appraisal of learner corpus data compilation and development (section 1); issues in data compilation, annotation and exchangeability (section 2); automatic approaches to data identification and analysis (section 3); and analysis of learner corpus data in the light of recent models of data analysis and interpretation, especially recent automatic approaches for the identification of learner language features (section 4). This collection is aimed at students and researchers of corpus linguistics, second language acquisition studies and quantitative linguistics. It will significantly advance learner corpus research in terms of methodological innovation and will fill an important gap in the development of multidisciplinary approaches to learner corpus studies.
Book
Volume 58 in this series
Combining the fields of phraseology and contrastive analysis, this book describes how patterns, defined as recurrent word-combinations with semantic unity, behave cross-linguistically. As the contrastive approach adopted in the book relies on translations and a bidirectional corpus model, the first part offers an in-depth discussion of contrastive linguistics, with special emphasis on using translations as tertium comparationis and a parallel corpus as the main source of material. Central to the contrastive analysis is the use of corpus-linguistic methods in the identification of patterns, while a deeper understanding of the phraseological nature of the patterns is closely related to the concept of extended units of meaning. The second part of the book presents five case studies, using an easy-to-follow step-by-step method to illustrate the phraseological-contrastive approach at work. The studies show that patterns weave an intricate web of meanings across languages and demonstrate the potential of exploring patterns in contrast.
Book
Volume 57 in this series
A hallmark of corpus linguistics is the study of patterns of language use. The studies presented in this volume all use corpora to investigate patterns of lexis from various perspectives. The first section, “Sequence and Order”, presents theoretical and practical aspects of the linguist’s task of uncovering the principles that determine such patterns. The next section, “Competing Constructions”, discusses the relationship between lexical patterns with similar meanings in the light of diachronic, regional and register variation. New developments in terms of lexicogrammatical meaning and patterning are dealt with in the section “Emerging Patterns”. The final section, “Correlating patterns and meaning”, discusses ways in which meaning can be studied in corpus data despite the lack of narrowly defined search terms. Though situated at different points on a continuum between lexical and grammatical emphasis, the studies all confirm the inseparability of lexis and grammar.
Book
Volume 56 in this series
The corpus-based studies in this volume explore biomedical research writing in English from a variety of perspectives. The articles in this collection delve into the lexicographic issues involved in building an electronic database of collocations and lexical bundles, offer insight on the teaching and learning of prototypical multiword units of meaning in biomedical discourse, and view written scientific English through the lens of such diverse fields as phraseology, metaphor, gender and discourse analysis. The research presented in this book forms the theoretical and methodological foundation of SciE-Lex, a lexical database of collocations and prefabricated expressions designed to help scientists write scientific papers in English accurately. The concluding chapter on FrameNet addresses frame semantics, whose application to the cross-linguistic study of scientific language will open new and promising avenues of research in the study of specialized languages.
Book
Volume 55 in this series
This work is designed, firstly, to both provoke theoretical discussion and serve as a practical guide for researchers and students in the field of corpus linguistics and, secondly, to offer a wide-ranging introduction to corpus techniques for practitioners of discourse studies. It delves into a wide variety of language topics and areas including metaphor, irony, evaluation, (im)politeness, stylistics, language change and sociopolitical issues. Each chapter begins with an outline of an area, followed by case studies which attempt both to shed light on particular themes in this area and to demonstrate the methodologies which might be fruitfully employed to investigate them. The chapters conclude with suggestions on activities which the readers may wish to undertake themselves. An Appendix contains a list of currently available resources for corpus research which were used or mentioned in the book.
Book
Volume 54 in this series
Contrastive studies have experienced a dramatic revival in the last decades. By combining the methodological advantages of computer corpus linguistics and the possibility of contrasting texts in two or more languages, the structure and use of languages can be explored with greater accuracy, detail and empirical strength than before. The approach has also proved to have fruitful practical applications in a number of areas such as language teaching, lexicography, translation studies and computer-aided translation. This volume contains twelve studies comparing linguistic phenomena in English and seven other languages. The topics range from comparisons of specific lexical categories and word combinations to syntactic constructions and discourse phenomena such as cohesion and thematic structure. The studies highlight similarities and differences in the use, semantics and functions of the compared items, as well as the emergence of new meanings and language change. The emphasis varies from purely linguistic studies to those focusing on practical applications.
Book
Volume 53 in this series
This book takes a new and holistic approach to fluency in English speech and differentiates between productive, perceptive, and nonverbal fluency. The in-depth corpus-based description of productive fluency points out major differences in how fluency is established in native and nonnative speech. It also reveals areas in which even highly advanced learners of English still deviate strongly from the native target norm and areas in which they have already approximated it. Based on these findings, selected learners are subjected to native speakers' ratings of seven perceptive fluency variables in order to test which variables are most responsible for a perception of oral proficiency on the part of the listeners. Finally, language-pedagogical implications derived from these findings for the improvement of fluency in learner language are presented. This book is conceptually and methodologically relevant for corpus linguistics, learner corpus research and foreign language teaching and learning.
Book
Volume 52 in this series
These specially-commissioned studies cover corpus-informed approaches to researching, teaching and learning English for Specific Purposes (ESP). The corpora used range from very large published corpora to small tailor-made collections of written and spoken text, as well as parallel and contrastive corpora, in both the hard and softer sciences. Designed to tackle the problems faced by a variety of first- and second-language ESP users (specialised translators, undergraduates, junior and experienced researchers, and language trainers), the breadth of approaches enables treatment of issues central to ESP and corpus research, from corpus compilation and analysis to new applications and data-driven learning. The first full-length book on applied corpus use in France, Corpus-Informed Research and Learning in ESP will be of interest not only to those working in the French context, but to a wide variety of language professionals – teachers, researchers or course designers – in many countries looking at ESP from different linguistic, cultural and educational perspectives.
Book
Volume 51 in this series
This is a comprehensive guidebook to the quantitative methods needed for Corpus-Based Translation Studies (CBTS). It provides a systematic description of the various statistical tests used in Corpus Linguistics which can be used in translation research. In Part 1, Theoretical Explorations, the interplay between quantitative and qualitative methodologies is explored. Part 2, Essential Corpus Studies, describes how to undertake quantitative studies, with a suitable level of technical detail and relevant case studies. Part 3, Quantitative Explorations of Literary Translations, looks at translations of classic works by Cao Xueqin, James Joyce and other authors. Finally, Part 4 on Translation Lexis uses a variety of techniques new to translation studies, including multivariate analysis and game theory. This book is aimed at students and researchers of corpus linguistics, translation studies and quantitative linguistics. It will significantly advance current translation studies in terms of methodological innovation and will fill an important gap in the development of quantitative methods for interdisciplinary translation studies.
Book
Volume 50 in this series
This book brings together a variety of approaches to English corpus linguistics and shows how corpus methodologies can contribute to the linking of diachronic and synchronic studies. The articles in this volume investigate historical changes in the English language as well as specific aspects of Middle and Modern English and, moreover, of English dialects. The contributions also discuss the development of English corpus linguistics generally and its potential in the future. Special focus is given to the continuity between Middle and Modern English – much in line with the linking in previous studies of Middle English and Old English under the generic term “medievalism”. This volume highlights the continual development of English from the medieval to modern period.
Book
Volume 49 in this series
This book describes new methodological and technological approaches to corpus building and presents recent research based on the Norwegian Newspaper Corpus. This is a large monitor corpus of contemporary Norwegian language, compiled through daily harvesting of web newspapers. The book gives an overview of the corpus and its system architecture, and presents tools used for tasks such as text harvesting, annotation, topic classification and extraction and frequency profiling of new words and phrases. Among the innovative technologies is Corpuscle, a corpus query engine and management system which is flexible enough to handle very large corpora in an efficient way. The individual research contributions based on the corpus explore different aspects of Norwegian, including the occurrence of anglicisms, neologisms and terminology, and the use of metonymy and metaphor in newspaper language. The book also describes an innovative method of applying correspondence analysis and implicational analysis to investigate interdependencies between morphosyntactic variants.
Book
Volume 48 in this series
Perspectives on Corpus Linguistics is a collection of interviews with fourteen well-known researchers in the field of linguistics. Each interview consists of a set of ten questions: the first seven are common to all contributors while the last three are connected to the research experience of each guest. In the general questions, the invited scholars explore (sometimes controversial) topics such as the concept of representativeness, the role of intuition and the status of Corpus Linguistics. In the specific questions, they provide a thorough discussion of materials and methods in corpus research as well as theoretical and applied perspectives on the use of corpora in language studies. The volume should be of interest to all those, whether experts or novices, who want to learn about corpus linguistics and carry out research in this fascinating and growing area.
Book
Volume 47 in this series
The present collection of articles represents research efforts in the field of specialised languages, including the analysis of research articles in disciplines as diverse as Biomedicine and Computing, on the one hand, and overlapping disciplines such as in Social Sciences, on the other, all with high relevance to English for Academic Purposes, and English for Specific Purposes. The volume offers empirical evidence obtained from corpus-based analyses of language, both from diachronic as well as synchronic perspectives, on topics such as the role of mother tongue in professional writing, the analysis of conference abstracts as a genre, or the analysis of visual data transfer. This collection addresses issues such as the implementation of lexicons for specialised language learning, and the development of ontologies to research language patterns. The volume thus provides a rich repertoire of research methodologies, in-depth analyses of specialised discourses, and the identification and discussion of relevant pedagogic issues. Winner of the 4th Edition of the 'Enrique Alcaraz Research Award'
Book
Volume 46 in this series
This book contains the first in-depth corpus-based description of structural nativization at the lexis-grammar interface in Indian English, the largest institutionalized second-language variety of English world-wide. For a set of three ditransitive verbs – give, send and offer – collocational patterns, verb-complementational preferences and correlations between collocational and verb-complementational routines are described. The present study is based on the comparison of the Indian and the British components of the International Corpus of English as well as a 100-million-word web-derived corpus of acrolectal Indian newspaper language and corresponding parts of the British National Corpus. The present corpus-based ‘thick description’ of lexicogrammatical routines provides new perspectives on the emergence of new routines and patternings in Indian English and is conceptually and methodologically relevant for research into varieties of English worldwide.
Book
Volume 45 in this series
The eleven contributions to this volume, written by expert corpus linguists, tackle corpora from a wide range of perspectives and aim to shed light on the numerous linguistic and pedagogical uses to which corpora can be put. They present cutting-edge research in the authors’ respective domains of expertise and suggest directions for future research. The main focus of the book is on learner corpora, but it also includes reflections on the role of other types of corpora, such as native corpora, expert users corpora, parallel corpora or corpora of New Englishes. For readers who are already familiar with corpora, this volume offers an informed account of the key role that corpus data play in applied linguistics today. As for readers who are new to corpus linguistics, the overview of approaches, methods and domains of applications presented will undoubtedly help them develop their own taste for corpora. This volume has been edited in honour of Sylviane Granger, who has been one of the pioneers of learner corpus research.
Book
Volume 44 in this series
The articles in this volume are intended to bridge what Sridhar and Sridhar (1986) have called the 'paradigm gap' between traditional SLA research on the one hand and research into institutionalised second-language varieties in former colonial territories on the other. Since both learner Englishes and second-language varieties are typically non-native forms of English that emerge in language contact situations, it is high time that they are described and compared on an empirical basis in order to draw conceptual and theoretical conclusions with regard to their form, function and acquisition. The present collection of articles places special emphasis on empirical evidence obtained from large-scale analyses of computerised corpora of learner Englishes (such as the International Corpus of Learner English) and of second-language varieties of English (such as the International Corpus of English). It addresses questions such as ‘Are the phenomena we find in ESL and EFL varieties features or errors?’ or ‘How common and wide-spread are features across contact varieties of English?’
Book
Volume 43 in this series
Primarily focused on idioms and other figurative phraseology, Colouring Meaning describes how the meanings of established phrases are enhanced, refocused and modified in everyday language use. Unlike many studies of creativity in language, this book-length survey addresses the matter at several levels, from the purely linguistic level of collocation, through its abstractions in colligation and semantic preference, to semantic prosody and connotation. This journey through both linguistic and cognitive levels involves the examination of habitual language and its exploitations, both mundane and colourful, explaining the phenomena observed in terms of current psycholinguistic research as well as corpus linguistics theory and analysis. The relationships between meaning in text and meaning in the mind are discussed at length and extensively illustrated with worked case studies to offer the reader a comprehensive overview of metaphorical and other secondary meanings as they emerge in real-world communicative situations.
Book
Volume 42 in this series
This is the first empirical study to focus on adjectives complemented by that-clauses. The in-depth analysis of more than 50,000 cases taken from the British National Corpus gives comprehensive insights into hitherto neglected relations of lexis and grammar. The result of this corpus-driven study is a novel classification of adjectives based on co-occurrence patterns and corroborated with the help of statistical means. The inductive analysis of corpus data offers new perspectives on and innovative descriptions of well-known phenomena of English grammar, such as extraposition or the resultative construction so…that. It is based on a new methodological approach, which looks at mutual relations of both lexis and grammar in unprecedented ways.
Book
Volume 41 in this series
This is corpus linguistics with a text linguistic focus. The volume concerns lexical inequality, the fact that some words and phrases share the quality of being key – and thereby reflect or promote important themes – in some textual contexts, while others do not. The patterning of words which differ in their centrality to text meaning is of increasing interest to corpus linguistics. At the same time software resources are yielding increasingly more detailed ways of identifying and studying the linkages between key words and phrases in text databases. This volume brings together work from some of the leading researchers in this field. It presents thirteen studies organized in three sections, the first containing a series of studies exploring the nature of keyness itself, then a set of five studies looking at keyness in specific discourse contexts, and then three studies with an educational focus.
Book
Volume 40 in this series
This volume offers a description and a deep examination of discourse genres across four disciplines (Psychology, Social Work, Industrial Chemistry, and Construction Engineering), in academic and professional settings. The study is based on one of the largest available corpora of disciplinary written discourse in Spanish (the PUCV-2006 Corpus of Spanish, containing almost 60 million words). Twelve chapters range from the theoretical guiding principles of the research in terms of genre conception, through the detailed description of each corpus (academic and professional) and computational analysis from multi-dimensional perspectives, to the qualitative analysis of two specialized genres (University Textbook and Disciplinary Text) in terms of their rhetorical macro-moves and moves. Theoretically speaking, a multi-dimensional perspective (social, linguistic and cognitive) is emphasized, and special attention is given to the cognitive nature of discourse genres.
Book
Volume 39 in this series
English causative constructions with cause, get, have and make are often mistakenly presented as (quasi-)synonymous and more or less interchangeable. This book demonstrates the value of corpus linguistics in identifying the syntactic, semantic, lexical and stylistic features that are distinctive for each of these constructions. It also underlines the usefulness of providing corpus studies with a solid theoretical foundation by showing how corpus linguistics can be fruitfully combined with cognitive linguistics, which is used both as a starting point for the analysis (top-down approach) and as a framework within which to interpret the corpus results (bottom-up approach). From a methodological point of view, the study illustrates the complementarity of corpus and elicitation data, and offers tools and methods that could be used to investigate other syntactic structures. Finally, the book also has a pedagogical dimension in that it examines how the research findings can be applied to foreign language teaching.
Book
Volume 38 in this series
Age is by far the most underdeveloped of the sociolinguistic variables in terms of research literature. To date, research on age has been patchy and has generally focused on the early life-stages such as childhood and adolescence, ignoring, for the most part, healthy adulthood as a stage worthy of scrutiny. This book examines the discourse of adulthood and accounts for sociolinguistic variation, with regard to age and gender, through the exploration of a 90,000-word age- and gender-differentiated spoken corpus of Irish English. The book explores both the distribution and use of a number of high-frequency pragmatic features of spoken discourse that appear as key items in the corpus. Part 1 of the book provides an introduction, a theoretical overview of age as a sociolinguistic variable and a description of how to compile a small spoken corpus for sociolinguistic research. Part 2 consists of five chapters which investigate and explore key features such as hedges, vague category markers, intensifiers, boosters and high-frequency items of taboo language in relation to the variables age and gender. The book is of interest to undergraduates or postgraduates taking formal courses in sociolinguistics, applied linguistics, pragmatics or discourse analysis. It is also of interest to students and researchers interested in using corpus linguistics in sociolinguistic research.
Book
Volume 37 in this series
Register Variation in Indian English constitutes the first large-scale empirical investigation of an international variety of English. Using a combination of the corpus compiled for this project and relevant sections of ICE-India as its database, this work tests existing descriptions and characterizations of English in India, and provides the first empirical account of register variation in Indian English (or indeed, any international variety of English). Included in this survey are linguistic features that have been examined before and others that have not. From an empirical standpoint, it comments on the process of Indianization of the English used in India. The book will be of interest to readers beyond specialists of Indian English as it is one of very few studies to undertake a large-scale corpus analysis for the purpose of dialect research. The book provides a model on which future studies of international Englishes can be based.
Book
Volume 36 in this series
This book explores a virtually untapped, yet fascinating research area: television dialogue. It reports on a study comparing the language of the American situation comedy Friends to natural conversation. Transcripts of the television show and the American English conversation portion of the Longman Grammar Corpus provide the data for this corpus-based investigation, which combines Douglas Biber’s multidimensional methodology with a frequency-based analysis of close to 100 linguistic features. As a natural offshoot of the research design, this study offers a comprehensive description of the most common linguistic features characterizing natural conversation. Illustrated with numerous dialogue extracts from Friends and conversation, topics such as vague, emotional, and informal language are discussed. This book will be an important resource not only for researchers and students specializing in discourse analysis, register variation, and corpus linguistics, but also anyone interested in conversational language and television dialogue.
Book
Volume 35 in this series
This volume showcases studies that recognize and provide evidence for the inseparability of lexis and grammar. The contributors explore in what ways these two areas, often treated separately in linguistic theory and description, form an organic whole. The papers in Section I (Setting the Scene) introduce some of the key methodological approaches and theoretical positions at the lexis-grammar interface, while Section II (Considering the Particulars) contains papers that report on case studies and show concrete applications of the central methods and theories. Exploring the Lexis-Grammar Interface is a stimulating collection of papers for anyone who wishes to learn more about and get fresh state-of-the-art perspectives on language patterning.
Book
Volume 34 in this series
The Language of Outsourced Call Centers is the first book to explore a large-scale corpus representing the typical kinds of interactions and communicative tasks in outsourced call centers located in the Philippines and serving American customers. The specific goals of this book are to conduct a corpus-based register comparison between outsourced call center interactions, face-to-face American conversations, and spontaneous telephone exchanges; and to study the dynamics of cross-cultural communication between Filipino call center agents and American callers, as well as other demographic groups of participants in outsourced call center transactions, e.g., gender of speakers, agents’ experience and performance, and types of transactional tasks. The research design relies on a number of analytical approaches, including corpus linguistics and discourse analysis, and combines quantitative and qualitative examination of linguistic data in the investigation of the frequency distribution and functional characteristics of a range of lexico/syntactic features of outsourced call center discourse.
Book
Volume 33 in this series
The articles in this edited volume cover a broad range of areas. They discuss the role and effectiveness of corpora and corpus-linguistic techniques for language teaching but also deal with broader issues such as the relationship between corpora and second language teaching and how the different perspectives of foreign language teachers and applied linguists can be reconciled. A number of concrete examples are given of how authentic corpus material can be used for different learning activities in the classroom. It is also shown how specific learner problems, for example in the area of phraseology, can be studied on the basis of learner corpora and textbook corpora. On the basis of learner corpora of speech and writing, it is further shown that even advanced learners of English are uncertain about stylistic and text type differences.
Book
Volume 32 in this series
The book is the first to apply David Brazil’s Discourse Intonation systems (prominence, tone, key and termination) to the study of a corpus of authentic, naturally-occurring spoken discourses. The Hong Kong Corpus of Spoken English (prosodic) is made up of approximately one million words consisting of four sub-corpora of equal size, namely academic, conversation, business and public. The participants are all adults and typically have either Cantonese or English as their first language. The four Discourse Intonation systems are described in terms of how each system works and how it is manifested in the corpus, both across the sub-corpora and across speakers in the corpus. The book is accompanied by a CD containing the prosodically transcribed corpus together with iConc, the software designed and written specifically to interrogate the HKCSE (prosodic). The issues raised and discussed are all of importance in Conversation Analysis, Corpus Linguistics, Discourse Analysis, Discourse Intonation, Pragmatics, and Intercultural Communication.
Book
Volume 31 in this series
This book brings together contributions from a diverse collection of scholars who explore different ways of combining corpus linguistics and discourse analysis, studying discourse at the prosodic, lexical, and textual levels. Both spoken and written discourse are investigated in a variety of settings, including academia, the workplace, news, and entertainment. Not only does the volume offer a rich sample of English-language discourse from around the world, including international, learner, and non-standard varieties of English, but it also covers a range of topics and methods. This book will be of particular interest to researchers and students specializing in discourse studies, English linguistics, and corpus linguistics.
Book
Volume 30 in this series
Corpus and Context explores the relationship between corpus linguistics and pragmatics by discussing possible frameworks for analysing utterance function on the basis of spoken corpora. The book articulates the challenges and opportunities associated with a change of focus in corpus research, from lexical to functional units, from concordance lines to extended stretches of discourse, and from the purely textual to multi-modal analysis of spoken corpus data. Drawing on a number of spoken corpora including the five-million-word Cambridge and Nottingham Corpus of Discourse in English (CANCODE, funded by CUP (c)), the book explores a specific speech act function using different approaches and different levels of analysis. This involves a close analysis of contextual variables in relation to lexico-grammatical and discoursal patterns that emerge from the corpus data, as well as a wider discussion of the role of context in spoken corpus research.
Book
Volume 29 in this series
This book reports research on the Problem-Solution rhetorical pattern, which has to date received very little attention in corpus-based studies. Insights from genre analysis and systemic-functional grammar are also applied to the analysis of the Problem-Solution pattern, thus moving towards a more multi-faceted analysis of corpus data. The pattern is investigated in two specialized corpora of technically-oriented report writing, a professional corpus and a student corpus, using a key word and key-key word analysis. Phraseological analyses of key words in both corpora are presented. Data show that students’ writing lacks a range of lexico-grammatical patternings for expressing the Problem and Solution elements of the pattern. The book concludes with some pedagogic implications and applications of the findings. Suggested concordancing activities are discussed within the context of key issues in the field of data-driven learning.
Book
Volume 28 in this series
Discourse on the Move is the first book-length exploration of how corpus-based methods can be used for discourse analysis, applied to the description of discourse organization. The primary goal is to bring these two analytical perspectives together: undertaking a detailed discourse analysis of each individual text, but doing so in terms that can be generalized across all texts of a corpus. The book explores two major approaches to this task: ‘top-down’ and ‘bottom-up’. In the ‘top-down’ approach, the functional components of a genre are determined first, and then all texts in a corpus are analyzed in terms of those components. In contrast, textual components emerge from the corpus analysis in the bottom-up approach, and the discourse organization of individual texts is then analyzed in terms of linguistically-defined textual categories. Both approaches are illustrated through case studies of discourse structure in particular genres: fund-raising letters, biology/biochemistry research articles, and university classroom teaching.
Book
Volume 27 in this series
While parentheticals attract constant attention, they very rarely constitute the main subject of monographs. This book provides a comprehensive account of reduced parenthetical clauses (RPCs) in three Romance languages. Typical French RPCs are je crois, disons, je dirais, je pense, je sais pas, and je trouve. The research draws on 22 corpora of spoken French, Italian, and Spanish comprising a total amount of 3,975,500 words. Its results consist in a typology of the relevant expressions in the three languages, in the understanding of their pragmatic function and of the factors influencing their use, and in the description of their syntactic and prosodic properties. Other findings are that RPCs are not restricted to statements but also occur in questions and that belief verbs are not as frequent as commonly assumed. Although the book is about Romance parentheticals, its conclusions are relevant for other languages.
Book
Volume 26 in this series
Through electronic corpora we can observe patterns which we were unaware of before or only vaguely glimpsed. The availability of multilingual corpora has led to a renewal of contrastive studies. We gain new insight into similarities and differences between languages, at the same time as the characteristics of each language are brought into relief. The present book focuses on the work in building and using the English-Norwegian Parallel Corpus and the Oslo Multilingual Corpus. Case studies are reported on lexis, grammar, and discourse. A concluding chapter sums up problems and prospects of corpus-based contrastive studies, including applications in lexicography, translator training, and foreign-language teaching. Though the main focus is on English and Norwegian, the approach should be of interest more generally for corpus-based contrastive research and for language studies in general. Seeing through corpora we can see through language.
Book
Volume 25 in this series
People have a natural propensity to understand language text as a succession of smallish chunks, whether they are reading, writing, speaking or listening. Linguists have found that this propensity can shed light on the nature and structure of language, and there are many studies which attempt to harness the potential of natural chunking. This book explores the role of chunking in the description of discourse, especially spoken discourse. It appears that chunking offers a sound but flexible platform on which can be built a descriptive model which is more open and comprehensive than more familiar approaches to structural description. The model remains linear, in that it avoids hierarchies, and it concentrates on the combinatorial patterns of text.
The linear approach turns out to have many advantages, bringing together under one descriptive method a wide variety of different styles of speech and writing. It is complementary to established grammars, but it raises pertinent questions about many of their assumptions.
Book
Volume 24 in this series
The pervasive phenomenon of metadiscourse – commentary on the ongoing discourse – is beginning to take its rightful place among the major topics of discourse studies. This book makes simultaneous contributions to the theory of metadiscourse, corpus-based methods of studying such phenomena, and our knowledge of metadiscourse use in written English. After comprehensively reviewing previous research, it introduces a more rigorous and empirical approach to metadiscourse studies. Ädel presents a new model of metadiscourse based on Jakobson’s functions of language, and other conceptual tools, including explicit features for defining metadiscourse, a taxonomy of the functions it serves, and maps of the boundaries between it and related phenomena. A large-scale study of writing by L1 and L2 university students is presented, in which the L2 speakers’ overuse of metadiscourse strongly marks them as lacking in communicative competence. This work is of interest both to linguists and to educators concerned with writing in English.
Book
Volume 23 in this series
University students must cope with a bewildering array of registers, not only to learn academic content, but also to understand course expectations and requirements. While many previous studies have investigated academic writing, we know comparatively little about academic speech; and no linguistic study to date has investigated the range of academic and advising/management registers that students encounter. This book is a first step towards filling this gap. Based on analysis of the T2K-SWAL Corpus, the book describes university registers from several different perspectives, including: vocabulary patterns; the use of lexico-grammatical and syntactic features; the expression of stance; the use of extended collocations ('lexical bundles'); and a Multi-Dimensional analysis of the overall patterns of register variation. All linguistic patterns are interpreted in functional terms, resulting in an overall characterization of the typical kinds of language that students encounter in university registers: academic and non-academic; spoken and written.
Book
Volume 22 in this series
Textual Patterns introduces corpus resources, tools and analytic frameworks of central relevance to language teachers and teacher educators. Specifically it shows how key word analysis, combined with the systematic study of vocabulary and genre, can form the basis for a corpus informed approach to language teaching. The first part of the book gives the reader a strong grounding in the way in which language teachers can use corpus analysis tools (wordlists, concordances, key words) to describe language patterns in general and text patterns in particular. The second section presents a series of case studies which show how a key word / corpus informed approach to language education can work in practice. The case studies include: General language education (i.e. students in national education systems and those following international examination programmes), foreign languages for academic purposes, literature in language education, business and professional communication, and cultural studies in language education.
Book
Volume 21 in this series
This book investigates the effects of corpus work on the process of foreign language learning in ESP settings. It suggests that observing learners at work with corpus data can stimulate discussion and re-thinking of the pedagogical implications of both the theoretical and empirical aspects of corpus linguistics. The ideas presented here are developed from the Data-Driven Learning approach introduced by Tim Johns in the early nineties. The experience of watching students perform corpus analysis provides the basis for the two main observations in the book: a) corpus work provides students with a useful source of information about ESP language features, b) the process of "search-and-discovery" implied in the method of corpus analysis may facilitate language learning and promote autonomy in learning language use. The discussion is carried out on the basis of a series of corpus-based "explorations" by students and provides suggestions for developing new tasks and tools for language learners.
Book
Volume 20 in this series
This book proposes an innovative approach to general nouns. General nouns are defined as high-frequency nouns that are characterised by their textual functions. Although the concept is motivated by Halliday & Hasan (1976), the corpus theoretical approach adopted in the present study is fundamentally different and set in a linguistic framework that prioritises lexis. The study investigates 20 nouns that are very frequent in mainstream English, as represented by the Bank of English Corpus. The corpus-driven approach to the data involves a critical discussion of descriptive tools, such as patterns, semantic prosodies, and primings of lexical items, and the concept of 'local textual functions' is put forward to characterise the functions of the nouns in texts. The study not only suggests a characterisation of general nouns, but also stresses that functions of lexical items and properties of texts are closely linked. This link requires new ways of describing language.
Book
Volume 19 in this series
This book focuses on theoretical and descriptive issues and techniques in the study of text and discourse. Drawing on a large number of corpora containing academic language, from spoken language to published research papers, the authors approach their subject from multiple angles: the academic language of biology, literature, philosophy, economics, agriculture, linguistics and applied linguistics. The analysis of the intertextual features that these papers show leads to penetrating results.
Book
Volume 18 in this series
This book presents a large-scale corpus-driven study of progressives in 'real' English and 'school' English, combining an analysis of general linguistic interest with a pedagogically motivated one. A systematic comparative analysis of more than 10,000 progressive forms taken from the largest existing corpora of spoken British English and from a small corpus of EFL textbook texts highlights numerous differences between actual language use and textbook language concerning the distribution of progressives, their preferred contexts, favoured functions, and typical lexical-grammatical patterns. On the basis of these differences, a number of pedagogical implications are derived, the integration of which then leads to a first draft of an innovative concept of teaching progressives - a concept which responds to three key criteria in pedagogical description: typicality, authenticity, and communicative utility. The analysis also demonstrates that many existing accounts of the progressive are inappropriate in several respects and that not enough attention is being paid to lexical-grammatical relations.
Winner of the "Wissenschaftspreis Hannover 2006" for outstanding research monographs.
Book
Volume 17 in this series
Corpus-aided language pedagogy is one of the central application areas of corpus methodologies, and a test bed for theories of language and learning. This volume provides an overview of current trends, offering methodological and theoretical position statements along with results from empirical studies. The relationship between corpora and learning is examined from complementary perspectives: the study of learner language, the didactic use of corpus findings, and the interaction between corpora and their users. Reflections on current theory and technology open and close the volume. With its focus on the learner and the learning setting, Corpora and Language Learners is addressed to corpus linguists with an interest in learner language, applied linguists wishing to expand their understanding of corpora and their pedagogic potential, and language teachers wishing to critically assess the relevance of work in this field.
This volume grew out of selected presentations at the 5th Teaching and Language Corpora conference in Bertinoro, Italy.
Book
Volume 16 in this series
This book explores the structure and use of academic and professional discourse through the lens of corpus linguistics. The goal of this book is to show how insights from corpus linguistic analyses can help us better understand how we use academic and professional language and help us find ways to better train newcomers to the genres used in various professional contexts. The contributions to this book show that specialized corpora of specific genres from a variety of fields allow us to make more relevant observations about the function and use of language for particular purposes. The specialized corpora examined include written and spoken academic genres, written and spoken business and legal genres, and written philanthropic genres. The book showcases a variety of approaches to analyzing the discourse of specialized corpora, and each chapter concludes with a reflection on the practical and pedagogical implications of the analysis.
Book
Volume 15 in this series
The C-ORAL-ROM book and DVD provide a unique set of comparable corpora of spontaneous speech for the main Romance languages: French, Italian, Portuguese and Spanish. The corpora are accompanied by comparative linguistic studies, models and standard linguistic measures of spoken language variability. Each corpus is built to the same design using identical sampling techniques, and each corpus is presented in multimedia format, allowing simultaneous access to aligned acoustic and textual information. Texts are headed with information about provenance, participants, etc., and the transcriptions show changes of speaker. Speech acts are tagged according to the evidence of prosodic criteria. Each corpus totals 300,000 words and presents formal and informal speech in a variety of contexts of use, dialogue structure and text genres, semantic domains and speech act typologies. The corpora have great statistical relevance for spoken language structures and can address key issues in human language technology such as speech recognition in unrestricted discourse, the suitability of speech synthesis in natural prosody, and multilingual applications of the spoken language interface. The work provides new data and innovative theoretical perspectives that are relevant for corpus linguistics, Romance linguistics, syntactic theory, speech and prosody research, and second language acquisition.
The original C-ORAL-ROM DVD was made to run under Windows XP when Windows 7 and 8 were not yet in existence. A new version of WINPITCH-C-ORAL-ROM makes it possible to run the C-ORAL-ROM DVD under Windows 7 and 8. It can be downloaded from www.winpitch.com/
Book
Volume 14 in this series
Collocations are both pervasive in language and difficult for language learners, even at an advanced level. In this book, these difficulties are for the first time comprehensively investigated. On the basis of a learner corpus, idiosyncratic collocation use by learners is uncovered, the building material of learner collocations examined, and the factors that contribute to the difficulty of certain groups of collocations identified. An extensive discussion of the implications of the results for the foreign language classroom is also presented, and the contentious issue of the relation of corpus linguistic research and language teaching is thus extended to learner corpus analysis.
Book
Volume 13 in this series
Grammaticalization is an important concept in general and typological linguistics and a prominent type of explanation in historical linguistics. For historical corpus linguists, grammaticalization theory provides a frame of orientation in their effort to analyze and systematize a fast-accumulating mass of data. Students of grammaticalization have become increasingly aware of the potential of existing corpora and established corpus-linguistic methodology for their work. This book continues and develops the dialogue between the two fields. All the contributions are based on extensive use of various electronic corpora. Relating corpus practices to recent theoretical concerns of grammaticalization studies they deal with grammaticalization and historical sociolinguistics, lexicalization and grammaticalization, layering, frequency, grammaticalization and dialects, degrammaticalization and grammaticalization in a contrastive perspective. The papers show that a synthesis of corpus methodology and grammaticalization studies leads to new and interesting insights about the mechanisms of language change and the communicative functions of language.
Book
Volume 12 in this series
After decades of being overlooked, corpus evidence is becoming an important component of the teaching and learning of languages. Above all, the profession needs guidance in the practicalities of using corpora, interpreting the results and applying them to the problems and opportunities of the classroom. This book is intensely practical, written mainly by a new generation of language teachers who are acknowledged experts in central aspects of the discipline. It offers advice on what to do in the classroom, how to cope with teachers' queries about language, what corpora to use including learner corpora and spoken corpora and how to handle the variability of language; it reports on some current research and explains how the access software is constructed, including an opportunity for the practitioner to write small but useful programs; and it takes a look into the future of corpora in language teaching.
Book
Volume 11 in this series
Definition is a basic activity of language, of particular importance to linguists because of its use of language to describe itself. Beyond this inherent significance as a crucial element of language study, definitions also provide a rich potential source of the information needed for Natural Language Processing systems. This book describes an investigation of the subset of general language used in definition sentences and the development of a taxonomy of definition types, a grammar of definition sentences and parsing software which can extract their functional components. The work is based on definition sentences used in one of the dictionaries from the Cobuild range, and the book includes a brief history of the development of monolingual English dictionaries, an assessment of the concepts of sublanguages and local grammars and a full exploration of the results of the analysis and of the present and future applications of the taxonomy, grammar and parser.
Book
Volume 10 in this series
There are few aspects of language which are more problematic than its discourse particles. The present study of discourse particles draws upon data from the London-Lund Corpus to show how the methods and tools of corpora can sharpen their description. The first part of the book provides a picture of the state of the art in discourse particle studies and introduces the theory and methodology for the analysis in the second part of the book. Discourse particles are analysed as elements which have been grammaticalised and as a result have certain properties and uses. The importance of linguistic and contextual cues such as text type, position in the discourse, prosody and collocation for analysing discourse particles is illustrated.
The following chapters deal with specific discourse particles (now, oh, just, sort of, and that sort of thing, actually) on the basis of their empirical analysis in the London-Lund Corpus. Examples and extended extracts from many different text types are provided to illustrate what discourse particles are doing in discourse.
Book
Volume 9 in this series
Using Corpora to Explore Linguistic Variation illustrates the ways in which linguistic variation can be explored through corpus-based investigation. Two major kinds of research questions are considered: variation in the use of a particular linguistic feature, and variation across dialects or registers. Part 1: “Exploring variation in the use of linguistic features” focuses on the study of specific words, expressions, or grammatical constructions, to study variation in the use of a particular linguistic feature. Part 2: “Exploring dialect and register variation” describes salient characteristics of dialects or registers and the patterns of variation across varieties. Part 3: “Exploring Historical Variation” applies these same two major perspectives to historical variation. One recurring theme is the extent to which linguistic variation depends on register differences, reflecting the importance of register as a key methodological and thematic concern in current corpus linguistic research.
Book
Volume 8 in this series
Teenage talk is fascinating, though so far teenage language has not been given the attention in linguistic research that it merits. The dearth of investigations into teenage language is due in part to its underrepresentation in language corpora. With the Bergen Corpus of London Teenage Language (COLT), a large corpus of teenage language has become available for research. The first part of Trends in Teenage Talk describes how the COLT corpus was collected and processed, presents the speakers with special emphasis on the recruits and their various backgrounds, and ends with a description of what the COLT teenagers talk about and how they do it. The second part of the book is devoted to the most prominent features of the teenagers’ talk: ‘slanguage’; how reported speech is manifested; a survey of non-standard grammatical features; the use of intensifiers; tags; and interactional behaviour in terms of conflict talk.
Book
Volume 7 in this series
This volume takes stock of current research in contrastive lexical studies. It reflects the growing interest in corpus-based approaches to the study of lexis, in particular the use of multilingual corpora, shared by researchers working in widely differing fields — contrastive linguistics, lexicology, lexicography, terminology, computational linguistics and machine translation. The articles in the volume, which cover a wide diversity of languages, are divided into four main sections: the exploration of cross-linguistic equivalence, contrastive lexical semantics, corpus-based multilingual lexicography, and translation and parallel concordancing. The volume also contains a lengthy introduction to recent trends in contrastive lexical studies written by the editors of the volume, Bengt Altenberg and Sylviane Granger.
Book
Volume 6 in this series
The book offers a combined discussion of the main theoretical, methodological and application issues related to corpus work. Thus, starting from the definition of what is a corpus and why reading a corpus calls for a different methodology from reading a text, the underlying assumptions behind corpus work are discussed.
The two main approaches to corpus work are discussed as the “corpus-based” and the “corpus-driven” approach and the theoretical positions underlying them explored in detail. The book adopts and exemplifies the parameters of the corpus-driven approach and posits a new unit of linguistic description defined systematically in the light of corpus evidence. The applications where the corpus-driven approach is exemplified are language teaching and contrastive linguistics. Alternating between practical examples and theoretical evaluation, the reader is led step-by-step to a detailed understanding of the issues involved in corpus work and, at the same time, tempted to explore for himself some of the major applications where a corpus-driven methodology can reveal unprecedented insights into linguistic patterning.
Book
Volume 5 in this series
Recent developments in this field of small corpus studies, largely brought about by the personal computer, have yielded remarkable insights into the nature and use of real language. This book presents work by a number of leading researchers in the field and covers a series of topics directly related to language teaching and language research. The ultimate aim of this book is to encourage the exploitation of small corpora by the ELT profession to make language learning more effective. In addition to descriptions of the basic corpus analysis tools, chapters in the collection cover syllabus and materials design, comparisons of different genres, descriptions of local and functional grammars, compilation and use of learner corpora, and making cross-linguistic comparisons. The message of this collection is that language use is purposeful and culture specific and that small corpus analysis is an effective method of linguistic investigation.
Preface by: John Sinclair
Book
Volume 4 in this series
This book describes an approach to lexis and grammar based on the concept of phraseology and of language patterning arising from work on large corpora. The notion of 'pattern' as a systematic way of dealing with the interface between lexis and grammar was used in Collins Cobuild English Dictionary (1995) and in the two books in the Collins Cobuild Grammar Patterns series (1996; 1998). This volume describes the research that led to these publications, and explores the theoretical and practical implications of the research. The first chapter sets the work in the context of work on phraseology. The next two chapters give several examples of patterns and how they are identified. Chapters 4 and 5 discuss and exemplify the association of pattern and meaning. Chapters 6, 7 and 8 relate the concept of pattern to traditional approaches to grammar and to discourse. Chapter 9 summarizes the book and adds to the theoretical discussion, as well as indicating the applications of this approach to language teaching. The volume is intended to contribute to the current debate concerning how corpora challenge existing linguistic theories, and as such will be of interest to researchers in the fields of grammar, lexis, discourse and corpus linguistics. It is written in an accessible style, however, and will be equally suitable for students taking courses in those areas.
Book
Volume 3 in this series
Discourse anaphora is a challenging linguistic phenomenon that has given rise to research in fields as diverse as linguistics, computational linguistics and cognitive science. Because of the diversity of approaches these fields bring to the anaphora problem, the editors of this volume argue that there needs to be a synthesis, or at least a principled attempt to draw the differing strands of anaphora research together. The papers in this volume all contribute to the aim of synthesis and were selected to represent the growing importance of corpus-based and computational approaches to describing anaphora and to developing natural language systems that resolve it.
Book
Volume 2 in this series
Patterns and Meanings consists of case studies which make use of corpora and concordance technology. Each case study elaborates a problem area, makes reference to both the descriptive and applied literature thus far, and then suggests ways of exploiting corpus data to shed light on the problem. Language phenomena investigated include word sense, phraseology and syntax, metaphor and creative use, text reference, idiom, and translation. Emphasis is given to information that usually cannot be found in dictionaries, grammars, language textbooks or other resources, but which the study of corpus data makes available. This work is particularly important not only for its language description insights, but also for pedagogical application. Further useful suggestions are included on setting up a medium-sized corpus on a personal computer.
Book
Volume 1 in this series
Terms in Context applies the methodology that has been developed over the last two decades in corpus linguistics to the relatively new and still little developed field of corpus-based terminography. While corpora are already being used by some terminologists for the identification of terms and retrieval of contextual fragments, this book describes the first attempt to use corpora for terminography in much the same way as large general reference corpora are already being used for general language lexicography. The author goes beyond the standard problem of identifying terms as opposed to non-terminological lexical items in text and focuses on identifying metalanguage patterns which point to the presence in text of (parts of) reusable definitions of terms. The author examines these patterns and shows how the information which they contain can be retrieved and used as input for terminological entries.
Terms in Context should be of interest to ‘traditional’ terminologists who have not previously considered adopting a corpus-based approach to their work or at least not on the scale proposed here; to ‘modern’ terminologists who use text primarily for the identification of terms and the retrieval of contextual examples; to those in the corpus linguistic community who have hitherto used general language corpora for the purposes of lexicography and have not previously considered using special purpose corpora for more specific lexicography studies; and to academics in the ESP/LSP community who are interested in showing students how to use text as a means of ascertaining the meaning of terms.