Abstract
We studied German students’ academic writing skills in English at a Dutch university. Their performances are typical examples of English as a lingua franca (ELF), as these students are non-native users of English evaluated by subject lecturers who are non-native users as well. Our database is a corpus of written answers to an open examination question in the context of an English Medium Instruction (EMI) bachelor’s program in psychology. We aimed to detect the characteristics in this specific type of discourse that may affect the comprehensibility of the students’ answers, which in turn may have consequences for their grading by the course lecturer. English language experts assigned Common European Framework of Reference (CEFR) levels and commented on grammar, use of (academic) vocabulary, and text coherence. First, we correlated the grades assigned by the course lecturer with the CEFR levels: there was no correlation. Second, we analyzed the linguistic comments and found that academic style was poorly represented in this type of text. Importantly, we found no evidence of communicative blockages or obstacles related to English proficiency levels between the student writer and the lecturer reader. We conclude that informed content interpretation based on the contextual appropriateness of the answers overrules grammatical and lexical non-standard characteristics and outweighs the lack of semantic coherence.
Samenvatting
Wij hebben de academische schrijfvaardigheid in het Engels onderzocht van Duitse studenten aan een Nederlandse universiteit. Hun producten zijn typische voorbeelden van Engels als lingua franca (ELF), aangezien deze studenten niet-native gebruikers van het Engels zijn, die worden beoordeeld door vakdocenten die ook niet-native gebruikers zijn. Onze database is een corpus van schriftelijke antwoorden op een open examenvraag in het kader van een EMI-bachelor in de psychologie. Ons doel was om die kenmerken in dit specifieke type discours te detecteren die de begrijpelijkheid van de antwoorden van de studenten kunnen beïnvloeden, wat op zijn beurt gevolgen kan hebben voor hun beoordeling door de cursusdocent. Experts in de Engelse taal kenden ERK-niveaus toe en gaven commentaar op grammatica, gebruik van (academische) woordenschat en tekstcoherentie. Eerst correleerden we de cijfers van de cursusdocent en de ERK-niveaus. Er was geen correlatie. Ten tweede hebben we de taalkundige opmerkingen geanalyseerd. We ontdekten dat academische stijl slecht aanwezig was in dit type tekst. Belangrijk is dat we geen bewijs hebben gevonden van communicatieve blokkades of obstakels tussen de student-schrijver en de docent-lezer veroorzaakt door de Engelse taalvaardigheid. We concluderen dat een lezer die bekend is met de gevraagde inhoud, niet gehinderd wordt door grammaticale en lexicale niet-standaard kenmerken en het gebrek aan semantische coherentie bij de beoordeling van de contextuele geschiktheid van de antwoorden.
Resumo
Ni pristudis la universitatnivelajn skribkapablojn en la angla lingvo de germanaj studentoj en nederlanda universitato. Iliaj skribaĵoj estas modelaj ekzemploj de la angla kiel ponta lingvo (English as a Lingua Franca, ELF) ĉar la studentoj estas nedenaskaj parolantoj de la angla, taksitaj fare de universitataj docentoj kiuj ankaŭ estas nedenaskuloj. Nia datenaro estas korpuso de skribitaj respondoj al malferma ekzamena demando en la kunteksto de anglalingva bakalaŭro pri psikologio. Ni celis detekti la karakterizaĵojn de ĉi tiu specifa tipo de teksto kiuj povas influi la kompreneblecon de la studentaj respondoj, kiu siavice povas havi sekvojn por ilia klasado fare de la instruisto. Anglalingvaj fakuloj asignis KER-nivelojn kaj komentis pri gramatiko, uzo de (akademia) vortprovizo kaj tekstkohereco. Unue, ni serĉis korelaciojn inter la poentoj asignitaj de la instruisto kaj la KER-niveloj. Ne estis korelacio. Due, ni analizis la lingvajn komentojn. Ni trovis, ke akademia stilo preskaŭ tute mankas en ĉi tiu tipo de teksto. Precipe ni trovis neniun pruvon por komunikaj blokadoj aŭ obstakloj rilataj al la scipovo de la angla de la studenta verkisto aŭ la instruisto. Ni konkludas, ke informita enhavinterpreto bazita sur konteksta taŭgeco de la respondoj superas gramatikajn kaj leksikajn ne-normajn trajtojn kaj semantikan nekoherecon.
1 Introduction
In internationalized higher education, the dominant language of communication is English. We are interested in the role of non-native language (NNL) use in relation to study results. We address this issue empirically, focusing on a concrete example: the relationship between grading and language performance in the typical context of a written examination in an undergraduate course at a Dutch university. Concretely, we studied German students’ English writing performance in answering an open examination question. The participants were students in an English-taught bachelor’s program in psychology.
The aim of English Medium Instruction (EMI) in higher education settings is that students simultaneously acquire content knowledge and improve their English language proficiency to an academic level, as universities aim to deliver graduates who are equipped to participate in international research and/or otherwise pursue international careers. This aim seems achievable, as outlined by Klaassen (2001), who established that “the limiting effects of learning through an L2 disappeared over a year when students gradually acquired more disciplinary content and presumably improved their English language competence” (Dafouz and Camacho-Miñano 2016: 59). This positive developmental effect contradicts De Vos’ (2019) finding that the richness of the English vocabulary of German EMI students at a Dutch university showed no progress during their first year of study.
EMI has been enthusiastically introduced in higher education worldwide, with lecturers predominantly focusing on content rather than language learning (Rose et al. 2019). This would suggest a relatively low correlation between students’ proficiency level in English and their grades in their degree subjects. At the same time, there must be attention to linguistic skills as students need to communicate effectively and successfully in English in a high-stakes academic environment that demands sophisticated language skills (Mauranen 2017). The English used in these settings is typically English as Lingua Franca (ELF), defined by Mauranen (2018a) as “English as a contact language between speakers or speaker groups when at least one of them uses it as a second language” (Mauranen 2018a: 8).
1.1 Previous research
In our previous studies, we showed that German students at Radboud University in Nijmegen, the Netherlands, with Dutch as their study language, systematically perform less well than Dutch domestic students (Zijlmans et al. 2016, 2020). However, not all studies observe a negative effect of studying in a non-native language. De Vos et al. (2020) collected data on grades, the number of study points (EC),[1] and dropout rates of Dutch and German students in parallel Dutch-taught and EMI tracks of the psychology bachelor’s program at the same university, during their first year of study. The research aim was to determine whether study success was related to studying in one’s first (L1) or second (L2) language. The group difference in ECs obtained was not significant. The difference in dropout rates was not significant either, although the rates were admittedly higher than for the Dutch students. For our present study, the relevant conclusion was that studying in the L2 can be disadvantageous when it comes to grades. But is such a general effect visible in concrete grading processes? Are lower grades merely due to the use of an L2? Or are lower grades related to lower L2 proficiency levels?
Vander Beken and Brysbaert (2018) focused specifically on the relationship between language and the ways of assessing content knowledge. After reading disciplinary literature in either their L1 Dutch or in L2 English, students performed at the same level on a true/false recognition test in both languages but lower on a free recall test in L2 English. This outcome suggests that students’ performance may be underestimated if they are only tested with essay-type L2 exams.
These studies on international students’ academic success focus on grades, grade point average (GPA), and European credits (EC). However, Rose et al. (2019) point out that GPA and course completion (ECs) are based not only on subject knowledge but also on attendance, active participation, and group work. They therefore consider GPA and ECs too indirect as measures for examining the relationship between language proficiency and academic performance. Additionally, their study shows that besides English language proficiency, academic skills were statistically significant predictors of students’ final grades in a course in International Business. Their data analyses also revealed that students from all language proficiency levels were able to pass the course.
In Santos’ (1988) study, native English (NE) speaking and non-native English (NNE) speaking university lecturers (178 in total) rated two English essays from NNE university students. Their content appreciation was not influenced by the “quality of the language”. The language “errors” in the NNE essays were judged as generally “not irritating” and not hindering comprehensibility. The overall conclusion was that the lecturers were willing to appreciate the content beyond the perspective of a stringent native speaker norm.
Research on the relation between English language proficiency and study results, however, often assumes a native English perspective as the norm (English as a Native Language = ENL). Given that English is used worldwide as a Lingua Franca (ELF), comprehensibility has been put forward as a more relevant criterion in evaluating English proficiency. Santos’ use of the words “deficiencies” and “irritating” is perhaps unfortunate, but the study revealed that academic readers focus on content and successful communication rather than on form. ELF researchers argue that ENL should not be taken as the norm when studying the use of English worldwide. Mauranen and Metsä-Ketelä (2006) designate Jenkins’ work in 2000 as “the first major description of ELF as a kind of language in its own right rather than as a deficient form of English” (Mauranen and Metsä-Ketelä 2006: 3). ELF is best defined as “an additionally acquired language system which serves as a common means of communication for speakers of different first languages” (the website of VOICE, quoted by Jenkins 2011: 928).
All in all, studies on the effect of NNE use on study results are contradictory and inconclusive. Whatever the outcomes of individual studies, however, the Englishization of higher education seems to have reached a point of no return. The aims of EMI, to develop content knowledge and academic English language proficiency simultaneously, are nevertheless not always met.
1.2 Research questions
This study aims to contribute to a better understanding of how English functions within international academia. Rose et al. (2019) point out that studies examining the association between language proficiency and academic performance are scarce in EMI contexts. Within EMI university programs, where the need for academic and professional productive language skills has increased (Pitkänen 2018), we investigated the role of English as an Academic Lingua Franca. Academic English proficiency manifests enormous variation, and perspectives on the communicative appropriateness of these manifestations are currently changing (Mauranen 2018b). So far, most research has been done on spoken data, and to a lesser extent on written data. Shchemeleva (2022) investigated research articles by NNE scholarly writers (using the SciELF corpus); these writers were professional NNE users, unlike student writers. Our data consist of texts written by first-year students who have not had much earlier contact with English academic literature. Their answers to an exam question, written in October, represent NNE writing skills at the start of their studies. We wanted to investigate NNE communication at this early stage, as communication difficulties might negatively affect students’ progress.
We concentrate on productive skills typical of dense short texts written in ELF by students in a time-limited examination. These texts were graded by the lecturers of the course in question, while their English proficiency level was assessed by four English language experts. We address three research questions to find out more about the relationship between written NNE and content grades. The first question concerns their correlation:
Do the Common European Framework of Reference (CEFR) English levels of written examination answers assigned by language experts correlate with the grades assigned by content course lecturer(s)?
We chose to express language proficiency in CEFR levels (Council of Europe 2001). Firstly, although we are aware of criticisms regarding the validity and relevance of the framework, these levels are commonly used in Europe in university admission requirements and in evaluating students’ language proficiency; our choice does not mean that we endorse this current use of CEFR levels. Secondly, the higher CEFR levels are related to more cognitively demanding tasks (Hulstijn 2015), such as writing an academic text. Thirdly, although parts of the CEFR are written from the viewpoint that interaction takes place with native speakers (NSs), the general description that the interaction should proceed “without strain for either party” (Council of Europe 2001: 33) applies well to interaction between interlocutors with different L1s.
The written answers to an open examination question provide the data we need to answer our first research question. From an ENL perspective, there should be a significant correlation; from an ELF perspective, there should not, assuming that all students have sufficient English skills to provide a comprehensible answer.
In addition, we investigate what features are recurrent in the ELF writings of German students in this specific academic context:
What are linguistic and textual features, different from standard native use, of written examination answers by German students, and do these features resemble findings in other ELF research?
We counted their occurrences (frequencies) to get an idea of their distribution, persistence, and relevance. These features are not by definition ELF-specific; they are features of the writings produced by these German students.
Thirdly, we wanted to know whether some features may disturb comprehensibility in written ELF discourse. In this context of writing answers in examinations, this would mean that the content expert cannot assess the answer positively because of a lack of clarity caused by inadequate ‘command’ of the language needed for the purpose. Or can the content experts accept and assess written answers diverging from ENL as long as these are intelligible to them (Jenkins 2011)? The role of comprehensibility is addressed in the third research question:
Which features may disturb comprehensibility, in terms of inadequacies, omissions, or ambiguities, and thus negatively influence the grading of a written examination question?
In the context of the “Englishization of higher education”, we need to know which minimum norms and standard practices should be applied, justified by the specific context and purpose (Jenkins and Leung 2017).
2 Methods
2.1 Data
2.1.1 Description of data
We used texts from the corpus of De Vos (2019), which resulted from an extensive study among Dutch and German students in parallel Dutch-taught and English-taught tracks in psychology at the same university during the first bachelor year. Also available were the grades that the course lecturers (the content experts in our study) assigned to those answers. The ethical committee of the Faculty of Social Sciences of Radboud University approved the data handling, and we had De Vos’ permission to reuse the data. The texts in this corpus were answers to open examination questions from three exams written in October, February, and April. We selected only the answers of German students from the exam written in October, at an early stage of the students’ EMI study program. They were all written under the same circumstances: under time pressure and without any corrective feedback (De Vos 2019). Research assistants retyped the handwritten answers without ‘corrections’, but possibly with some misinterpretations due to unclear handwriting. Having finished the higher form of secondary education in Germany, the students’ level of English should be at least CEFR B2 (Council of Europe 2001), the required admission level of Radboud University, Nijmegen.[2]
2.1.2 Selection of texts
The question to be answered was: “Discuss Whorf’s language theory. Include the following terms in your answer: strong and weak variations of the theory.” The corpus contains 278 answers varying in length between 7 and 379 words. Grades vary between 0 (min.) and 10 (max.).[3] The correlation between text length and grade is strong (r = 0.745, p = 0.01). To select from these 278 texts for investigating the relationship between writing proficiency and content grading, we took the following steps.
Firstly, we excluded texts that were too short to do research on features such as sentence linking and textual coherence. Four English language experts indicated in a pilot that, for estimating the CEFR level, the ideal text length would be 200–300 words. However, out of the 42 texts that met this criterion, 39 were graded with a 10. As we were interested in the correlation between language proficiency and grade, we needed variation in grades. We decided to select texts between 150 and 250 words. A subset of 94 texts met this criterion.
Secondly, we used variation in scores as a criterion. We selected all 14 texts with content scores under 10. Next, to achieve a balanced spread of content-wise skilled and less-skilled students, we took into account the content grades of the two other sets of answers of these students in De Vos’ corpus written in February and April. Table 1 gives an overview of the final sample of 46 answers or texts.
Final sample of 46 texts.
| Criterion | Content score Examination 1 | Content score Examination 2 | Content score Examination 3 | N = 46 |
|---|---|---|---|---|
| Maximum content score on all three exams | 10 | 10 | 10 | 10 |
| Twice the max. score and one sufficient (between 6 and 9) | 10 | between 6 and 9, or 10 | 10, or between 6 and 9 | 10 |
| Max. on the first exam and insufficient content scores (<6) on one of the two other exams | 10 | <6 on one of the two other exams | | 8 |
| Max. on the first exam and insufficient content scores (<6) on both other exams | 10 | <6 | <6 | 4 |
| Under 10 on the first exam | 8 | | | 6 |
| | 7 | | | 2 |
| | 6 | | | 4 |
| | 4 | | | 1 |
| | 2 | | | 1 |
2.2 Rating the texts
2.2.1 The rating task
Four Language Experts (LEs) of the university’s Language Learning Institute participated in the qualitative evaluation of the selected texts. They were all English for Academic Purposes (EAP) specialists:
LE1: NS Dutch, trainer Cambridge English, and academic communication;
LE2: NS English, good command of Dutch, trainer academic communication, Cambridge courses;
LE3: NS Dutch, senior trainer English;
LE4: NS English, business, scientific, and academic English writing trainer; English language training.
The examination question was available to the LEs in their instructions, but the LEs were asked not to base their judgment on the presumed correctness of the answer given. As we wanted to exclude the risk that the content score would influence the linguistic judgment of the text, we presented the texts without content score and in different random orders. The instructions were as follows (the original instructions were in Dutch).
Please do the following with each of the texts below: […]

Attached are the global-level descriptions of the CEFR for writing skills.
After this first rating round, we found good interrater agreement (ICC [4,45] = 0.771). Still, there were some major differences in the estimated CEFR levels. In 21 cases, we confronted the LEs with each other’s ratings and arguments and asked whether they would reconsider their rating.
Each LE received four evaluations from each of the three other LEs, 12 texts in total. In the sets, there was always one text with agreement between the LEs on the CEFR level, one with a disagreement of one level, and one text with a high disagreement of two levels; higher disagreements did not occur. See Table 2.
After this second consultation of the LEs, the new consensus-directed ratings had a higher reliability score (ICC [4,45] = 0.871).
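The reported agreement statistics can be reproduced in outline. The paper does not state which ICC variant was computed; the sketch below assumes a two-way random-effects, average-measures model (Shrout and Fleiss’s ICC(2,k)), one common choice for a fixed panel of raters, and the rating matrix is purely illustrative, not the study’s data.

```python
import numpy as np

def icc_2k(ratings):
    """Shrout & Fleiss ICC(2,k): two-way random effects, average measures.

    ratings: (n texts) x (k raters) array of scores.
    """
    ratings = np.asarray(ratings, dtype=float)
    n, k = ratings.shape
    grand = ratings.mean()
    ss_total = ((ratings - grand) ** 2).sum()
    ss_rows = k * ((ratings.mean(axis=1) - grand) ** 2).sum()  # between texts
    ss_cols = n * ((ratings.mean(axis=0) - grand) ** 2).sum()  # between raters
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (ms_rows + (ms_cols - ms_err) / n)

# Illustrative only: 5 texts rated by 4 raters on the coded CEFR scale
# (B1 = 1, B2 = 2, C1 = 3, C2 = 4); not the study's data.
ratings = [[2, 2, 3, 2],
           [1, 1, 1, 2],
           [3, 3, 3, 3],
           [2, 3, 2, 2],
           [1, 2, 1, 1]]
print(round(icc_2k(ratings), 3))  # high agreement gives a value close to 1
```

SciPy has no built-in ICC function, so the two-way ANOVA mean squares are computed directly; packages such as pingouin offer equivalent routines.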
2.2.2 CEFR level and text quality
Per text, we calculated the mean CEFR level as follows: we coded the levels B1 = 1, B2 = 2, C1 = 3, and C2 = 4. A text assigned with two levels, for example B1/B2, was coded 1.5. We then computed the mean level per text, rounding the final proficiency score to an actual CEFR level:
3 × B2 and 1 × C1 resulted in an end score of (2 + 2 + 2 + 3)/4 = 2.25 = B2;
3 × B1 and 1 × B1/B2 resulted in an end score of (1 + 1 + 1 + 1.5)/4 = 1.125 = B1.
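The coding and averaging scheme above can be sketched in a few lines of Python. The level codes and the two worked examples follow the text; how an exact tie such as a mean of 2.5 would be rounded is not specified in the paper, so the sketch simply uses Python’s built-in round(), which rounds ties to the nearest even integer.

```python
# CEFR level coding as described in the text; a split rating like "B1/B2"
# is averaged to 1.5.
LEVELS = {"B1": 1, "B2": 2, "C1": 3, "C2": 4}
NAMES = {v: k for k, v in LEVELS.items()}

def code(rating):
    """'B2' -> 2.0; a split rating 'B1/B2' -> 1.5."""
    parts = rating.split("/")
    return sum(LEVELS[p] for p in parts) / len(parts)

def mean_level(ratings):
    """Mean of the LE ratings, rounded back to a CEFR label."""
    score = sum(code(r) for r in ratings) / len(ratings)
    return score, NAMES[round(score)]

print(mean_level(["B2", "B2", "B2", "C1"]))    # (2.25, 'B2')
print(mean_level(["B1", "B1", "B1", "B1/B2"])) # (1.125, 'B1')
```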
In addition to the expert assessment, we obtained an independent text quality score via an automated tool, Grammarly Premium. Grammarly checks not only the formal linguistic features of spelling, punctuation, and grammar, but also claims to check the reader-oriented features of clarity, engagement, and delivery. For our texts, we set the goals as follows:
Audience: knowledgeable;
Formality: formal;
Domain: academic;
Intent: describe.
These choices come close to the requirements of this type of text. Grammarly assigns an overall score to each text.
2.3 Coding procedure
We inserted the texts and the LEs’ comments into the program ATLAS.ti. We did not supply the LEs with predefined codes or a codebook based on specific features reported in earlier research, as we wanted them to mention all aspects they found striking. We transferred the LEs’ comments into codes. Finally, we revisited each text repeatedly via the tool ‘intercoder agreement’ in search of overlapping codes, thus determining the final set of codes. We want to point out that the same fragments regularly received different codes, as the LEs interpreted the writers’ answers differently and/or suggested different linguistic explanatory categories. We eventually distinguished five coding categories: (i) grammar, (ii) vocabulary, (iii) academic register, (iv) structure and coherence, and (v) content and comprehensibility. Spelling was removed as a separate category because we used typed transcripts of the handwritten texts: we do not know whether a misspelling originated with the student or resulted from a misinterpretation by the typists.
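As a minimal sketch of the final tallying step, assuming the annotations can be flattened into simple (category, code) pairs; this is an illustration, not ATLAS.ti’s actual export format:

```python
from collections import Counter

# Hypothetical flattened annotations: one (category, code) pair per LE comment
annotations = [
    ("Academic register", "Informal wording"),
    ("Vocabulary", "Wrong word"),
    ("Academic register", "Informal wording"),
    ("Grammar", "Sentence structure"),
]

# Tally how often each code was assigned, as in the paper's frequency tables
freq = Counter(annotations)
for (category, code), n in freq.most_common():
    print(f"{category:18} {code:22} {n}")
```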
3 Results
In 3.1, we present results on the relationship between CEFR levels and content grade. In 3.2, we list and interpret the frequent non-standard English features signaled by the experts. In 3.3, we investigate which of these features influence the comprehensibility of a text and have an impact on grading.
3.1 The relation between English language proficiency levels and grades
To answer our first research question we allocated a rounded mean CEFR level to each text. Eventually, we had six texts at B1, 28 texts at B2, and 12 texts at C1. Table 3 contains the cross-tabulation of CEFR levels by grades.
Example selection of text combinations for the second consultation for LE1 with the three others.
LE1 | Pair | Text | First LE | Other LE |
---|---|---|---|---|
1a | LE1 = LE4 | 792 | B1 | B1 |
1b | LE1 < LE4 | 675 | B1 | B2 |
1c | LE1 > LE4 | 754 | C1 | B2 |
1d | LE1 > LE4 | 863 | C1 | B2 |
2a | LE1 = LE2 | 640 | B2 | B2 |
2b | LE1 < LE2 | 820 | B2 | C1 |
2c | LE1 > LE2 | 754 | C1 | B2 |
2d | LE1 << LE2 | 884 | B2 | C2 |
3a | LE1 = LE3 | 622 | B2 | B2 |
3b | LE1 < LE3 | 725 | B2 | C1 |
3c | LE1 > LE3 | 754 | C1 | B2 |
3d | LE1 << LE3 | 656 | B2 | C2 |
Cross-tabulation of CEFR levels by grades.

| CEFR level \ Grade | 10 | 8 | 7 | 6 | 4 | 2 |
|---|---|---|---|---|---|---|
| B1 (n = 6) | 5 | | 1 | | | |
| B2 (n = 28) | 17 | 5 | | 4 | 1 | 1 |
| C1 (n = 12) | 10 | 1 | 1 | | | |
There is no significant correlation between the CEFR levels allocated by the LEs and the grades assigned by the content expert (r = −0.042). The correlation between grade and each LE separately ranged between −0.037 (LE3) and 0.019 (LE4). Interestingly, the correlation between the automatically calculated overall Grammarly score and the average CEFR level was moderately positive (r = 0.676, p = 0.01).
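The coefficients reported in this section are plain Pearson correlations over per-text score pairs. A minimal sketch with illustrative data (not the study’s), using the coded CEFR scale from Section 2.2.2:

```python
import numpy as np

# Illustrative per-text scores: coded CEFR level (B1 = 1, B2 = 2, C1 = 3)
# paired with the content grade for the same text
cefr   = np.array([2, 1, 3, 2, 2, 3, 1, 2], dtype=float)
grades = np.array([10, 10, 8, 6, 10, 10, 7, 4], dtype=float)

# Pearson r; scipy.stats.pearsonr would additionally return a p-value
r = np.corrcoef(cefr, grades)[0, 1]
print(f"r = {r:.3f}")
```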
Also available to us were the ECs obtained by each student at the end of the first year of study. This gives us a broader view of the effect of language proficiency on study results, as course completions increase the number of ECs. We did not find a correlation between the CEFR levels of the texts in the first study trimester and ECs at the end of the year (r = 0.014). This is illustrated in Figure 1. If students obtain fewer than 42 ECs, they drop out after the first year. This was the case for two students at the B2 level and one student at the C1 level.

Number of ECs obtained by CEFR level.
3.2 Features of ELF writing
We present the features most frequently mentioned by the LEs in this section, reporting their occurrences in the five coding categories we distinguished. Note that the label ‘non-standard’ is at times a matter of the individual LE’s interpretation and judgment, as we left this to their discretion. Table 4 contains the top 10 features.
The LEs’ top 10 features and the coding category they belong to.
Category | Feature | Number of occurrences |
---|---|---|
Academic register | Informal wording | 163 |
Vocabulary | Wrong word | 96 |
Grammar | Sentence structure | 73 |
Coherence | Linkers | 54 |
Comprehensibility | Problematic | 42 |
Vocabulary | Prepositions | 40 |
Comprehensibility | Wording vague | 32 |
Grammar | Relative clause or pronoun | 30 |
Vocabulary | Remove word | 29 |
Vocabulary | Wrong word form | 29 |
3.2.1 Academic register
The LEs indicated divergence from the ‘academic register’ 215 times, of which 163 concerned non-academic vocabulary. To be more precise: these are all remarks on English for general academic purposes, not related to a specific domain. The code group contains codes on general academic wording and codes on academic writing conventions, such as the ‘rules’ to avoid contracted forms, brackets, personal pronouns such as I and you, and conjunctions such as but, and, and so at the beginning of sentences. The assessors generally considered all of these inappropriate in academic texts. Evidently, these forms do not affect comprehensibility. See (1) and (2) below, where the LEs indicated that ‘believe’ should be replaced by the ‘correct’ academic wordings rejected and accepted. This code occurs in almost all texts (43) and at all proficiency levels.[4]
(1) People don’t believe in this hypothesis anymore and even Whorf himself doesn’t. (634)

(2) Nowadays, linguistics actually believe in the weak variation of the Whorfian hypothesis. (794)
3.2.2 Vocabulary
Codes in the category ‘vocabulary’ occurred 207 times, including prepositions (40 times) and linking words (see also coherence). The code ‘choose (a) different word(s)’ occurs 96 times.
The LEs differentiate between words that they qualify as inappropriate for academic English and words that have a different meaning than the one the student probably intended (then for than). Often, the students’ word combinations are rather inventive, like the phrase languageless thinking in (3):
(3) However, this theory has to be crityzed due to the fact, that there is empirical evidence of languageless thinking, for example thinking in pictures. (512)
One LE assigned the code ‘not existing word’; the three others did not even mark this phrase. It seems a fully comprehensible example of a “neologism that makes use of morphological productivity” (Mauranen 2017: 6). Another category is the use of full German words or anglicized German words (the code ‘word from L1’ occurs 15 times), as in after this hypothesis, a literal translation of German nach dieser Hypothese, and therefore for dafür, where for that is meant. In (4) below, the German word Stamm is used instead of tribe. Often, students resort to descriptive writing; the LEs indicate 32 times that words and phrases are vague and thus affect comprehensibility.
(4) There was an example given with mens of a stamm in Africa which can not count more than three so if somebody asked them about a number above three they cannot tell, but is it than that they also don’t know about the numbers because they are that much influenced by there language? (675)
To obtain an overall perspective on lexical proficiency, we added up the comments on ‘wrong’, vague, and inappropriate academic wording per text and compared them to the assigned CEFR level, as can be seen in Figure 2. There seems to be a relation between the number of word codes and the CEFR level, in particular between levels B1 and B2 on the one hand and C1 on the other. This pattern is blurred by the outlying value in C1.

The relation between the sum of perceived inappropriate academic wording, vague wording, and ‘wrong’ word codes, and the CEFR level.
3.2.3 Grammar
The texts contain 8,712 words in total, in which 286 marked cases fit the category ‘grammar’. Table 5 sums up the most frequent grammatical features, which were also observed in earlier studies of non-standard English (cf. Ranta 2018); for instance, number disagreement (one concepts, most of the case, this thoughts). These features occurred at all CEFR levels and irrespective of content grade (see Table 5).
Frequently assigned grammatical features and number of occurrences >10.
| Feature | Number of occurrences |
|---|---|
| Sentence structure incorrect/incomplete | 73 |
| Articles | 47 |
| Relative clause or pronoun | 30 |
| Number disagreement | 28 |
| Verbs: no agreement | 27 |
| Verbs: tense | 18 |
| Reference unclear | 14 |
| Verbs: progressive | 12 |
Not all occurrences will be due to lacking proficiency; some might be slips of the pen due to time pressure in the examination.
The following sentences contain examples of grammar codings (ower in [7] = our):
(5) If I think of Steven Hawking who lost with the time the total ability to speak it show me that a person who is not able to speak, is still able to think in a way most people who can speak are not. (850)

(6) Another example would be, in German is just one word for language. In english is a split into language and speech and it is therefore easier to understand what the other is talking about. (785)

(7) Ower thought are not only influenced by ower language also from feelings, emotion, smell and other factors (906)
3.2.4 Structure and coherence
Eighty-one of the 438 sentences in the data have a note on coherence. The (grammatical) code ‘sentence structure/word order incorrect’ occurred 73 times. Both codes occur at all CEFR levels – in 38 out of 46 texts – and irrespective of content grade. There are notes on ‘incomplete sentences’, for example where the subject or verb is omitted, and comments on the use of non-standard inversion.
3.2.5 Content and comprehensibility
We have 99 instances of codes in the group ‘comprehensibility’. The LEs had many comprehension problems in three texts (four to five notes each). Two of these texts are at the B1 level, the third is at B2. The content grades obtained were a 10, a 7, and a 6. In six cases, our LEs indicated that more context is needed for the less informed reader.
An interesting phenomenon is sentence length, a code that appeared 19 times. Sentences are often quite long: between 36 and 66 words. The LEs only comment on long sentences when they consider their comprehensibility problematic; they do not systematically link sentence length to comprehension disturbances. They consider complex sentences with appropriate linking proof of good proficiency and command of an academic writing style. The LEs also mark too many short sentences without good linkage as problematic for comprehensibility, and they indicate several times that a lack of coherence makes the reasoning in a text difficult to follow. The content experts do not seem to experience comprehension disturbances, maybe because they expected a more ‘telegraphic’ style, as is usual in the genre of answering a quiz question. Examples (8) and (9) were marked as ‘comprehensibility problematic’.
Whorf’s language theory is about the ability of language and it’s skills concluding the level of intelligence. (689) |
A strong aspect of the theory would be, for example, that languages with a stronger pattern of talking ‘future-related’, will more likely have native speakers that are concerned about future investigations that native speakers of languages that do not show a similar future related tense. (863) |
3.3 Comprehensibility: an exemplary text
We addressed the potential negative effect of comprehensibility disturbances in our third research question. Some examples were discussed in the section above. Our data contains texts graded the maximum 10 by the content expert, while the LEs indicate that comprehensibility is problematic for at least parts of the text. We selected one complete answer to demonstrate the absence of a relation between potential comprehensibility problems and the actual grade.
The LEs assigned a B1 and commented that, apart from spelling and grammatical divergences, the answer was hard to follow for a reader who did not know the correct answer to the question. The content expert graded the answer with the maximum score of 10. The answer below contains the markings (gray) and the CEFR evaluation of LE3.

There was one senior lecturer for the subject course who gave the lectures, compiled the exam, and supplied an answer model. Several assistant lecturers corrected the answers based on this model. Given the comments of LE3, the maximum grade of 10 by the content expert can only be understood if the grading was driven by ticking the box for a set of key concepts mentioned in whatever format. We did not have access to the answer model, but the following elements must have been part of it, as they occurred in most answers and can also be found in the answer above:
The relation between language and thought according to Whorf;
The strong version and the word determine or synonyms;
The weak version and the word influence or synonyms;
An indicative example for each of the versions;
The strong version is rejected and the weak version is still current.
Our LEs noted comprehensibility problems, but never to such an extent that they rejected the text; only fragments appeared less comprehensible to them. Five students with B1 level scored a grade of 10; one student scored a grade of 7, which is still sufficient. We conclude that no serious communicative disturbances are signaled in this type of text. Informed content interpretation by the content expert seems to outweigh actual semantic coherence and overrule non-standard grammatical and lexical features. Even our LEs, who are not content experts, commented that they felt the student understood the concept.
4 Discussion
4.1 The relation between English proficiency levels and grades
In relation to our first research question, we found no correlation between proficiency and grades, a clear outcome. On the other hand, the current study might be considered limited in how written NNE was evaluated. Scholars have pointed out that the CEFR scales do not primarily address language skills, particularly not at the higher levels: the descriptors of level B2 and above include academic skills and are very general when it comes to grammatical and lexical ‘mastery’ (Hulstijn 2015). Moreover, the CEFR framework takes ENL as the standard and assumes that interaction takes place with native speakers (McNamara 2012), whereas native English is unrepresentative of the use of English as a lingua franca worldwide (Jenkins and Leung 2017). In 2018, the descriptors were revised and no longer assume approximation of native speakers.[5]
We nevertheless chose to use the CEFR framework, as it is often the label with which students enter the university. The CEFR is used as a standard worldwide by many second language lecturers, developers of tests and learning materials, and researchers. At the same time, our study reveals the limitations of the framework. Interpretations differ when LEs drift away from the ‘can-do approach’ toward arguments based on grammar and lexis in assigning a certain level: “these grammatical features indicate the B1 level.” This can be seen in the differences between the raters: LE1 sometimes based her rating more on grammar, LE2 is a native English speaker who used the ENL standard for grammar, lexis, and formulations, and LE3 and LE4 looked much more at what the students can express. Note, however, that apart from their different focus, there is a high level of consensus among the experts. We also found a moderately positive correlation between the experts’ CEFR scores and the total score assigned by Grammarly, which is based on ENL spelling, grammar, and lexis. This seems to imply that language teachers link CEFR levels at least in significant part to formal aspects of language.
4.2 Features of ELF writing
The LEs provided us with a list of potential linguistic and textual features of written ELF. Two issues stood out: (i) vocabulary/academic register, and (ii) grammar. Although grammatical accuracy is not a central issue in the ELF literature and there is a general tendency to focus on performance and pragmatics, we are also interested in which grammatical divergences from the ENL norm might lead to incomprehension (if at all) on the part of the recipient of the message.
4.2.1 Vocabulary and academic register
Codes concerning the academic register and lexical choices were in our top 10. One might expect the mastery of vocabulary, or the lack thereof, to directly affect the content appreciation of a text. We could not find such an effect in our data: texts deemed to show inadequate use of vocabulary were frequently rewarded with the maximum grade by the content expert. Obviously, one is not looking for sophisticated or even precise wordings in this context.
The number of notes concerning the absence of an academic register is distinctive. We see codes concerning ‘not academic vocabulary’ in almost all our texts (43 out of 46), ranging from B1 to C1 and assessed with content grades ranging from 4 to 10. Our data came from a study on the development of productive vocabulary knowledge in English L2 (De Vos 2019). De Vos found some degree of lexical sophistication and even a positive correlation with content grades. However, the richness of the students’ English vocabulary did not progress during the first year of study.
Apart from vocabulary, the code ‘not academic register’ frequently occurred. We asked the language experts for an estimate of the CEFR level, based on the (attached) general descriptors, and gave them the freedom to name other aspects on which they based their assessment. They often gave ‘EAP-related’ comments, even though they knew the writings were answers to exams. Perhaps this perspective is triggered by their daily work as EAP teachers. Our study revealed that in the context of an examination, it is in the student’s first interest to express themselves as transparently as possible, not to write a sophisticated academic text. Under time pressure they may resort to descriptions if a word just does not come to mind. This might explain De Vos’ outcomes, as she only used answers to examination questions in her study (De Vos 2019). Moreover, the B2 and C1 CEFR levels allow lexical gaps to be solved by circumlocution (Council of Europe 2001), and this is precisely what the students did.
Tiryakioglu et al. (2019) also bring up the relationship between time constraints and word-finding problems in writing a text in an L2. Their subjects performed an argumentative writing task in their L1 Turkish as well as in their L2 English. While writing the L2 text, the subjects needed more time, faced word-related problems, and used more words than in their L1 (Tiryakioglu et al. 2019). Interestingly, in De Vos’ (2019) data, there is also a set of answers to the same examination question written in native Dutch. The average text length in the Dutch answers was lower than in the English answers. The difference that Vander Beken and Brysbaert (2018) found related to testing type – true-false questions versus free recall – may also be attributed to the finding that writing an L2 text takes more time.
Time constraints have certainly played a role in writing the examination answers, causing slips in spelling, disregard of academic conventions, and the use of abbreviations, brackets, and contracted forms. This is not to say that students do not master or use the conventional forms when they have to write essays or papers. Shchemeleva (2022: 19) concludes that “L2 speakers have to invest a lot of efforts and resources into developing their ability to conform to a more normative academic writing in English”, but that at the same time they can create non-conventional expressions that are perfectly comprehensible. In recognizing variability as inherent to ELF writing lies the challenge for EAP teachers to provide instruction and support, instead of sticking to highly conventional norms of English academic writing.
4.2.2 Grammar
Our analyses of the texts revealed many NNE features similar to those reported by other ELF researchers. Features such as non-standard articles, prepositions and affixes, absence of logical connectors, non-standard subject-verb agreement and tense, and non-standard use of the -ing form appear to be familiar forms used by NNE users (Björkman 2013; Mauranen 2018b; Santos 1988). Their familiarity might contribute to not triggering overt communicative disturbance. Moreover, most of these features are used frequently but effectively even by highly proficient L2 users (Jenkins 2011; Mauranen 2018b).
Notes on sentence structure by the LEs appeared frequently. Again, one could attribute this to the type of text: the use of L1 sentence structures might be a case of regression due to the pressing circumstances. One of our LEs frequently pointed out that he had the feeling the students underperformed.
As linguists, we have concentrated on linguistic features in the productive skill of writing. This does not mean that we believe high linguistic (L2) proficiency alone determines successful performance at university. Students also need to develop other skills, such as retrieving relevant information from lectures and literature, taking part in discussions, giving presentations, argumentation, and reasoning. These skills are not to be acquired by L2-speaking students only, but neither can they be seen as separable from language. Our aim is to identify those aspects through which some students, already at the start of their academic career, experience a disadvantage compared to peers who have developed a high level of proficiency in the language.
4.3 Communicative disturbances
The main formal feature that might have caused opacity in our data, in the eyes of our LEs, is sentence length in combination with non-standard word order. The problem of long sentences with unclear meaning occurs at all proficiency levels in our data. However, sentence length in itself does not necessarily obscure meaning: the LEs appreciate complex sentences with effective use of linking as evidence of advanced ELF proficiency and as adding to comprehensibility. This is in line with the findings of Tiryakioglu et al. (2019), whose results indicate a relationship between non-native proficiency level and composing processes in the L2: the formulation process is performed more efficiently when a learner/user is more proficient in that language.
We studied written communication where there is no possibility to resolve misunderstandings and communicative breakdowns. The evaluation by intended readers is crucial in signaling the occurrence of disturbances in communication in the sample of texts. The grades assigned by the content experts do not signal such disturbances in relation to NNE features. There appears to be a sufficient level of mutual comprehensibility. We conclude that the content experts, having the model answer in mind, are not bothered by formulations in English that were considered less adequate by the LEs, while understanding these subject-specific texts might be more difficult for non-content experts. The main conclusion of this study is that evaluators let content prevail over non-native language forms.
One may argue that the language background of the content experts is a significant factor in our study. The content experts are typically ELF speakers themselves. Björkman (2013) observed many of the same NNE features in the language of lecturers as in the language of students. In our study, the content experts roughly share the linguistic background of their students, as German and Dutch are closely related languages. This might explain why the content experts are not distracted when reading ELF.
In an earlier publication, we reported that university lecturers – the content experts in the present study – said they read ‘cooperatively’ when reading non-standard Dutch produced by German students. Mauranen (2017: 8) used the term “augmented cooperativeness between participants”. This applies particularly in the setting of an examination. Often, answers lack cohesion, contain lexical gaps, and lack accuracy of formulation (see the example above), and still the assessor was willing to uncover the student’s intent in a supportive way. In spoken language, conversational partners frequently use pragmatic strategies to achieve communicative effectiveness and to prevent or repair disturbances (Björkman 2013). In writing, the recipient is not present, so it is in the writer’s interest to be as clear and explicit as possible, especially in the setting of an examination. Nevertheless, even when the student does not succeed in the eyes of the LEs, the content expert as an assessor is so cooperative as to interpret the student’s message as informatively as possible. University lecturers unilaterally accept, if not condone, instances of English that diverge from ENL but are nevertheless intelligible to them (Jenkins 2011). By ticking the box for words and phrases representing the key concepts, the assessor accumulates evidence of a student’s understanding of Whorf’s hypothesis.
5 Conclusion, further research, and pedagogical implications
This study aims to contribute to the discussion on the possible detrimental effects of studying in an L2. We addressed three research questions. Firstly, we questioned the relationship between the CEFR level of a short written answer to an open examination question and the grade assigned by content lecturers. There was no correlation between the two. Secondly, we analyzed features of these spontaneously written NNE academic texts, both linguistically and textually. Finally, we explored which non-standard features might cause communicative disturbances and thus influence the content grading of an open examination question. We did not find any feature that caused communicative disturbance for a content grader who is an informed reader with a model answer in mind.
Our findings show that content interpretation prevails over language forms: ELF in this setting is not very problematic for informed content experts. However, students do not only write answers to examination questions. They also need to develop academic writing skills for essays and papers, as in such cases the reader is less sure about the interpretation of the content. Insight into NNE communication at an early stage of the academic career can aid in developing adequate (discipline-specific) support for students settling into university.
We endorse the need for integrated content and language support in the form of dedicated classes that target the academic requirements associated with the subject area (De Vos 2019; Rose et al. 2019; Wingate 2018). Such an approach calls for a change of mindset on the part of the subject lecturers. Developing English language competence is often seen solely as the task of language teachers; academic literacy is often regarded as the responsibility of secondary education and is therefore taken as something first-year students bring with them when they enter university (Wingate 2018). This is not the case for students in their L1, let alone in their L2. Universities should develop programs in which academic writing specialists and subject lecturers work together to improve subject-specific academic language skills (Pitkänen 2018; Wingate 2018).
Funding source: Radboud into Languages, the expertise center for language and communication of the Radboud University Nijmegen.
References
Björkman, Beyza. 2013. English as an Academic Lingua Franca: An investigation of form and communicative effectiveness, vol. 3. Boston & Berlin: Walter de Gruyter. https://doi.org/10.1515/9783110279542.
Council of Europe. 2001. Common European framework of reference for languages: Learning, teaching, assessment. Cambridge: Cambridge University Press. http://ebcl.eu.com/wp-content/uploads/2011/11/CEFR-all-scales-and-all-skills.pdf (accessed 29 November 2022).
Dafouz, Emma & María-del-Mar Camacho-Miñano. 2016. Exploring the impact of English-medium instruction on university student academic achievement: The case of accounting. English for Specific Purposes 44. 57–67. https://doi.org/10.1016/j.esp.2016.06.001.
De Vos, Johanna F. 2019. Naturalistic word learning in a second language. Nijmegen: Radboud University unpublished doctoral dissertation.
De Vos, Johanna F., Herbert Schriefers & Kristin Lemhöfer. 2020. Does study language (Dutch versus English) influence study success of Dutch and German students in the Netherlands? Dutch Journal of Applied Linguistics 9(1/2). 60–78. https://doi.org/10.1075/dujal.19008.dev.
Hulstijn, Jan H. 2015. Language proficiency in native and non-native speakers: Theory and research, vol. 41. Amsterdam & Philadelphia: John Benjamins. https://doi.org/10.1075/lllt.41.
Jenkins, Jennifer. 2011. Accommodating (to) ELF in the international university. Journal of Pragmatics 43(4). 926–936. https://doi.org/10.1016/j.pragma.2010.05.011.
Jenkins, Jennifer & Constant Leung. 2017. Assessing English as a Lingua Franca. In Elana Shohamy, Iair G. Or & Stephen May (eds.), Language testing and assessment (Encyclopedia of Language and Education 7), 1–15. Cham: Springer. https://doi.org/10.1007/978-3-319-02261-1_7.
Klaassen, Renate G. 2001. The international university curriculum: Challenges in English-medium engineering education. Delft: Technische Universiteit Delft unpublished doctoral dissertation.
Mauranen, Anna. 2017. Second-order language contact: English as an Academic Lingua Franca. In Markku Filppula, Yuhani Klemola & Devyani Sharma (eds.), The Oxford handbook of World Englishes, 735–754. Oxford: Oxford University Press.
Mauranen, Anna. 2018a. Conceptualising ELF. In Jennifer Jenkins, Will Baker & Martin Dewey (eds.), The Routledge handbook of English as a Lingua Franca, 7–24. London: Routledge. https://doi.org/10.4324/9781315717173-2.
Mauranen, Anna. 2018b. Second language acquisition, World Englishes, and English as a Lingua Franca (ELF). World Englishes 37(1). 106–119. https://doi.org/10.1111/weng.12306.
Mauranen, Anna & Maria Metsä-Ketelä. 2006. Introduction: English as a Lingua Franca. Nordic Journal of English Studies 5(2). 1–8. https://doi.org/10.35360/njes.9.
McNamara, Tim. 2012. English as a Lingua Franca: The challenge for language testing. Journal of English as a Lingua Franca 1(1). 199–202. https://doi.org/10.1515/jelf-2012-0013.
Pitkänen, Kari K. 2018. From a reader/listener to a speaker/writer: Student views confirm the need to develop English courses further towards productive, interactive skills. Language Learning in Higher Education 8(2). 445–468. https://doi.org/10.1515/cercles-2018-0023.
Ranta, Elina. 2018. Grammar in ELF. In Jennifer Jenkins, Will Baker & Martin Dewey (eds.), The Routledge handbook of English as a Lingua Franca, 242–254. London: Routledge. https://doi.org/10.4324/9781315717173-21.
Rose, Heath, Samantha Curle, Ikuya Aizawa & Gene Thompson. 2019. What drives success in English medium-taught courses? The interplay between language proficiency, academic skills, and motivation. Studies in Higher Education 45(11). 2149–2161. https://doi.org/10.1080/03075079.2019.1590690.
Santos, Terry. 1988. Professors’ reactions to the academic writing of non-native-speaking students. TESOL Quarterly 22(1). 69–89. https://doi.org/10.2307/3587062.
Shchemeleva, Irina. 2022. When the unconventional becomes convention: Epistemic stance in English as a lingua franca research articles. Journal of English as a Lingua Franca 11(1). 1–23. https://doi.org/10.1515/jelf-2022-2075.
Tiryakioglu, Gulay, Elke Peters & Lieven Verschaffel. 2019. The effect of L2 proficiency level on composing processes of EFL learners: Data from keystroke loggings, think alouds and questionnaires. In Eva Lindgren & Kirk P. H. Sullivan (eds.), Observing writing: Insights from keystroke logging and handwriting, 212–235. Leiden & Boston: Brill. https://doi.org/10.1163/9789004392526_011.
Vander Beken, Heleen & Marc Brysbaert. 2018. Studying texts in a second language: The importance of test type. Bilingualism: Language and Cognition 21(5). 1062–1074. https://doi.org/10.1017/S1366728917000189.
Wingate, Ursula. 2018. Academic literacy across the curriculum: Towards a collaborative instructional approach. Language Teaching 51(3). 349–364. https://doi.org/10.1017/s0261444816000264.
Zijlmans, Lidy, Anneke Neijt & Roeland van Hout. 2016. The role of second language in higher education: A case study of German students at a Dutch university. Language Learning in Higher Education 6(2). 473–493. https://doi.org/10.1515/cercles-2016-0026.
Zijlmans, Lidy, Marc van Oostendorp & Roeland van Hout. 2020. Studying in a foreign language: Study performance and experiences of German students at a Dutch university. Language Learning in Higher Education 10(1). 25–51. https://doi.org/10.1515/cercles-2020-2017.
© 2022 the author(s), published by De Gruyter, Berlin/Boston
This work is licensed under the Creative Commons Attribution 4.0 International License.