Abstract
This paper presents a research project, funded by Durham University, investigating the potential of Generative AI (GenAI) for Modern Foreign Languages (MFL) learning, teaching and assessment within the School of Modern Languages and Cultures (MLAC) at Durham University. The study explores how Generative AI can support writing skills and feedback in the MFL context. Through comparative analysis of tasks completed with and without ChatGPT, and feedback by both teachers and ChatGPT, the study investigates the pedagogical implications of emerging technologies in language teaching, with a focus on writing. The project involves 8 language areas, namely: Arabic, Chinese, French, German, Japanese, Italian, Russian, and Spanish, and recruited two participants for each language with a CEFR level of B1-B2. Participants completed diagnostic surveys and written tasks with reflections, sought feedback from ChatGPT and their teachers and reflected on it, and took part in semi-structured interviews and focus groups. The project highlights the potential of GenAI for supporting students’ learning and concerns about its implications for academic integrity and the development of linguistic skills, especially for beginner learners. It also emphasises the need for clear institutional guidelines on the ways in which students can and cannot use GenAI to complete their academic work.
1 Introduction
According to the Oxford English Dictionary (2023), generative artificial intelligence is defined as “artificial intelligence designed to produce output, esp. text or images, previously thought to require human intelligence, typically by using machine learning to extrapolate from large collections of data.” Since ChatGPT was released to the public in late 2022, interest in the field of GenAI has risen sharply, and many other tools have subsequently been launched. ChatGPT remains one of the most popular writing tools and is updated frequently. OpenAI announced that, as of August 2024, ChatGPT had 200 million weekly active users worldwide (Backlinko Team 2024).
GenAI has transformed many aspects of human society, particularly education (Cooper 2023), causing many Higher Education institutions to rethink their assessment methods (Walker 2025). It has the potential to automate tasks, process large quantities of data, and “integrate knowledge of different disciplines and multiple technologies simultaneously” (Yang 2022: 1). The advent of GenAI in education has raised concerns among educators and learners alike. Educators have become concerned about the originality of work submitted by students (Grove 2024), particularly in summative assessments, and students worry about the possibility of AI replacing humans in completing tasks, thus reducing job opportunities available to them (Attewell 2023).
In response to the potential and challenges that GenAI presents for Higher Education, many universities have been keen to study its impact on their practices and their students, and numerous academics have become interested in how these new technologies can affect their respective fields. This paper represents an attempt to understand how GenAI, particularly ChatGPT, can be used by students of Modern Foreign Languages (MFL) to facilitate their writing processes and how they perceive its role and effectiveness in these practices. After offering a brief review of the literature on the use of GenAI in language learning, the article presents the methodology of an experiment conducted in the School of Modern Languages and Cultures (MLAC) and the Centre for Foreign Language Study (CFLS) at Durham University, which aimed to explore the use of ChatGPT in developing students’ writing skills in eight languages. The findings and discussion section focuses on students’ practices and perceptions of ChatGPT.
2 Literature review
Although research on the potential of chatbots as interactional partners for language learning spans several decades (Coniam 2008), recent advances in GenAI have led to a surge in publications exploring the subject-specific properties and challenges of these technologies for language teaching and learning (e.g., Bibauw et al. 2022; Coronado-Badillo and Santana-Negrín 2023; Klímova and Ibna Seraj 2023; Kohnke et al. 2023; Leal and Torres 2023; Mavropoulou 2023; Muñoz Basols and Fuertes Gutiérrez 2024; Popenici and Kerr 2017; Ribes Lafoz et al. 2023; Ricart-Vayá 2024; Román Mendoza 2023; Sarrazola 2023; Xiao and Zhi 2023). This is particularly evident in publications that focus specifically on ChatGPT, given its popularity. Klímova and Ibna Seraj (2023) highlight the positive effects of chatbot use on the development of additional language skills, including intonation, stress and fluency. Quinio and Bidan (2023) posit that ChatGPT offers exceptional opportunities for learning in general, contingent upon the learner assuming an active role and treating the chatbot as a collaborative partner, evaluating its responses and looking for biases or omissions. In this regard, Román Mendoza (2023) cautions against the Anglophone bias and hallucinations (i.e., invented information presented as truthful) inherent in ChatGPT when used for metalinguistic purposes, such as Spanish grammar explanations. The cultural biases inherent in GenAI are further explored in the work of Dai and Hua (2024).
Moreover, in addition to the concerns regarding plagiarism across the higher education sector (Grove 2024), several scholars (e.g., Dooly and Comas-Quinn 2024; Szabó and Szoke 2024) have drawn attention to the accessibility and social justice issues associated with ChatGPT and shared by most technological advancements, namely the perpetuation of the digital gap and the promotion of an elite form of multilingualism, with some languages (and their communities of speakers) being more represented than others.
Nevertheless, it is beyond doubt that the field of foreign language teaching and learning is undergoing a paradigm shift, where ChatGPT must be regarded as a “new interactional dimension with the target language” that is here to stay (Muñoz Basols and Fuertes Gutiérrez 2024: 347). Additional language education is indeed becoming increasingly technology-based, and more experimental studies on the practical use of the latest technologies with clear pedagogical outcomes are needed (Klímova et al. 2023). There has been significant growth in the literature examining the effective use of GenAI tools by language teachers to create pedagogical content (e.g., Mavropoulou 2023). However, empirical studies investigating how learners employ ChatGPT or similar tools in autonomous ways (Xiao and Zhi 2023), particularly drawing on critical and multilingual perspectives, remain limited. This study aims to address this gap by exploring the following research question: In what ways do students incorporate ChatGPT into their MFL writing practices and how do they perceive its role and effectiveness in this process? In pursuing this research question, we aim to shed light on how GenAI tools such as ChatGPT could affect students’ approaches to producing coherent writing in their target languages, and how the use of ChatGPT could help students to develop inner-feedback, evaluation, comparison and prompt-writing skills specific to learning MFL. We also aim to understand students’ experiences and reflections on incorporating ChatGPT into their writing processes to inform potential enhancements in MFL teaching, assessment and feedback practices, and to better support and guide students in using ChatGPT ethically, responsibly and effectively in their language learning process.
3 Methodology
This project involved the eight language areas represented in the MFL degrees offered at Durham University, namely: Arabic, Chinese, French, German, Italian, Japanese, Russian and Spanish. Although the project focused on developing foreign language writing skills, one of the participating researchers investigated the use of ChatGPT in the teaching of Italian medieval literature. After obtaining ethical approval, data collection began. The team recruited a research assistant and 18 research participants: four students from Italian, and two from each other language area. All students had a B1-B2 level of proficiency in their target language according to the CEFR.
The project collected different types of data over the stages of preparation, writing, reflection, and evaluation. First, students were asked to complete a benchmarking survey (see Appendix 1). The purpose of the survey was to collect demographic data about the students, including their age group, previous experience with language learning, perceived level in the language they were studying, and previous experience with ChatGPT or other AI tools. Secondly, students were asked to complete a writing task without drawing on AI tools and to answer questions reflecting on their experience (see Appendix 2). The questions dealt with the writing process, how students prepare for writing, the resources they use while writing and the skills they develop in the process. Thirdly, students were asked to complete the same writing task with ChatGPT and to write another reflection (see Appendix 2). The questions this time focused on how they used ChatGPT in writing, and their perceptions of its role in the writing process. In both cases, the writing task had a word limit of 250–400 words, and students were advised to complete it within 1.5 hours. The essay was designed to be personal and reflective, ensuring that students could not get straightforward answers from ChatGPT and prompting them to interact with the tool independently in creative ways. The task instructions read as follows: “Reflect on your experience as a language student at Durham University, making specific reference to modules you studied, skills you acquired, and whether you have used your skills outside the classroom. How does your experience correspond to your expectations before joining?”
After the students had completed both writing tasks and reflections, the research assistant anonymised all the data and sent the essays and reflections to the relevant member of the research team for assessment and feedback. Following a common set of assessment criteria (including content, coherence and cohesion, accuracy, and language repertoire), the teacher-researchers provided written feedback on all writing tasks, without being told which ones had been written with the aid of ChatGPT and which ones without. The research assistant collected this feedback from teachers and passed it to the relevant students. After that, participants were asked to prompt ChatGPT for feedback on their essays. Students were then asked to reflect on the usefulness of the feedback received (see Appendix 3). Finally, they attended semi-structured interviews and a focus group, where they had the chance to reflect again on their experience with the research assistant and other participants in the project (see Appendix 4).
All audio-recorded data from the interviews and focus groups were transcribed by the research assistant, and a thematic analysis of all participant data was conducted collaboratively, and multilingually (Holmes et al. 2013), in different phases. Following Braun and Clarke's (2006) approach to thematic analysis, first, the teacher-researchers familiarised themselves with the data and shared a summary of the main points that emerged from the chat history and audio-recorded data in their language area. For instance, the Arabic teacher-researcher analysed and summarised for the rest of the research team the interactions of the two Arabic student-participants with ChatGPT, in conjunction with their texts in the target language and their interview data. The summaries of the interview and chat histories from each language area led to the generation of some initial codes that informed the search for overarching themes and sub-themes by the multilingual team, which in turn informed the coding process carried out by the research assistant using NVivo. In this paper, we present an analysis of the NVivo codes generated under each of the following overarching themes: methodology for completing the task, own task competence, reflection on experience, GenAI capability, ethics of using GenAI, and the future of GenAI in language learning.
4 Findings and discussion
This section presents an analysis of the key codes derived from the interviews and focus group transcripts via a thematic analysis conducted with NVivo. Each code captures a recurring theme of the participants’ narratives, representing the primary topics and concerns they raised. The accompanying tables summarise the codes, the number of students who referenced them (N for number) and the frequency with which each topic was mentioned (F for frequency). A detailed interpretation of each code is presented below, along with a discussion of the findings supported by quotations drawn from participants’ data in combination with the researchers’ views on their chat histories and texts produced in the target language. The section is divided into two subsections that will explore participants’ practices in using ChatGPT during the experiment and their perceptions of it.
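To make the distinction between N and F concrete, the following minimal sketch shows how the two figures can be derived from a list of coded passages. It is an illustration only and does not reproduce the NVivo procedure used in this study; the participant identifiers, the coded segments and the function name `summarise_codes` are all invented for the example.

```python
from collections import defaultdict

# Hypothetical coded segments: one (participant_id, code) pair per coded passage.
# These identifiers and counts are invented and are NOT data from the study.
coded_segments = [
    ("P01", "Prompt formulation"),
    ("P01", "Prompt formulation"),
    ("P02", "Prompt formulation"),
    ("P02", "GenAI errors"),
    ("P03", "GenAI errors"),
]


def summarise_codes(segments):
    """Return {code: (N, F)}: N = distinct participants, F = total mentions."""
    participants = defaultdict(set)  # code -> set of participants who raised it
    frequency = defaultdict(int)     # code -> number of coded passages
    for participant_id, code in segments:
        participants[code].add(participant_id)
        frequency[code] += 1
    return {code: (len(participants[code]), frequency[code]) for code in frequency}


summary = summarise_codes(coded_segments)
for code, (n, f) in sorted(summary.items()):
    print(f"{code}: N={n}, F={f}")
```

In this invented example, a code raised three times by two students has N = 2 and F = 3, which is why F can exceed N whenever a participant returns to the same topic more than once.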
4.1 Students’ use of ChatGPT and reflection on their writing
4.1.1 Students’ methodology for completing tasks
Across the various languages represented in the experiment, the students’ chat history data showed that they used ChatGPT to generate an essay plan, to correct their grammar and to provide content and ideas. Some participants engaged in a collaborative dialogue with ChatGPT, adopting an enquiry-based approach. For instance, a participant in Chinese asked about translations of specific sentences and words, sought definitions of certain terms, and requested that ChatGPT analyse sentences. The student also questioned ChatGPT’s word choices, asking why it had opted for certain words over others. Furthermore, they sought clarification on the distinctions between various words and requested example sentences demonstrating particular grammatical constructions. At the other end of the spectrum, one of the participants in French used ChatGPT exclusively as a vocabulary tool, asking about essential vocabulary related to university, how to use “encore plus”, connecting words, and French words for “as well”. Similarly, one learner of Spanish, after asking for an essay plan, prompted ChatGPT to provide “some connectives or phrases” in the target language to be used at the beginning of each paragraph. The other participant in Spanish initiated the dialogue with ChatGPT by asking “Can you write essays in Spanish?” Subsequently, the student entered the writing task prompt in the chat and copied and pasted the entire essay generated by the machine, without making any modifications. The students of Arabic did not use ChatGPT to produce texts for them at all. One student used it to answer grammar and vocabulary questions, and the other used it to translate phrases and to answer questions. A minority of students (3 out of 18) reported that using ChatGPT to complete this writing task required less time than their usual writing tasks. These varied experiences suggest that not all students understood how to use ChatGPT efficiently for learning.
Indeed, many participants did not have any prior experience of ChatGPT, and the data challenged some of the expectations we had about students’ digital literacy, considering their “digital nativeness” (Ng 2012). It revealed that students had varying levels of digital skills and several of them struggled with prompt writing, as will be discussed below. This skill is essential for the optimal use of ChatGPT and the difficulty in forming effective prompts had an impact on the participants’ perception of ChatGPT (Tables 1–6).
Table 1: Methodology for completing tasks. Codes: Approach to working with GenAI; Tasks GenAI asked to complete; Time taken on task; Feedback.
Table 2: Own task competence. Codes: Care put into work; Motivation during tasks; Self-critique.
Table 3: Reflection on experience. Codes: Future usage of GenAI; Prompt formulation; Reflections on GenAI; Retention of learning; Using GenAI to progress.
Table 4: GenAI capability. Codes: GenAI errors; GenAI is capable; GenAI lacks ability; GenAI speed; GenAI tone; Comparison with other tools.
Table 5: Ethics of using GenAI. Codes: Trust in GenAI; Risks; University GenAI guidelines.
Table 6: Future of GenAI in language learning. Codes: Hypothetical future usage; Teaching how to use GenAI.
Participants were asked to compare the feedback provided by their teachers and by ChatGPT. While several participants noted that ChatGPT’s feedback was helpful, some students thought it was not helpful at all. Participants enjoyed the immediacy of ChatGPT’s feedback, with one participant in Chinese noting that they appreciated ChatGPT’s “unlimited patience” in providing detailed, sentence-by-sentence analysis, including grammatical mistakes and improvement suggestions (see also Xiao and Zhi 2023). Overall, however, participants preferred the feedback they received from teachers. They felt that the feedback provided by ChatGPT was often too vague and general, as well as potentially inaccurate. Several participants observed that ChatGPT cannot tell whether an expression is idiomatic and were disappointed to find that the revised version of their essays still contained mistakes, as pointed out by the teacher in their feedback. Moreover, ChatGPT appeared to have a “people pleasing” tendency and its feedback tended to be overly generous. For example, one of the participating students in French asked ChatGPT to mark their essay. The original essay received a mark of 90%. The participant then submitted a revised version of the essay and asked ChatGPT to assess it, based on the prompt: “How accurate is the French in this essay?” ChatGPT provided detailed feedback with a slightly lower mark of 85%. While the student was very pleased to obtain such high marks, they observed that they would not be able to achieve such marks at university and that it would be helpful to supply ChatGPT with a marking scheme in order to receive more accurate feedback.
4.1.2 Students’ reflection on their own task competence
When reflecting on their ability to complete tasks with or without the use of GenAI, some respondents said that they tended to be less meticulous when using GenAI because it addressed grammatical issues and offered linguistic variation. However, several participants also observed that ChatGPT occasionally made errors and tended to accommodate users when questioned, so they emphasised the importance of being critical and analytical when integrating GenAI into writing tasks.
In terms of motivation, most students reported feeling more motivated when using ChatGPT because it improved their performance and increased the speed at which they could complete the task. This echoes previous studies of students’ motivation and engagement (Nghi et al. 2019; Klímova and Ibna Seraj 2023). For example, one student who was studying Japanese mentioned that ChatGPT helped alleviate their anxiety, because they lacked confidence in learning foreign languages, especially in writing. However, some respondents reported feeling less motivated when using ChatGPT. One, who was studying German, stated:
“[…] by the time I’d finished writing the second one with ChatGPT I just thought, I don’t really want to do this anymore. I felt a bit demotivated and like my work wouldn’t be good enough without ChatGPT. […] I feel more proud of the work that I did by myself.”
Some participants had a more neutral position on GenAI. For example, a student studying Italian said:
“ChatGPT maybe impact my confidence in writing in English, […] but it does not substitute my effort to improve my skills. […] it can strengthen my confidence, but not determine it.”
Overall, many respondents acknowledged that ChatGPT had supported their language learning process by promoting self-critique via immediate and tailored feedback, which encouraged active engagement and reflection in their language studies. Such engagement may, in turn, help learners develop their autonomy, defined by Palfreyman (2014: 182) as:
the capacity for intentional use in context of a range of interacting resources toward learning goals. (…) The autonomous learner will identify in her environment resources relevant to her purposes, make effective use of these, be open to new affordances in her environment and be able to adapt to changing circumstances by seeking out new resources or adopting new ways of using them for learning.
However, in line with other studies on the topic (e.g., Szabó and Szoke 2024), our findings affirm that care must be taken when guiding students in the use of GenAI, as some participants were not necessarily aware of their overreliance on ChatGPT to construct their sentences and texts, at the risk of bypassing learning and distancing themselves from their own voice in the target language. Learners must be equipped with the skills to be critical and analytical about the work generated by GenAI, and to understand GenAI systems’ tendency to hallucinate (Román Mendoza 2023; Szabó and Szoke 2024). Furthermore, students should be provided with foundational knowledge about how large language models work, so they can recognise that making mistakes (including errors that a GenAI system might not make) is a natural and essential part of language acquisition, and one that has to do with authenticity and finding one’s own voice in a new language (van Lier 1996). Indeed, mistakes should be embraced as part of the learning process. GenAI, therefore, should be regarded as a supportive learning and guidance tool to assist learners, rather than as a tool with which to compete.
4.1.3 Students’ reflection on their experiences of GenAI
In reflecting upon their experiences using ChatGPT for writing tasks, the majority of students expressed a willingness to continue using it for work and study. However, some identified specific scenarios in which they would refrain from using ChatGPT. For example, a student who was studying Russian said that they would avoid using ChatGPT for translation because completing translation tasks independently was more beneficial for their language learning. They also indicated that they would not rely on ChatGPT for literary analysis, citing concerns about the trustworthiness of its information sources, nor use it to generate full sentences or essays, saying, “You could potentially just stop thinking, […] and […] not make any progress”.
Indeed, 11 out of 18 students reported challenges with prompt formulation while using ChatGPT for writing tasks. For example, some who were studying Japanese mentioned that they “could only come up with limited prompts” and encountered difficulties in crafting the right prompt, providing detailed instructions to improve their essays, or adjusting the prompt when the ChatGPT output was unsatisfactory. A respondent studying French expressed a similar concern, stating: “It was really annoying, because I couldn’t figure out what to say to get the answer I wanted. But when I did, the answer was good”.
Most of the participants stated that their experience using ChatGPT had provided them with valuable insights into GenAI’s capabilities, particularly to support the development of their grammar and writing skills. One student learning Russian shared that “it was a mind-blowing discovery that everything’s better with ChatGPT”, referring to the end product. However, several students also voiced concerns and negative views about the use of GenAI in language learning. Half of the students felt that the work produced through ChatGPT was not their own and believed that it was more harmful than beneficial in terms of learning retention.
4.2 Students’ perception of ChatGPT
4.2.1 GenAI capability
As noted, several participants were impressed by ChatGPT’s ability in the target language. One of the participants in French was surprised to find out that ChatGPT could write a better essay in French than she could, and across languages, most participants believed that ChatGPT was more proficient than they were. For instance, one of the participants in Chinese, while acknowledging their own proficiency limitations, found ChatGPT’s Chinese output to be fluent. The student suggested that this was likely due to the abundance of Chinese training data available. By contrast, participants in Arabic did not use ChatGPT as a writing tool, as they had doubts about its linguistic ability. At the same time, students noted that while ChatGPT was often better than they were in terms of grammatical accuracy, it was not always helpful in terms of creating interesting content. Participants observed that ChatGPT often repeated structures and ideas and was not capable of generating in-depth content, only formulating vague answers. One of the participants in French suggested that ChatGPT was helpful for writing short essays but was too generic for literary analysis and more substantial pieces of writing. Students additionally mentioned that ChatGPT lacked cultural awareness and personality. For example, a participant in Spanish noted that “[i]t lacks nuances in terms of […] how normal Spanish people speak”. Similarly, one of the participating students in Chinese expressed uncertainty about ChatGPT’s cultural sensitivity, particularly regarding nuanced word choices and context-specific language use.
When comparing ChatGPT to other AI tools, participants overall thought that ChatGPT was better than Google Translate and other translation tools. Participants appreciated how comprehensive ChatGPT is, combining several tools (used for paraphrasing, translating, looking for synonyms, etc.) in one. Several participants noted how fast and convenient it was for refining their own language use. However, participants expressed concerns, including in the initial benchmark survey, about over-reliance on GenAI. As mentioned earlier, the researchers had access to participants’ saved chat histories with ChatGPT, which revealed how they had utilised the tool to complete the writing task and obtain feedback. These were valuable empirical data that complemented the participants’ reflections and comments gathered through the other data collection methods. For instance, there were notable differences in how students acted on these shared concerns in their actual usage of ChatGPT: their chat histories revealed that some students relied heavily on ChatGPT to generate ideas and structure, as well as to correct and even translate text in the target language.
In addition, several participants noted that ChatGPT could not replace human teachers. A participant in Chinese believed that ChatGPT cannot completely replace language teachers, but that it can handle some basic tasks. They thought ChatGPT was capable of delivering grammar lessons, explaining new constructions, and providing examples. However, for more complex tasks like improving language use in context, understanding subject nuances, or exploring deeper linguistic concepts, the student felt that a teacher or trained linguist would be superior. Similarly, a participating student in Spanish noted that the human element is essential in language learning: “it does take out the human and cultural element of it. I think one of the things I’ve loved learning about my target language is also […] the cultural accumulation, the learning of the cultural history of the language with it.” To our surprise, participants did not mention the risk of obtaining hallucinations when prompting ChatGPT for grammar explanations (see Román Mendoza 2023).
4.2.2 Ethics of using GenAI
Participants across languages shared concerns about plagiarism and the need for clear university guidelines on the ethical use of GenAI. In addition, participants expressed their concerns around ethical questions of authorship, ownership and authenticity. A participant in French expressed her opposition to using ChatGPT for marked assessment, while at the same time feeling confused about ethical boundaries:
“I definitely wouldn’t use it for […] an essay for class that we’re gonna mark in class, […] I don’t think it’s appropriate. It is cheating, because it’s not your own ideas, but also that being said […] you’re allowed to quote research papers and stuff in your exams. So, is that not the same thing? Oh, I don’t know, actually, because I guess I copy other people’s ideas, anyway by using like journal articles so sorry. I don’t know what my answer is.”
At the same time, ChatGPT was also perceived as a source of confidence by some participants. The knowledge that they could write without making mistakes was perceived as a positive part of the experience. Indeed, the question of confidence and trust appeared repeatedly in students’ responses. Students’ trust in GenAI’s ability was mixed. Some participants highly trusted ChatGPT. For instance, a participant in Chinese rated their trust in ChatGPT at 7 out of 10, expressing higher confidence in its content knowledge and cultural explanations than in its ability to generate or translate full texts. The participant viewed GenAI as an educational tool comparable to dictionaries or textbooks, maintaining that work produced with GenAI assistance was still fundamentally their own. Other participating students, notably in Russian studies, had no trust at all in ChatGPT. Conversely, some participants did not have much trust in their own language skills and felt that ChatGPT could do better than they could.
Hence, while ChatGPT can be seen as a valuable tool for enhancing confidence and language accuracy, participants displayed divided perceptions of trust in its use for language learning. The lack of clear guidelines on its ethical use can generate confusion amongst students, particularly regarding its role in assessments and the boundaries of authorship and authenticity. In this regard, although his work predates the advent of GenAI, the three conditions identified by van Lier (1996) for developing one’s own voice in a foreign language may still be relevant today: awareness of language and learning; autonomy and self-determination in using the language and in the learning processes; and authenticity in communicative events. The dilemma for many is how to determine the level of authenticity of a text written with the help of GenAI. While some of the participants viewed ChatGPT as a supportive tool, others were sceptical of its cultural and linguistic reliability or worried about its impact on their independence and skills. As noted above, these mixed reactions highlight the need for institutions to establish clear guidelines and promote a balanced approach, as also suggested by Xiao and Zhi (2023) in their study of students’ perception of ChatGPT, emphasising ChatGPT as a complementary tool rather than a replacement for personal effort and critical thinking (Leal and Torres 2023).
4.2.3 Future of GenAI in language learning
Regarding the hypothetical use of GenAI in language learning, one-third of students highlighted its potential to support their language studies as a tutor. For example, one student who was studying Spanish suggested that it could be advantageous to submit a selection of recently written essays to GenAI, which could then summarise the recurring errors that would typically be identified by a teacher and generate tailored exercises for the student. This approach was described as highly beneficial for fostering independent learning. A different student, who was studying Italian, emphasised the use of GenAI for improving their grammar and identifying resources for practising specific grammatical structures. Similarly, one-third of the students said that using GenAI for speaking practice would be useful, particularly for improving pronunciation (see Klímova and Ibna Seraj 2023). Other students noted the potential of using GenAI to generate tailored exercises, such as vocabulary tests.
Students evidently understand their language learning needs and recognise that GenAI could be a tool to support their studies. However, caution must be exercised to avoid over-reliance on GenAI in the language learning process (Darwin et al. 2023). For example, one student, who was learning Chinese, mentioned that they hoped to use GenAI to summarise texts when pressed for time during reading exercises. While GenAI could potentially improve reading efficiency for learners, it may also deprive students of opportunities to develop and strengthen their reading skills. Some students suggested that introductory classes on GenAI and its applications would be helpful. We also believe that it is essential to train teachers to guide students in using GenAI appropriately and effectively to support their learning.
As technology continues to develop at an unprecedented pace, the integration of GenAI in language teaching and learning has become inevitable. It is encouraging that some students have already recognised both the advantages and the limitations of GenAI. To guide students in using GenAI effectively, responsibly and ethically for independent learning, it is crucial for teachers to emphasise the importance of actively engaging with the learning material and environment, so that GenAI is used as a tool to support students’ learning rather than as a quick solution that replaces their cognitive processes and effort. It is also essential to equip students with the necessary skills for prompt engineering so that they can craft precise prompts, increase their learning efficiency and gain more meaningful insights while interacting with AI. Following this research project, several members of the team have been involved in the development of guidelines in MLaC and CFLS that advise students on the ways in which they can and cannot use GenAI in completing their tasks and on how to acknowledge its use when permitted. The guidelines clearly establish forbidden uses of AI (such as plagiarism, direct submission of AI-generated content, or the use of AI to redraft students’ own writing) and encourage students to use AI as a tool to support their own independent thinking and language learning process, through idea generation, dialogue and translation assistance. They also state that any use of AI should be explicitly referenced. The guidelines seek to address ethical concerns as well as students’ need for guidance and, like our research project, they engage proactively with AI as a language learning tool.
5 Conclusions
Our study demonstrates that students perceive ChatGPT as offering several advantages for developing writing skills in the target language by providing immediate support and feedback to language learners. This could potentially promote learner autonomy while offering personalised feedback, which human teachers cannot always provide. Its high proficiency in the target language ensures a satisfactory level of accuracy in the end product, making it a reliable tool for language practice, although this was not the case for all the languages that formed part of our study. However, as noted, its effectiveness in producing a meaningful text and in providing constructive feedback that students can learn from is contingent on the user’s digital skills, such as prompt engineering. Therefore, although ChatGPT has the potential to enhance learners’ autonomy, a certain level of digital and autonomous learning skills is necessary to capitalise effectively on its potential for improving writing skills and overall proficiency in the target language (Szabó and Szoke 2024).
ChatGPT may inadvertently encourage repetition of structures and ideas. Additionally, its explanations of complex grammar, idiomatic expressions, and cultural nuances can be unreliable, often reflecting Anglocentric biases and producing hallucinations (Dai and Hua 2024; Román Mendoza 2023). Feedback can sometimes be generic or overly vague, and the tool’s tendency to “please” users might limit its ability to provide critical, constructive input. These shortcomings, combined with concerns about academic integrity and plagiarism due to learners’ potential overreliance on the technology, suggest that while ChatGPT is a valuable tool, it is not without limitations. Our study engaged with students’ perceptions of some of these limitations based on a small sample of participants. Future research could benefit from larger samples across diverse language areas. Additionally, it would be worth conducting empirical studies to investigate students’ use of GenAI tools for modern language learning following structured pedagogical guidance. This would provide valuable insights into the effectiveness of such pedagogical tools and their impact on student learning outcomes and on students’ perceptions of incorporating new technologies in language learning. With the development of more GenAI tools and the upgrading of existing ones, more studies are needed on how students interact with such tools and how they are impacted by them.
When guiding students through using ChatGPT for writing, it is important that teachers remind students of the possible consequences of overreliance on GenAI and of using it as a quick solution to tasks, which will hinder rather than facilitate learners’ language acquisition process, especially for beginners. We should promote an interactive, enquiry-based approach to students’ interaction with GenAI, motivating them to pose follow-up questions to obtain relevant and constructive feedback on their writing, and to cross-check AI-generated responses on complex topics (including grammar explanations) against other sources to verify information. By doing so, students will continue to actively participate in their learning process and develop critical thinking skills alongside subject knowledge (Darwin et al. 2023; Leal and Torres 2023). It is crucial to stress to students that GenAI’s responses and explanations can be flawed, and that these tools do hallucinate. This is another reason why the role of teachers in fostering students’ language awareness and analytical skills is paramount, as applying one’s own reasoning to scrutinise and evaluate GenAI’s output is a vital skill in the digital age (Attewell 2023).
Funding source: Durham University
Award Identifier / Grant number: Collaborative Innovation Grants (CIGs)
Acknowledgment
We are grateful to the Durham Centre for Academic Development (DCAD) at Durham University for funding this project. We are also grateful to all our colleagues who have contributed to the project by participating in research design, recruiting research participants, providing feedback, and qualitatively examining students’ essays and reflections for their respective languages, namely: Laura Lewis (German), Luca Malici and Lorenzo Dell’Oso (Italian language and medieval literature), Kaoru Umezawa (Japanese), Olga Zabotkina and Ekaterina Chown (Russian) and Mari Maya Medina (Spanish). Finally, we would like to thank Thomas Garth, our research assistant; James Youdale, our Project Manager; and Madeleine Jablonowska, our student advisor.
Appendix 1: Benchmarking survey questions
Demographic data
Mother tongue
Other languages
Target language proficiency
Previous exposure to generative AI
Learning needs and preferences
Appendix 2: Reflection questions on the writing process
Reflection on writing without ChatGPT:
Before writing
While writing
After writing
Reflection on writing with ChatGPT:
Before writing
While writing
After writing
Appendix 3: Reflection on feedback
How satisfied were you with the quality of the feedback that your teacher provided? What aspects were most and least helpful?
How satisfied were you with the quality of the feedback ChatGPT provided? What aspects were most and least helpful?
Appendix 4: Interview and focus group questions
Reflections on experience
Outline of methodology for completing task
Estimations of own task competence
Task complexity
Project research design
References
Attewell, Sue. 2023. How will generative AI affect students and employment? Prospects Luminate: https://luminate.prospects.ac.uk/how-will-generative-ai-affect-students-and-employment (accessed 26 November 2024).
Baclinko Team. 2024. ChatGPT/OpenAI statistics: How many people use ChatGPT? Baclinko. https://backlinko.com/chatgpt-stats (accessed 28 November 2024).
Bibauw, Serge, Thomas François & Piet Desmet. 2022. Dialogue systems for language learning: Chatbots and beyond. In Nicole Ziegler & Marta González-Lloret (eds.), The Routledge handbook of second language acquisition and technology, 121–134. London: Routledge. https://doi.org/10.4324/9781351117586-12.
Braun, Virginia & Victoria Clarke. 2006. Using thematic analysis in psychology. Qualitative Research in Psychology 3(2). 77–101. https://doi.org/10.1191/1478088706qp063oa.
Coniam, David. 2008. Evaluating the language resources of chatbots for their potential in English as a second language. ReCALL 20(1). 98–116. https://doi.org/10.1017/s0958344008000815.
Cooper, Grant. 2023. Examining science education in ChatGPT: An exploratory study of generative artificial intelligence. Journal of Science Education and Technology 32(3). 444–452. https://doi.org/10.1007/s10956-023-10039-y (accessed 28 November 2024).
Coronado Badillo, Dolores & Leticia Santana Negrín. 2023. ChatGPT en el aula de ELE: entre la fascinación y la preocupación. Propuestas didácticas, reflexiones y sugerencias de uso para docentes. In Presentation at the III Congreso Internacional de profesores de ELE, Universitat Jaume I. Enseñanza formal, informal y no formal.
Dai, David W. & Zhu Hua. 2025. When AI meets intercultural communication: New frontiers, new agendas. Applied Linguistics Review 16(2). 747–751. https://doi.org/10.1515/applirev-2024-0185.
Darwin, Diyenti Rusdin, Nur Mukminatien, Nunung Suryati, Ekaning D. Laksmi, Marzuki & Marzuki. 2023. Critical thinking in the AI era: An exploration of EFL students’ perceptions, benefits, and limitations. Cogent Education 11(1). https://doi.org/10.1080/2331186X.2023.2290342 (accessed 28 November 2024).
Dooly, Melinda & Anna Comas-Quinn. 2024. Accesibilidad a la tecnología y justicia social. In Javier Muñoz Basols, Mara Fuertes Gutiérrez & Luis Cerezo (eds.), La enseñanza del español mediada por tecnología. De la justicia social a la Inteligencia Artificial (IA), 23–47. London: Routledge. https://doi.org/10.4324/9781003146391-3.
Grove, Jack. 2024. Student AI cheating cases soar at UK universities. Higher Education. https://www.timeshighereducation.com/news/student-ai-cheating-cases-soar-uk-universities (accessed 26 November 2024).
Holmes, Prue, Richard Fay, Jane Andrews & Mariam Attia. 2013. Researching multilingually: New theoretical and methodological directions. International Journal of Applied Linguistics 23(3). 285–299. https://doi.org/10.1111/ijal.12038.
Klímova, Blanka & Prodhan Mahbub Ibna Seraj. 2023. The use of chatbots in university EFL settings: Research trends and pedagogical implications. Frontiers in Psychology 14. https://doi.org/10.3389/fpsyg.2023.1131506 (accessed 26 November 2024).
Klímova, Blanka, Marcel Pikhart, Petra Polakova, Miloslava Cerna, Sule Yildirim Yayilgan & Sarang Shaikh. 2023. A systematic review on the use of emerging technologies in teaching English as an applied language at the university level. Systems 11(1). https://doi.org/10.3390/systems11010042 (accessed 26 November 2024).
Kohnke, Lucas, Benjamin Luke Moorhouse & Di Zou. 2023. ChatGPT for language teaching and learning. RELC Journal 54(2). 537–550. https://doi.org/10.1177/00336882231162868.
Leal, Isabel & Lola Torres. 2023. Cómo utilizar ChatGPT en el aprendizaje de lenguas: procesos y pensamiento crítico. Campamento Norte. https://campamentonorte.com/como-utilizar-chatgpt-en-el-aprendizaje-de-lenguas-procesos-y-pensamiento-critico/ (accessed 26 November 2024).
Mavropoulou, Eleni. 2023. Exploitation de l’intelligence artificielle dans l’enseignement du français langue étrangère sur objectifs spécifiques: une étude de cas. RA2LC 8. 63–70.
Muñoz Basols, Javier & Mara Fuertes Gutiérrez. 2024. Oportunidades de la Inteligencia Artificial (IA) en la enseñanza y el aprendizaje de lenguas. In Javier Muñoz Basols, Mara Fuertes Gutiérrez & Luis Cerezo (eds.), La enseñanza del español mediada por tecnología. De la justicia social a la Inteligencia Artificial (IA), 343–364. London: Routledge. https://doi.org/10.4324/9781003146391.
Ng, Wan. 2012. Can we teach digital natives digital literacy? Computers & Education 59(3). 1065–1078. https://doi.org/10.1016/j.compedu.2012.04.016 (accessed 26 November 2024).
Nghi, Tran Tin, Tran Huu Phuc & Thang Nguyen Tat. 2019. Applying AI chatbot for teaching a foreign language: An empirical research. International Journal of Scientific and Technology Research 8(11). 897–902.
Oxford English Dictionary, s. v. 2023. Generative artificial intelligence. https://doi.org/10.1093/OED/9657191441 (accessed 28 November 2024).
Palfreyman, David M. 2014. The ecology of learner autonomy. In Garold Murray (ed.), Social dimensions of autonomy in language learning, 175–191. London: Palgrave Macmillan. https://doi.org/10.1057/9781137290243_10.
Popenici, Stefan A. & Sharon Kerr. 2017. Exploring the impact of artificial intelligence on teaching and learning in higher education. Research and Practice in Technology Enhanced Learning 12(1). 1–13. https://doi.org/10.1186/s41039-017-0062-8.
Quinio, Bernard & Marc Bidan. 2023. ChatGPT: Un robot conversationnel peut-il enseigner. Management et Datascience 7(1). https://doi.org/10.36863/mds.a.22060 (accessed 28 November 2024).
Ribes Lafoz, María & Borja Navarro Colorado. 2023. Aprovechamiento de ChatGPT en la enseñanza de lengua extranjera en educación superior. In Delfín Ortega-Sánchez & Alexander López-Padrón (eds.), Educación y sociedad: claves interdisciplinares, 1264–1271. Barcelona: Ediciones Octaedro.
Ricart-Vayá, Alicia. 2024. ChatGPT como herramienta para mejorar la expresión escrita en inglés como lengua extranjera. Íkala, Revista de Lenguaje y Cultura 29(2). 1–16. https://doi.org/10.17533/udea.ikala.354584.
Román Mendoza, Esperanza. 2023. El arte de formular preguntas para comprender las respuestas: ChatGPT como agente conversacional en el aprendizaje de español como segunda lengua. MarcoELE, Revista de didáctica ELE 36. 1–18. https://marcoele.com/chatgpt-como-agente-conversacional/ (accessed 26 November 2024). https://doi.org/10.18002/sin.v18i1.8428.
Sarrazola, Andrés. 2023. Uso de ChatGPT como herramienta en las aulas de clase. Revista EIA 20(40). 1–23. https://doi.org/10.24050/reia.v20i40.1708.
Szabó, Fruzsina & Joanna Szoke. 2024. How does generative AI promote autonomy and inclusivity in language teaching? ELT Journal 78(4). 478–488. https://doi.org/10.1093/elt/ccae052 (accessed 28 November 2024).
van Lier, Leo. 1996. Interaction in the language curriculum: Awareness, autonomy and authenticity. London: Longman.
Walker, Simon. 2025. Trends in assessment in higher education: Considerations for policy and practice. Jisc Report. https://repository.jisc.ac.uk/9887/1/trends-in-assessment-report.pdf (accessed 28 November 2024).
Xiao, Yangyu & Yuying Zhi. 2023. An exploratory study of EFL learners’ use of ChatGPT for language learning tasks: Experience and perceptions. Languages 8(3). https://doi.org/10.3390/languages8030212 (accessed 28 November 2024).
Yang, Weipeng. 2022. Artificial intelligence education for young children: Why, what, and how in curriculum design and implementation. Computers and Education: Artificial Intelligence 3. https://doi.org/10.1016/j.caeai.2022.100061 (accessed 28 November 2024).
© 2025 the author(s), published by De Gruyter, Berlin/Boston
This work is licensed under the Creative Commons Attribution 4.0 International License.