Article Open Access

AI-based causal evaluation of teacher’s opening lessons: a multidimensional study

Published/Copyright: April 17, 2026

Abstract

The opening phase of a lesson can strongly influence students’ readiness to learn, yet teachers often receive limited, delayed feedback on the quality of this phase. This study examines which opening-lesson features are most strongly linked to higher-quality lesson introductions in chemistry classrooms. We analysed 200 classroom videos of Indonesian chemistry teachers. Video audio was transcribed and then scored using a structured rubric covering three features: (1) attention-building, (2) teacher–student interaction, and (3) use of instructional media. In this study, “lesson outcomes” refers to the rubric-based quality score of the opening phase (i.e., the overall opening-lesson score), not student achievement scores. Descriptively, teachers scored highest on interaction (mean = 3.44), while media use showed greater variation (mean = 3.04). Using an observational design with causal-inference analyses to reduce selection bias, we found that stronger attention-building and richer teacher–student interaction were associated with higher opening-lesson quality. Media use appeared to strengthen these benefits, partly by supporting interaction (media → interaction → overall opening score). Sensitivity checks suggested the main findings were stable under moderate unobserved bias. Overall, the results highlight that the most effective lesson introductions combine clear attention cues, meaningful interaction, and purposeful media use. Teachers with lower baseline performance showed the greatest improvement when openings included clear learning objectives, suggesting a need for targeted, personalised professional development. This AI-supported evaluation approach offers a scalable way to provide more consistent feedback on lesson introductions for teacher development.

1 Introduction

Lesson introduction (the opening phase of the lesson) is an important initial phase in teaching designed to build students’ mental and cognitive readiness, which significantly affects their engagement, motivation, and perception of the learning material. 1 , 2 This phase usually involves actions such as greeting students, cognitive warm-up activities, and clearly communicating learning objectives, which together help to focus students’ attention and set expectations from the start. 3 , 4 Pedagogically, a strong opening can predict the success of the subsequent learning process by increasing student motivation and readiness. 5 However, despite its importance, this phase is often under-analyzed in teacher evaluation systems, which largely rely on subjective observations and lack standardized, objective, and quantifiable assessment tools. 6 Such limitations highlight the need for a more rigorous and technology-based approach to evaluating lesson introductions, to address the current gap in fair and quantitative teacher assessment.

The study focuses on attention, interaction, and media as the primary causal variables because these elements are recognized as central to the initial phase of lesson delivery. Attention is essential for engaging students mentally at the start of the lesson, ensuring they are cognitively prepared to absorb new information. 7 Teacher-student interaction fosters a dynamic learning environment, encouraging participation and setting the tone for the rest of the lesson. 8 Lastly, media is a critical tool for enhancing engagement and reinforcing content through visual and auditory elements. 9 To support teacher readability, we define these as observable opening-phase practices in video: attention (e.g., a brief motivating question/phenomenon, explicit focus cue), interaction (e.g., short exchanges that elicit student responses), and media use (e.g., images, objects, short clips, slides used purposefully). While elements like prior knowledge activation are indeed important for effective learning, they were not included in this particular study model due to the scope of the evaluation, which aimed to focus on measurable features directly observable in classroom video recordings. The study seeks to provide insights into how these observable features specifically impact opening-phase lesson quality, offering actionable data for teacher development.

One of the key challenges in evaluating lesson introductions is identifying which specific elements most effectively influence the quality of the opening phase. While many studies highlight the importance of various teaching strategies, most stop at identifying correlations without examining plausible causal mechanisms. For instance, research on phenomenon-based learning (PBL) suggests it promotes higher-order thinking, 10 , 11 but these studies often lack causal validation. Similarly, while analyzing teacher-recorded videos can reveal effective practices, 12 they generally don’t use causal analysis. Existing tools like Terracotta 13 are focused on broader educational interventions, leaving the analysis of specific, teacher-actionable features within the lesson introduction less explored. This research gap is what this study seeks to fill by using automated analysis to estimate which opening-phase features are most impactful, helping teachers better understand what strategies are most likely to improve opening-lesson quality.

This research makes a significant contribution by integrating video transcripts of teaching, automated scoring using Gemini AI, and causal inference to assess lesson introductions in a systematic way. Using an observational video-based design, the approach enables estimation of the effects of attention, interaction, and media on a rubric-based opening-phase quality score (the study outcome), rather than student achievement test scores. Key findings include the analysis of heterogeneous effects across teacher groups and indirect pathways between teaching features (e.g., media supporting interaction during the opening). This approach provides a data-driven evaluation system that offers clearer insights into the effectiveness of teaching strategies in real classroom settings. 14 , 15 The study also introduces the first systematic application of this evaluation framework to real-life teaching contexts, offering a more objective and measurable way for teachers to receive actionable feedback for improving their practices. 16 Previous research has explored the use of AI in teaching video analysis, including applications of video-based analysis tools for teacher evaluation, 17 and some studies have applied causal inference in educational settings. 11 , 13 However, combining Gemini-based scoring with causal analysis focused specifically on the lesson introduction phase represents a distinct approach. This research fills an important gap in the existing literature by providing a scalable, evidence-based tool for improving educational quality and fostering more targeted teacher development. 18 , 19 The novelty of this study lies in its combination of video analysis, AI-powered scoring, and causal inference within a single framework, enabling more transparent and actionable insights into teaching practices than studies that have typically examined these components separately.

2 Methods

2.1 Research design and procedure

The research workflow implemented in this study is illustrated in Figure 1. The process began with collecting video data from 200 classroom lesson recordings (observational design), where teachers performed the lesson introduction (opening phase). These videos were then transcribed automatically using an automatic speech recognition (ASR) system, and the transcripts were verified through a structured cleaning procedure (e.g., spot-checking and manual correction on a defined subset) to reduce errors from classroom noise and overlapping speech. Once transcripts were available, an AI model (Gemini) was used to score the lesson introduction based on a standardized rubric focusing on three teacher-actionable, video-observable features: attention, initial interaction, and media use. Human raters scored a subset of videos using the same rubric to benchmark the AI output and to report agreement indices (e.g., ICC/kappa) for transparency. The resulting scores were analysed descriptively and then modelled using causal-inference methods to estimate the likely contribution of specific opening-phase features to the rubric-based opening-phase quality score (the outcome in this study). The primary aim was to evaluate how elements such as attention-building activities, the clarity of objectives, the use of media, and initial interactions influence opening-phase lesson quality, providing actionable evidence for teacher development.

Figure 1:
Research process flow.

The research methodology employs a systematic approach to evaluate teacher performance using teacher-readable, evidence-based indicators supported by AI. The process begins with automatic transcription of classroom video data and subsequent analysis through a Gemini-powered scoring procedure. This model uses a standardized rubric to assess attention, initial interaction, and media use during the lesson introduction phase, which are widely recognised as core elements shaping students’ readiness to learn. To improve reproducibility, the AI scoring procedure should be reported with model/version and access date, prompt template, and generation settings (e.g., temperature), and the prompt can be provided as Supplementary Material. To ensure reliability, the AI assessments are benchmarked against human evaluations conducted by trained raters, and interrater reliability should be reported for the human scoring and for AI–human agreement. 20 After validation, the scored data undergo analysis using descriptive statistics and causal-inference methods such as average treatment effect (ATE), heterogeneous treatment effect (HTE), and mediation analysis. For teacher interpretability, these analyses are used to answer practical questions (e.g., “Which opening practices are most likely linked to higher-quality lesson introductions, and for which teachers?”), rather than only reporting technical parameters. Visualisations of the findings provide actionable insights for educators to improve teaching quality in the lesson introduction phase. This AI-driven framework offers a scalable and objective approach for evaluating teaching practices and supporting professional development, with the study outcome defined as the rubric-based opening-phase quality score (not student test scores).
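To illustrate the kind of reproducibility reporting the paragraph recommends, the sketch below shows a configuration record for the AI scoring step. Every field value here (model name, date, temperature) is a hypothetical placeholder, not the study's actual setting:

```python
# Hypothetical scoring-configuration record for reproducibility reporting.
# All field values are illustrative placeholders, not the study's settings.
scoring_config = {
    "model": "gemini-1.5-pro",           # model name/version (placeholder)
    "access_date": "2025-01-15",         # date the API was queried (placeholder)
    "temperature": 0.0,                  # deterministic output preferred for scoring
    "rubric_indicators": ["attention", "interaction", "media"],
    "scale": (1, 4),                     # each indicator scored on a 1-4 scale
}

def describe_config(cfg: dict) -> str:
    """Render the configuration as a one-line audit string for reporting."""
    return (f"{cfg['model']} @ {cfg['access_date']}, "
            f"T={cfg['temperature']}, indicators={len(cfg['rubric_indicators'])}")

print(describe_config(scoring_config))
```

Logging such a record alongside each batch of scores would let later readers reproduce the scoring conditions, as the paragraph suggests.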

Research has shown that attention-building activities and teacher-student interactions are crucial for enhancing student engagement and learning outcomes. Strategies like interactive questioning and clear learning objectives are associated with improvements in focus and material retention, with teachers’ questions directly influencing student participation and responses. 21 When combined with proactive teacher-student interactions, these strategies promote cognitive engagement and improve social dynamics, fostering deeper understanding and critical thinking. 22 Furthermore, the integration of artificial intelligence (AI) in educational research has introduced scalable and objective methods for analyzing complex data. AI models using causal inference techniques, such as propensity score matching (PSM), allow for more accurate evaluation of teaching strategies by controlling for confounding variables. 23 These methods enable researchers to isolate the estimated effects of instructional strategies on the study outcome, under explicit assumptions. 24 The application of AI in causal inference frameworks improves the reliability and precision of educational research, ultimately supporting more effective teaching practices and interventions. 25
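The matching idea behind PSM can be sketched in a few lines: pair each treated unit with the control unit whose propensity score is closest, then average the outcome differences. This is a deliberately minimal 1:1 nearest-neighbour version with made-up numbers, not the study's actual estimator:

```python
def psm_att(treated, control):
    """1:1 nearest-neighbour matching on a given propensity score.

    treated/control: lists of (propensity, outcome) pairs. Each treated unit
    is matched (with replacement) to the control unit with the closest
    propensity; returns the average treated-minus-control difference (ATT).
    """
    diffs = []
    for p_t, y_t in treated:
        _, y_c = min(control, key=lambda c: abs(c[0] - p_t))
        diffs.append(y_t - y_c)
    return sum(diffs) / len(diffs)

# Illustrative (propensity, opening-score) pairs; all values are made up.
treated = [(0.8, 11.0), (0.6, 10.0), (0.4, 9.0)]
control = [(0.75, 9.5), (0.55, 9.0), (0.35, 8.5)]
print(psm_att(treated, control))
```

A real analysis would first estimate the propensity scores from covariates and check balance after matching; this sketch only shows the matching-and-averaging step.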

2.2 Data and instrumentation

The data instrumentation for this study establishes a structured framework to evaluate teaching performance during lesson introductions using 200 video recordings produced by participants in Indonesia’s Teacher Professional Education (PPG) program, including both in-service (PPG Dalam Jabatan) and pre-service (PPG Prajabatan) teacher candidates undergoing the professional teacher certification process. The recordings captured participants’ teaching practice sessions (e.g., microteaching and/or practicum-based teaching), and the analysis focused specifically on the lesson introduction segment. Participant background characteristics (e.g., teaching experience for in-service candidates and prior practicum exposure for pre-service candidates) and practicum/classroom context (e.g., placement setting and instructional conditions) were summarised to support representativeness and potential bias assessment.

The videos were transcribed using an automatic speech recognition (ASR) system, followed by systematic cleaning and coding using a predefined codebook to ensure data accuracy. To strengthen transparency, transcription quality was audited through manual verification of a stratified subset, with corrections applied where necessary; additionally, intercoder agreement was calculated for the cleaning/coding process to document procedural reliability. Scoring was conducted using Gemini AI based on a rubric comprising three key indicators: attention, initial interaction, and media use. Finally, in line with the program’s formative assessment cycle and participants’ consent, the scoring outputs were synthesised into individualised feedback to support improvement in subsequent teaching sessions. Attention measures how effectively a teacher captures student focus through greetings and engaging techniques, with high scores awarded for methods that are highly relevant and spark curiosity, while lower scores indicate minimal or routine efforts. 26 , 27 Initial interaction assesses the teacher’s ability to promote student participation through questioning or warm-up activities, ranging from dynamic and inclusive engagement to a complete absence of interaction. 28 , 29 Media use evaluates the relevance and creativity of visual or audio aids used during the lesson opening, with top scores given when media significantly enhances understanding and engagement, and lower scores when media is absent or ineffective. 30 , 31
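To make the rubric scale concrete, the sketch below validates the three indicator scores described above (each on a 1–4 scale) and combines them into a total opening score. The function name and structure are illustrative, not the study's actual scoring code:

```python
def total_opening_score(attention: int, interaction: int, media: int) -> int:
    """Validate three 1-4 indicator scores and return their sum (range 3-12).

    Indicator names follow the rubric: attention, initial interaction, media use.
    """
    for name, score in [("attention", attention),
                        ("interaction", interaction),
                        ("media", media)]:
        if not 1 <= score <= 4:
            raise ValueError(f"{name} score {score} is outside the 1-4 rubric scale")
    return attention + interaction + media

# Example: a teacher scoring 3, 4, 3 on the three indicators totals 10,
# near the reported mean total of 9.66 in Table 1.
print(total_opening_score(3, 4, 3))  # -> 10
```

Validating each indicator before summing guards against transcription or scoring glitches producing out-of-range values.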

The integration of Gemini AI into scoring for educational assessments enhances consistency and objectivity in evaluating teaching behaviors, enabling reliable, replicable, and large-scale analyses while reducing human bias through nuanced, rubric-based evaluations. 32 Beyond automation, AI supports teachers in lesson planning, execution, and appraisal, 33 fostering personalized learning and targeted professional development. Its effectiveness in video-based studies allows multi-dimensional assessment of pedagogical strategies, deepening insights into teaching quality. However, successful implementation depends on teacher trust, transparency, and willingness to adopt AI tools, influenced by prior experiences and perceived usability. 34 Ethical integration, as emphasized by Lee et al. and Liu, is critical to ensure AI promotes equity, avoids bias, and supports reflective teaching practices. 35 This synergy between AI and structured rubrics advances evidence-based evaluation frameworks that strengthen both teacher development and the quality of lesson introductions.

2.3 Descriptive analysis

Descriptive analysis of teacher performance during the lesson opening is summarized in the Descriptive Statistics of Opening Lessons Scores, presented in Table 1. This table highlights three main indicators: attention, interaction, and media use. The data show that most teachers achieved moderate to high scores across these indicators, with interaction showing the highest mean score of 3.44 and a median of 4, reflecting strong and consistent engagement between teachers and students. 36 In contrast, media use recorded the lowest mean score of 3.04 and the highest variability, indicating that media integration was inconsistent among teachers. 37 , 38 Negative skewness values across indicators, especially −1.36 for interaction, suggest that the majority of teachers perform well in fostering interaction during the lesson opening, supported by kurtosis values that show a high concentration of scores. 39 An interquartile range of 1 across indicators indicates moderate score stability among teachers, while the percentages of teachers scoring 3 or above (90 % for interaction, 86.5 % for attention, and 76.5 % for media) underscore general strengths in interpersonal skills but reveal opportunities for improvement in media use. 40 , 41 Importantly, the “Percent ≥ 3” column applies to indicator scores (1–4 scale) and is not directly comparable to the total score; therefore, we report the total score descriptively without using the same threshold to avoid misinterpretation. 42

Table 1:

Descriptive statistics of opening lessons scores.

Indicator Mean Median SD Skewness Kurtosis IQR Percent ≥ 3
Attention 3.18 3 0.66 −0.32 −0.34 1 86.5
Interaction 3.44 4 0.75 −1.36 1.49 1 90
Media 3.04 3 1.05 −0.85 −0.49 1 76.5
Total score 9.66 10 2.1 −0.94 0.66 3 –
  1. Percent ≥ 3 refers to the proportion of teachers scoring ≥3 on each indicator (attention/interaction/media). It is not reported for the total score, because the indicator-level threshold does not apply to the 3–12 total scale.
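The column definitions in Table 1 can be reproduced with standard-library statistics. The sketch below computes a subset of those columns (mean, median, SD, skewness, IQR, Percent ≥ 3) for one indicator; the sample data are invented for illustration, not the study's raw scores:

```python
import statistics as st

def describe(scores):
    """Summary statistics matching selected columns of Table 1."""
    n = len(scores)
    mean = st.mean(scores)
    sd = st.stdev(scores)                    # sample SD (n-1 denominator)
    z = [(x - mean) / sd for x in scores]
    # Adjusted Fisher-Pearson sample skewness.
    skew = sum(v ** 3 for v in z) * n / ((n - 1) * (n - 2))
    q = st.quantiles(scores, n=4)            # quartile cut points (exclusive method)
    return {"mean": round(mean, 2), "median": st.median(scores),
            "sd": round(sd, 2), "skewness": round(skew, 2),
            "iqr": q[2] - q[0],
            "percent_ge3": 100 * sum(x >= 3 for x in scores) / n}

# Illustrative indicator scores (1-4 scale), not the study's data:
sample = [4, 3, 3, 4, 2, 3, 4, 3, 3, 4]
print(describe(sample))
```

The negative skewness of this toy sample mirrors the table's pattern of scores clustering toward the high end of the scale.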

2.4 Causal analysis framework

This study utilised causal analysis to evaluate the impact of key instructional features – attention, initial interaction, and media use – on opening-phase quality scores using the T-Learner model and propensity score matching (PSM) to calculate average treatment effects (ATE). 43 , 44 In addition, heterogeneous treatment effects (HTE) were examined across quartiles of teacher performance to identify how different groups of teachers respond to instructional strategies. 45 , 46 Mediation analyses were conducted to investigate whether the effect of media use on opening-phase quality was mediated by teacher–student interactions, highlighting the role of engagement in improving lesson introduction practices. 47 , 48 Data processing used Gemini AI within Google Colab, while causal relationships were stringently analyzed using Causal AI, enabling an objective and measurable assessment framework to improve educational practices. To strengthen methodological transparency, we explicitly state the key assumptions (e.g., no unmeasured confounding/ignorability given observed covariates, positivity/overlap, and stable unit treatment value) and report diagnostics (e.g., covariate balance after matching, overlap checks, and sensitivity analysis) to support trust in the estimated effects.
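The T-Learner idea named above can be sketched compactly: fit one outcome model per treatment arm, then average the difference in predicted outcomes across all units. In this minimal version the "models" are per-stratum means over a single discrete covariate (illustrative only; the study's actual learners may differ), and it requires every stratum to contain both arms, which corresponds to the positivity/overlap assumption:

```python
from collections import defaultdict

def t_learner_ate(data):
    """T-Learner ATE with stratum-mean outcome models.

    data: list of (covariate_stratum, treated: bool, outcome: float).
    Fits mu_1 and mu_0 as per-stratum means, then averages mu_1(x) - mu_0(x)
    over every unit's stratum. Requires both arms in each stratum (positivity).
    """
    sums = defaultdict(lambda: [0.0, 0])     # (stratum, treated) -> [sum, count]
    for x, t, y in data:
        s = sums[(x, t)]
        s[0] += y
        s[1] += 1
    mu = {k: v[0] / v[1] for k, v in sums.items()}
    effects = [mu[(x, True)] - mu[(x, False)] for x, _, _ in data]
    return sum(effects) / len(effects)

# Illustrative rows: (baseline stratum, strong attention cue?, opening score).
data = [("low", True, 9.0), ("low", True, 10.0), ("low", False, 7.0),
        ("low", False, 8.0), ("high", True, 11.0), ("high", False, 10.0)]
print(t_learner_ate(data))
```

Because the per-unit differences vary by stratum here (2.0 in "low", 1.0 in "high"), the same machinery also yields the heterogeneous effects discussed later.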

3 Results

3.1 Sensitivity of causal estimates

The Tornado Plot (Figure 2) visualizes the sensitivity of the average treatment effect (ATE) to unmeasured confounding. It illustrates how the estimated ATE changes in response to varying levels of assumed unobserved confounder bias, represented on the vertical axis. The plot displays two sets of bars: one set for the baseline (no-bias) ATE estimate (blue bars) and the other for the bias-adjusted ATE under each assumed bias level (orange bars). This comparison helps readers judge how robust the estimated effects are to plausible violations of the “no unmeasured confounding” assumption in an observational design, rather than implying that unmeasured confounding is fully eliminated.

Figure 2:
Tornado Plot – sensitivity of ATE to unobserved confounding.

The Tornado Plot in Figure 2 illustrates the sensitivity of the average treatment effect (ATE) to varying levels of unmeasured confounding bias, ranging from 2 % to 25 %. At lower bias levels (2–5%), ATE estimates remain relatively stable, suggesting that the main conclusions would not change under small-to-moderate unmeasured confounding. 49 Around the 10 % level, the plot indicates a practical “tipping point” where the magnitude of the ATE begins to change more noticeably, signalling that stronger unmeasured confounding could meaningfully alter the interpretation. 50 At higher assumed bias (e.g., above 10 %), the ATE becomes increasingly sensitive, indicating that the causal estimates depend more strongly on the validity of the ignorability assumption and the adequacy of measured covariates. This underscores the importance of reporting sensitivity analysis in observational causal studies and considering robustness-oriented approaches (e.g., alternative estimators such as TMLE or doubly robust methods) as complementary checks. 51 Overall, Figure 2 supports the interpretation that the estimated ATE is reasonably robust to modest unobserved confounding, while also transparently acknowledging that substantial unmeasured confounding could distort the estimated effects, reinforcing the need for careful study design, covariate measurement, and diagnostic reporting.
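The tornado-style comparison described above can be generated by recomputing a bias-adjusted ATE over a grid of assumed confounding strengths. The additive adjustment below is a deliberately simplified stand-in for formal sensitivity methods (e.g., Rosenbaum bounds or E-values), used only to show the shape of the analysis; the baseline ATE and outcome range are invented:

```python
def bias_adjusted_ate(ate, bias_levels, outcome_range):
    """Worst-case downward shift of an ATE under assumed confounding.

    bias_levels: fractions (e.g., 0.02 for 2 %) of the outcome range that an
    unobserved confounder is assumed to explain. Returns (level, adjusted)
    pairs for plotting a tornado-style chart.
    """
    return [(b, ate - b * outcome_range) for b in bias_levels]

ate = 1.5                 # illustrative baseline ATE on the 3-12 total scale
rows = bias_adjusted_ate(ate, [0.02, 0.05, 0.10, 0.25], outcome_range=9)
for level, adj in rows:
    flag = "sign flip" if adj < 0 else "stable"
    print(f"bias {level:>5.0%}: adjusted ATE = {adj:+.2f} ({flag})")
```

In this toy run the estimate stays positive through the 10 % level and flips sign only at 25 %, echoing the "tipping point" reading of Figure 2.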

3.2 Multidimensional causal effects

The multidimensional causal effects analysis using 2D causal effect landscapes, 3D partial dependence plots, and a ternary plot provides a comprehensive overview of how the combined levels of attention, media, and interaction relate to opening-phase quality (rubric-based opening-lesson score). The multidimensional analysis suggests that opening-phase quality is highest when attention, interaction, and media use are all strong and reasonably balanced, rather than when one element is high in isolation. The 2D landscapes indicate strong joint patterns for attention–interaction, the partial dependence plots show that the association of media use with higher scores is strongest when attention is also high, and the ternary plot illustrates that the most favourable region occurs near balanced combinations of the three factors. Together, these visualisations provide an interpretable, teacher-facing summary of how opening-phase practices may work in combination to support higher-quality lesson introductions.

The causal effect landscapes in Figure 3a, b, and c show that estimated effects on opening-phase quality (ATE-based patterns) are highest when key opening-phase practices – attention, interaction, and media use – are jointly strong. Figure 3a shows a strong co-occurrence/synergistic pattern between attention and interaction, which emphasises the importance of engaging students cognitively and socially from the start. 52 , 53 Figure 3b shows that media use is more strongly associated with higher opening-phase quality when paired with high attention, highlighting the complementary role of media in strengthening student focus during the lesson introduction. 54 , 55 Finally, Figure 3c indicates that the interaction–media combination yields the largest estimated improvement pattern, suggesting that interactive opening strategies may be most effective when supported by relevant media. 56 , 57 Collectively, these findings underscore the value of combining attention, interaction, and media use to optimise lesson introductions, while recognising that these are estimated associations under the study’s causal-inference assumptions.

Figure 3:
Causal effect landscape. (a) Attention versus interaction, (b) attention versus media, (c) interaction versus media.

The findings from Figure 4a, b, and c further highlight the combined relationships among attention, interaction, and media use with opening-phase quality, visualised via partial dependence patterns. Figure 4a shows that the highest predicted opening-phase quality occurs when both attention and media use are high, consistent with research suggesting that media can enhance learning environments by capturing students’ attention when combined with focused engagement. 58 , 59 Figure 4b emphasises the importance of teacher–student interaction combined with attention, which aligns with evidence that such interaction promotes cognitive and social engagement during instruction. 60 , 61 Figure 4c reinforces this by indicating that higher predicted scores are achieved when media use is combined with strong teacher–student interaction, consistent with findings that multimedia can enhance motivation and lesson clarity through dynamic interaction. 58 , 62 Overall, these results suggest that balanced integration of attention, interaction, and media use is important for strengthening the quality of lesson introductions, rather than treating any single element as sufficient on its own.
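A partial dependence curve of the kind shown in Figure 4 is computed by forcing the feature of interest to each grid value in every observation and averaging the model's predictions. The sketch below uses a hypothetical stand-in prediction function with a built-in attention × media interaction (echoing the pattern described for Figure 4a); the real study would plug in its fitted outcome model:

```python
def partial_dependence(predict, rows, feature, grid):
    """Average prediction when `feature` is forced to each grid value."""
    out = []
    for g in grid:
        preds = [predict({**r, feature: g}) for r in rows]
        out.append(sum(preds) / len(preds))
    return out

# Hypothetical model: opening score rises with media use, more steeply when
# attention is already high (an invented interaction for illustration).
def predict(r):
    return 6 + 0.5 * r["media"] + 0.5 * r["media"] * (r["attention"] >= 3)

rows = [{"attention": 2, "media": 1}, {"attention": 4, "media": 3},
        {"attention": 3, "media": 2}, {"attention": 1, "media": 4}]
pd_media = partial_dependence(predict, rows, "media", grid=[1, 2, 3, 4])
print(pd_media)
```

Averaging over the observed attention values is what distinguishes partial dependence from simply plotting the model at one fixed covariate profile.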

Figure 4:
Partial dependence plot. (a) Attention versus media, (b) interaction versus attention, (c) media versus interaction.

Figure 5’s ternary plot illustrates the combined influence of attention, interaction, and media use on opening-phase quality scores (opening-lesson rubric totals). The triangular graph shows that the highest scores occur when these three factors are simultaneously high and relatively balanced, with regions near high attention and interaction corresponding to better opening-phase quality. The plot also indicates that emphasising one factor excessively (e.g., media use without adequate attention or interaction) is less likely to maximise effectiveness. This highlights the practical importance of integrating all three elements coherently when designing lesson introductions.

Figure 5:
Ternary Plot.

Figure 5 highlights the importance of balancing attention, interaction, and media use in lesson introductions, showing that stronger outcomes occur when all three are well integrated. The analysis suggests that high levels of attention and interaction are consistently linked to higher opening-phase quality, with media use playing a supportive role that can strengthen their effects when applied purposefully. In practical terms, effective openings begin by capturing student focus and encouraging meaningful engagement, while media resources are used to reinforce clarity and interest in the learning objectives. Prior research by Greve et al. 52 and Mavilidi and Vazou 56 similarly shows the strong synergy between attention and interaction in improving teaching effectiveness, while Ref. 63 emphasizes the supportive role of media in enhancing learning environments.

3.3 Heterogeneous effects by teacher performance

The analysis of heterogeneous effects by teacher performance, shown through the Raincloud and Interaction Plots, suggests that lower-performing teachers (Q1) show larger estimated gains in opening-phase quality compared to higher-performing teachers (Q4). This aligns with research indicating that targeted support – such as structured guidance and clear learning goals – can be particularly beneficial for less experienced or less confident educators. 64 , 65 Lower-performing teachers, often lacking a strong pedagogical foundation, are more receptive to professional development initiatives that offer clear expectations. Studies also highlight that a supportive school climate, strong principal-teacher relationships, and a conducive educational environment significantly enhance teacher performance. 66 Overall, these patterns support the practical value of differentiated professional development that is matched to teachers’ current performance level, with the outcome here defined as the rubric-based opening-phase quality score (not student test achievement).

The Raincloud Plot in Figure 6 illustrates how the average treatment effect (ATE) varies across teacher performance quartiles, from lowest (Q1) to highest (Q4). It shows that lower-performing teachers (Q1) experience a wider range of estimated ATE values, with many benefiting more strongly from improved lesson introduction practices. In contrast, higher-performing teachers (Q4) show less variation and more consistent but smaller estimated gains, suggesting a potential ceiling effect because many already implement effective opening routines.

Figure 6:
Raincloud Plot.

The Raincloud Plot suggests that teachers in the lower quartiles (Q1 and Q2) experience greater positive effects from changes in lesson introduction practices, indicating they may benefit more from improvements in opening-phase techniques. Meanwhile, teachers in higher quartiles (Q3 and Q4) show more stable but smaller changes, suggesting that their baseline practices are already relatively strong. This pattern supports targeted support for Q1–Q2 teachers, while Q3–Q4 teachers may benefit more from refinement-focused feedback (e.g., higher-level questioning, efficiency, and consistency).
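The quartile comparison behind Figure 6 can be sketched as grouping per-teacher effect estimates by baseline performance quartile and summarising each group. All numbers below are invented to illustrate the reported pattern (larger estimated gains at lower baselines), not the study's data:

```python
import statistics as st

def hte_by_quartile(baseline, effects):
    """Mean estimated effect per baseline quartile (Q1 lowest .. Q4 highest).

    baseline: per-teacher baseline score; effects: matching per-teacher
    effect estimate (e.g., from a T-Learner).
    """
    cut = st.quantiles(baseline, n=4)        # three quartile cut points
    groups = {q: [] for q in ("Q1", "Q2", "Q3", "Q4")}
    for b, e in zip(baseline, effects):
        q = ("Q1" if b <= cut[0] else
             "Q2" if b <= cut[1] else
             "Q3" if b <= cut[2] else "Q4")
        groups[q].append(e)
    return {q: round(st.mean(v), 2) for q, v in groups.items() if v}

# Illustrative pattern: lower-baseline teachers show larger estimated gains.
baseline = [4, 5, 6, 7, 8, 9, 10, 11]
effects  = [2.0, 1.8, 1.5, 1.2, 0.9, 0.7, 0.5, 0.4]
print(hte_by_quartile(baseline, effects))
```

The monotone decline from Q1 to Q4 in this toy output mirrors the ceiling effect described for higher-performing teachers.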

The Interaction Plot shows how learning objective clarity affects teachers differently across performance groups. Lower-performing teachers (Q1) show larger improvements in their opening-phase quality scores as learning objectives become clearer, indicating that explicit learning goals may be a high-leverage practice for teachers who are still developing opening routines. Higher-performing teachers (Q4) show smaller changes, likely because they already communicate objectives effectively and therefore gain less from additional increases in clarity.

The plot in Figure 7 shows that teachers in the middle quartiles (Q2 and Q3) experience moderate improvements from clearer learning objectives, indicating that the impact varies by performance level. While learning objectives appear beneficial across groups, their estimated effect is strongest for lower-performing teachers (Q1), highlighting the value of targeted support for teachers who need it most. The findings from the heterogeneous effects analysis further indicate that Q1 teachers show the most substantial improvement when learning objectives are clearly communicated, suggesting that objective clarity is a practical, teacher-actionable focus for professional development. 67 In contrast, higher-performing teachers (Q3 and Q4) show little change, suggesting they already effectively plan and deliver lessons. 68 This underscores the importance of targeted professional development focusing on clear learning goals for lower-performing teachers. Studies confirm that academic supervision and goal clarity are vital for improving teaching quality, especially for those at lower performance levels. 69 , 70 Accordingly, interventions that strengthen objective clarity for Q1 teachers are likely to improve the quality of lesson introductions (the outcome in this study), while broader claims about student learning should be treated as theoretical implications rather than directly measured effects in this dataset.

Figure 7:
Interaction Plot.

4 Discussion

4.1 Evaluating the causal impact of key features

Evaluation of the estimated causal impact of key features in the lesson introduction highlighted that capturing students’ attention had the strongest estimated effect on opening-phase quality scores (rubric-based opening-lesson score), reflected by the highest average treatment effect (ATE). This supports educational theories emphasising that clarity of purpose at the beginning of a lesson can improve students’ mental structuring of the material, promoting engagement and better understanding. 71 , 72 Moreover, the combined pattern of interaction and media use suggests that interaction can be strengthened when supported by appropriate media, as illustrated in the ternary plot (Figure 5), indicating that media may function as a supportive resource that amplifies attention and interaction during the lesson introduction. 73 The robustness checks in the sensitivity analysis (Figure 2) suggest that the main conclusions are reasonably stable under modest levels of unobserved bias, while also highlighting that stronger unmeasured confounding could change effect magnitudes – an expected limitation in observational causal inference. 74 Collectively, these findings underscore the potential of AI-supported evaluation systems to move beyond superficial performance impressions by providing transparent, rubric-based evidence about which opening-phase practices are most strongly linked to higher-quality lesson introductions. 75 , 76
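The indirect pathway mentioned above (media → interaction → overall opening score) can be sketched with the classic product-of-coefficients approach: estimate path a (mediator on treatment), path b (outcome on mediator with the treatment partialled out), and take a × b as the indirect effect. The least-squares helpers below are illustrative stand-ins for the study's mediation estimator, and the data are noise-free invented values chosen so the paths are exact:

```python
def slope(x, y):
    """Least-squares slope of y on x (with intercept)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return num / sum((a - mx) ** 2 for a in x)

def resid(x, y):
    """Residuals of y after regressing y on x (with intercept)."""
    s = slope(x, y)
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return [yi - (my + s * (xi - mx)) for xi, yi in zip(x, y)]

def indirect_effect(media, interaction, score):
    """Product of path a (media -> interaction) and path b
    (interaction -> score with media partialled out, Frisch-Waugh)."""
    a = slope(media, interaction)
    b = slope(resid(media, interaction), resid(media, score))
    return a * b

# Invented data: interaction rises with media (a = 0.5), and
# score = 2*interaction + media exactly (b = 2), so indirect effect = 1.0.
media       = [1, 1, 2, 2, 3, 3, 4, 4]
interaction = [1.5, 2.5, 2.0, 3.0, 2.5, 3.5, 3.0, 4.0]
score       = [2 * i + m for i, m in zip(interaction, media)]
print(indirect_effect(media, interaction, score))  # -> 1.0
```

Partialling the treatment out of both mediator and outcome before estimating path b is what keeps the direct media → score path from contaminating the mediated component.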

4.2 Heterogeneity and personalization in teacher evaluation

This analysis examines how estimated effects vary across teacher groups using heterogeneous treatment effects (HTE). Figure 6 suggests that teachers with lower baseline scores experience the greatest improvement in opening-phase quality when learning objectives are communicated more clearly, highlighting the potential value of personalised evaluation and training systems tailored to teachers’ starting points. This approach challenges a “one-size-fits-all” model and supports coaching practices focused on high-leverage, teacher-actionable moves in the lesson introduction. The interaction plot in Figure 7 illustrates how lesson introduction strategies can be optimised based on teacher profiles, with objective clarity appearing particularly beneficial for lower-performing teachers. These findings emphasise the importance of incorporating HTE analysis into teacher quality improvement policies, as it helps identify who benefits most from particular supports. 77 It also aligns with trends toward individualised professional development and feedback-oriented coaching, 78 , 79 , 80 , 81 supporting effective and equitable improvements in teaching practice. 82
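The quartile-based heterogeneity pattern can be sketched as follows (synthetic data with a built-in effect that shrinks as baseline quality rises, mimicking the Q1-benefits-most pattern; this is an assumption-laden illustration, not the study's estimator).

```python
import random

random.seed(1)

# Synthetic illustration only: the benefit of clearly stated learning
# objectives is constructed to shrink as baseline opening quality rises.
n = 4000
baseline = [random.random() for _ in range(n)]       # baseline quality in [0, 1]
treated = [random.randint(0, 1) for _ in range(n)]   # randomized here for simplicity
score = [2.0 + b + 0.8 * (1 - b) * t + random.gauss(0, 0.1)
         for b, t in zip(baseline, treated)]

def quartile_effects(y, t, b):
    """Treated-minus-control mean score within each baseline quartile."""
    order = sorted(range(len(b)), key=lambda i: b[i])
    size = len(order) // 4
    effects = []
    for q in range(4):
        idx = order[q * size:(q + 1) * size]
        yt = [y[i] for i in idx if t[i] == 1]
        yc = [y[i] for i in idx if t[i] == 0]
        effects.append(sum(yt) / len(yt) - sum(yc) / len(yc))
    return effects

q1, q2, q3, q4 = quartile_effects(score, treated, baseline)
print([round(e, 2) for e in (q1, q2, q3, q4)])  # largest gain in Q1, smallest in Q4
```

Grouping by baseline quartile before comparing treated and untreated cases is what distinguishes an HTE summary from a single pooled ATE: the pooled estimate would average away exactly the Q1 signal the text highlights.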

4.3 Strengths, weaknesses, and limitations of the multidimensional approach

One of the major strengths of the multidimensional approach used in this study is its ability to assess the integrated relationships among attention, teacher–student interaction, and media use in the lesson introduction phase. This approach supports a more nuanced interpretation of how these factors may work together to strengthen opening-phase quality, rather than treating each practice in isolation. The use of multiple visual representations, including the ternary plot (Figure 5), causal landscapes (Figure 3), and partial dependence plots (Figure 4a–c), provides complementary perspectives by highlighting both individual patterns and combined effects, offering richer interpretation than a single summary estimate. For teacher readers, these visualisations can function as practical “maps” that show which combinations of opening practices are most consistently associated with stronger lesson introductions. 69 , 70

However, one limitation of this approach is the reliance on classroom video data and AI-supported scoring, which may not fully capture the complexity of in-class dynamics. Although human raters were involved to validate the AI scoring, some subtleties of teacher–student interaction (e.g., tone, classroom climate, and non-verbal cues) may be partially missed by automated pipelines. 78 , 79 , 80 , 81 Moreover, because the study is observational, the multidimensional analysis may not account for all potential confounders (e.g., class composition, school policies, topic difficulty, or prior teacher training), which can limit generalisability and requires careful interpretation of causal claims. 75 , 76 Despite these limitations, the approach provides a structured framework for evaluating lesson introductions by combining rubric-based scoring, causal-inference modelling, and interpretable visual summaries. Importantly, the study outcome is the rubric-based opening-phase quality score; therefore, references to “student learning outcomes” should be treated as theoretical implications supported by prior literature rather than as directly measured effects in this dataset. This work provides a foundation for future research that strengthens external validity (e.g., broader sampling and context variables), improves reproducibility (e.g., published prompts and scoring protocols), and combines AI-supported evaluation with teacher coaching cycles to support sustainable professional growth.
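The unmeasured-confounder concern raised above can be illustrated with a small simulation (an illustrative probe under stated assumptions, not the paper's sensitivity method): as the strength of an unobserved confounder U grows, an unadjusted treated-versus-control gap drifts away from the true effect, here fixed at 0.5.

```python
import random

random.seed(2)

# Illustrative probe only: regenerate synthetic data with an unmeasured
# confounder U of increasing strength and watch an unadjusted estimate drift.
def naive_effect(gamma, n=5000):
    """Unadjusted treated-vs-control gap; the true treatment effect is 0.5."""
    u = [random.random() for _ in range(n)]                        # unmeasured confounder
    t = [1 if random.random() < 0.3 + 0.4 * ui else 0 for ui in u]  # U raises treatment odds
    y = [2.0 + 0.5 * ti + gamma * ui + random.gauss(0, 0.1)         # U also raises the score
         for ti, ui in zip(t, u)]
    yt = [yi for yi, ti in zip(y, t) if ti == 1]
    yc = [yi for yi, ti in zip(y, t) if ti == 0]
    return sum(yt) / len(yt) - sum(yc) / len(yc)

for gamma in (0.0, 0.5, 1.0):
    print(f"confounder strength {gamma}: naive estimate {naive_effect(gamma):.2f}")
```

Reporting how large such a confounder would have to be before a conclusion flips is the basic idea behind the robustness statements made in Section 4.1.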

5 Conclusions

The key features of the lesson opening, particularly the clear delivery of learning objectives and activities that build students’ attention, showed the strongest estimated causal impact on the quality of the lesson opening. The combination of teacher–student interaction with the use of interactive media appears to strengthen opening-phase quality synergistically, as shown in the ternary plot. Sensitivity analyses indicate that the ATE estimates are reasonably robust to moderate hidden confounding, supporting the reliability of the main findings within this observational design. In addition, heterogeneous effects analysis shows that teachers with low baseline scores benefit most from improved clarity of learning objectives, opening opportunities for personalised evaluation and training systems. This approach shifts the paradigm from uniform evaluation to individualised professional development, supported by continuous feedback mechanisms and impact-based coaching. Overall, the integration of key opening-lesson features with AI-based evaluation offers an effective framework for improving lesson introductions and teaching quality in an equitable and sustainable manner.


Corresponding author: Harjono, Chemistry Education, Faculty of Mathematics and Natural Sciences, Universitas Negeri Semarang, Semarang, Indonesia, E-mail:

Funding source: Ministry of Higher Education, Science, and Technology (KEMENDIKTISAINTEK)

Award Identifier / Grant number: 089/C3/DT.05.00/PL/2025

Acknowledgments

This work was supported by the Fundamental Research Program of the Ministry of Higher Education, Science, and Technology (KEMENDIKTISAINTEK), Indonesia, under Grant No. 089/C3/DT.05.00/PL/2025 and Research Assignment Letter No. 69.2.6/UN37/PPK.11/2025.

  1. Research ethics: This study was conducted in compliance with all applicable research ethics guidelines for educational research, ensuring integrity and ethical treatment of participants and data.

  2. Informed consent: Informed consent was obtained from all participants involved in the study.

  3. Author contributions: H. (Harjono) served as the first and corresponding author, responsible for conceptualizing the study, designing the methodology, conducting the data analysis, and drafting the manuscript. N.U.A. (Nor Unsa Akmal) contributed by providing critical revisions, methodological guidance, and supervision of the research process. D.G.R. (Dimas Gilang Ramadhani), S.S.S. (Sri Susilogati Sumarti), and W.S. (Woro Sumarni) contributed as team members supporting data collection, literature review, and manuscript editing.

  4. Use of Large Language Models, AI and Machine Learning Tools: AI language models were used exclusively for improving the manuscript’s language and clarity.

  5. Conflict of interest: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

  6. Research funding: Fundamental Research Program of the Ministry of Higher Education, Science, and Technology (KEMENDIKTISAINTEK), Indonesia, under Grant No. 089/C3/DT.05.00/PL/2025 and Research Assignment Letter No. 69.2.6/UN37/PPK.11/2025.

  7. Data availability: Data supporting the findings and conclusions are available upon request from the corresponding author.

References

1. Andriani, R.; Rasto, R. Motivasi Belajar Sebagai Determinan Hasil Belajar Siswa. J. Pendidik. Manaj. Perkantoran 2019, 4 (1), 80. https://doi.org/10.17509/jpm.v4i1.14958.

2. Hamid, A.; Utami, R. T.; Vernanda, G. Upaya Guru Seni Budaya Dalam Meningkatkan Motivasi Belajar Dan Prestasi Siswa Tunanetra Di SLB a BINA INSANI Bandar Lampung. Jiip – J. Ilm. Ilmu Pendidik 2024, 7 (4), 3776–3784. https://doi.org/10.54371/jiip.v7i4.4117.

3. Ramadhan, I.; Sulistyarini, S.; Afandi, A.; Firmansyah, H.; Wiyono, H.; Wahyudi, A.; Zalianty, A. Pemerataan Pendidikan Kawasan Perbatasan (Implementasi Kurikulum Merdeka Bagi Guru Di Perbatasan Indonesia-Malaysia). Reswara J. Pengabdi. Kpd. Masy. 2025, 6 (1), 474–481. https://doi.org/10.46576/rjpkm.v6i1.5297.

4. Sinaga, Y.; Mustika, D. Persepsi Guru Kelas Rendah Terhadap Tahap Pelaksanaan Pembelajaran Tematik Di Sekolah Dasar. Aulad J. Early Child. 2023, 6 (2), 197–204. https://doi.org/10.31004/aulad.v6i2.496.

5. Rosari, M. D. Pelatihan English for Specific Purpose Untuk Siswa Teknik Komputer Dan Jaringan Di Kabupaten Tangerang. Devosi 2025, 6 (1), 41–56. https://doi.org/10.33558/devosi.v6i1.10738.

6. Lestari, Y. A.; Syamsuddin, M. B.; Usman, M. Hubungan Motivasi Belajar Dengan Prestasi Belajar Bahasa Arab Siswa Kelas XI SMA IT Al Fatih Makassar. Af 2022, 2 (2), 151. https://doi.org/10.59562/al-fashahah.v2i2.40436.

7. Lai, P. K.; Stroud, F.; Paladino, Á.; Kalamir, N. Investigating Learning Designers’ Perceptions of Student Cognitive Engagement in Online Learning; Ascilite Publ: Christchurch, 2023. https://doi.org/10.14742/apubs.2023.499.

8. Wood, K. The Path of Teachers’ Learning through Lesson and Learning Studies. Int. J. Lesson Learn. Stud. 2020, 9 (2), 93–99. https://doi.org/10.1108/ijlls-12-2019-0083.

9. Kamardeen, I.; Samaratunga, M. DigiExplanation Driven Assignments for Personalising Learning in Construction Education. Constr. Econ. Build. 2020, 20 (3). https://doi.org/10.5130/ajceb.v20i3.7000.

10. Asahid, R. L.; Lomibao, L. S. Embedding Proof-Writing in Phenomenon-Based Learning to Promote Students’ Mathematical Creativity. Am. J. Educ. Res. 2020, 8 (9), 676–684. https://doi.org/10.12691/education-8-9-9.

11. Scharlott, L. J.; Rippey, D. W.; Rosa, V.; Becker, N. Progression toward Causal Mechanistic Reasoning through Phenomenon-Based Learning in Introductory Chemistry. J. Chem. Educ. 2024, 101 (3), 777–788. https://doi.org/10.1021/acs.jchemed.3c00517.

12. Zubainur, C. M.; Johar, R.; Hayati, R.; Ikhsan, M. Teachers’ Understanding about the Characteristics of Realistic Mathematics Education. J. Educ. Learn. 2020, 14 (3), 456–462. https://doi.org/10.11591/edulearn.v14i3.8458.

13. Motz, B.; Üner, Ö.; Jankowski, H.; Christie, M.; Burgas, K.; Orobitg, D. d. B.; McDaniel, M. A. Terracotta: A Tool for Conducting Experimental Research on Student Learning. Behav. Res. Methods 2023, 56 (3), 2519–2536. https://doi.org/10.3758/s13428-023-02164-8.

14. Hernadi, D.; Mulia, W. R.; Kusmana, S.; Gloriani, Y. Development of Multiliteracy-Gemini AI Module to Improve Education Guru Penggerak. J. Learn. Dev. Stud. 2024, 4 (3), 76–83. https://doi.org/10.32996/jlds.2024.4.3.10.

15. Zubair, L.; Mini, D. A. M.; Kurnia, Z. A.; Bashith, A. Strategi Inovatif Dalam Pengembangan Evaluasi Pembelajaran Pendidikan Agama Islam Untuk Meningkatkan Kualitas Pendidikan. J. Pendidik. Indones. 2024, 5 (11), 1217–1227. https://doi.org/10.59141/japendi.v5i11.5911.

16. Mahardika, A. I.; Saputra, N. A. B.; Muda, A. A. A.; Riduan, A.; Luzuardi, N. S.; Nurmalinda, N. Pelatihan Pengembangan Evaluasi Pembelajaran Digital Menggunakan Quizizz Bagi Guru Di Kota Banjarmasin. J. Abdimas Prakasa Dakara 2023, 3 (1), 1–9. https://doi.org/10.37640/japd.v3i1.1540.

17. Williams, R.; Ali, S.; Devasia, N.; DiPaola, D.; Hong, J.; Kaputsos, S. P.; Jordan, B. T.; Breazeal, C. AI + Ethics Curricula for Middle School Youth: Lessons Learned from Three Project-Based Curricula. Int. J. Artif. Intell. Educ. 2022, 33 (2), 325–383. https://doi.org/10.1007/s40593-022-00298-y.

18. Giam, N. M.; Nguyen, N.; Giang, N. T. H. Situation and Proposals for Implementing Artificial Intelligence-Based Instructional Technology in Vietnamese Secondary Schools. Int. J. Emerg. Technol. Learn. 2022, 17 (18), 53–75. https://doi.org/10.3991/ijet.v17i18.31503.

19. Lyanda, J. N.; Owidi, S.; Simiyu, A. M. Rethinking Higher Education Teaching and Assessment In-Line with AI Innovations: A Systematic Review and Meta-Analysis. African J. Empir. Res. 2024, 5 (3), 325–335. https://doi.org/10.51867/ajernet.5.3.30.

20. Alshuraiaan, A. Exploring the Relationship between Teacher-Student Interaction Patterns and Language Learning Outcomes in TESOL Classrooms. J. English Lang. Teach. Appl. Linguist. 2023, 5 (3), 25–34. https://doi.org/10.32996/jeltal.2023.5.3.3.

21. Setyo, F. D. P. S.; Suprihadi, S.; Kartika, F. K. Teacher’s Questions of Thinking Skills in an English as Foreign Language Classroom. Prominent 2022, 5 (1), 13–21. https://doi.org/10.24176/pro.v5i1.6636.

22. Chao, F.; Wang, W.; Yu, G. Causal Inference in the Age of Big Data: Blind Faith in Data and Technology. Kybernetes 2023, 53 (12), 5740–5748. https://doi.org/10.1108/k-06-2023-1026.

23. Mayer, I.; Moyer, J.-D.; Dreyfus, A.; Mathieu, B.; Cungi, P.-J.; Foucrier, A.; Harrois, A.; James, A.; Nadal, J.-P.; Josse, J.; Gauss, T. Machine Learning Augmented Causal Inference To Estimate the Treatment Effect of Tranexamic Acid in Traumatic Brain Injury. Research Square 2021, preprint; https://doi.org/10.21203/rs.3.rs-600886/v1.

24. Sözeri̇, M. C.; Kert, S. B. Ineffectiveness of Online Interactive Video Content Developed for Programming Education. Int. J. Comput. Sci. Eng. Syst. 2021, 4 (3), 49–69. https://doi.org/10.21585/ijcses.v4i3.99.

25. Moser, A.; Puhan, M. A.; Zwahlen, M. The Role of Causal Inference in Health Services Research II: A Framework for Causal Inference. Int. J. Publ. Health 2020, 65 (3), 367–370. https://doi.org/10.1007/s00038-020-01334-1.

26. Ali, M. K.; Ali, A. M.; Hasanah, A. Efektivitas Fitur ChatGPT, Gemini Dan Claude AI Dalam Membantu Guru Membuat Bahan Ajar. Ijset 2024, 4 (1), 58–71. https://doi.org/10.54373/ijset.v4i1.1649.

27. Liu, Q.; Yang, X.; Chen, Z.; Zhang, W. Using Synchronized Eye Movements to Assess Attentional Engagement. Psychol. Res. 2023, 87 (7), 2039–2047. https://doi.org/10.1007/s00426-023-01791-2.

28. Droissart, J.; Tuytens, M. The Integration of Lecturer Collaboration within Higher Education Institutions’ Quality Culture Framework. Qual. Assur. Educ. 2024, 32 (3), 356–370. https://doi.org/10.1108/qae-09-2023-0157.

29. Engin, C. D.; Karatas, E.; Öztürk, T. Exploring the Role of ChatGPT-4, BingAI, and Gemini as Virtual Consultants to Educate Families about Retinopathy of Prematurity. Children 2024, 11 (6), 750. https://doi.org/10.3390/children11060750.

30. Duran, S.; González, A.; Nguyen, K. A.; Nguyen, J.; Zinn, Z. Enhancing Spanish Patient Education Materials: Comparing the Readability of Artificial Intelligence‐Generated Spanish Patient Education Materials to the Society of Pediatric Dermatology Spanish Patient Brochures. Pediatr. Dermatol. 2024, 42 (1), 106–108. https://doi.org/10.1111/pde.15805.

31. Tariq, D.; Madhusudan, R.; Guntupalli, Y.; Sai, S.; Vejandla, B.; Lnu, M. A Cross-sectional Study Comparing Patient Information Guides for Amyotrophic Lateral Sclerosis, Myasthenia Gravis, and Guillain-Barré syndrome Produced by ChatGPT-4 and Google Gemini 1.5. Cureus 2025, 17 (2), 1–9; https://doi.org/10.7759/cureus.79646.

32. Akavova, A.; Temirkhanova, Z.; Lorsanova, Z. M. Adaptive Learning and Artificial Intelligence in the Educational Space. E3s Web Conf. 2023, 451, 6011. https://doi.org/10.1051/e3sconf/202345106011.

33. Dai, Y.; Chai, C. S.; Lin, P.-Y.; Jong, M. S.; Guo, Y.; Jian-jun, Q. Promoting Students’ Well-Being by Developing Their Readiness for the Artificial Intelligence Age. Sustainability 2020, 12 (16), 6597. https://doi.org/10.3390/su12166597.

34. Nazaretsky, T.; Cukurova, M.; Alexandron, G. An instrument for measuring teachers’ trust in AI-based educational technology. In LAK22: 12th International Learning Analytics and Knowledge Conference; Association for Computing Machinery: New York, NY, USA, 2022; pp. 56–66. https://doi.org/10.1145/3506860.3506866.

35. Liu, L. Survey and Analysis of Primary School Teachers’ Use of Generative Artificial Intelligence. Lect. Notes Educ. Psychol. Public Media 2024, 74 (1), 43–52. https://doi.org/10.54254/2753-7048/2024.bo17693.

36. Aksoy, M.; Ceylan, T. An Action Research on Improving Classroom Communication and Interaction in Social Studies Teaching. Educ. Res. Int. 2021, 2021, 1–19. https://doi.org/10.1155/2021/9943194.

37. Kara, S. An Investigation of Visual Arts Teachers’ Attitudes towards Distance Education in the Time of COVID-19. Int. J. Soc. Educ. Sci. 2021, 3 (3), 576–588. https://doi.org/10.46328/ijonses.246.

38. Jaggars, S. S. Introduction to the Special Issue on the COVID-19 Emergency Transition to Remote Learning. Online Learn. 2021, 25 (1). https://doi.org/10.24059/olj.v25i1.2692.

39. Wati, I.; Dayal, H. C. Exploring Possibilities and Challenges of Lesson Study: A Case Study in a Small Island Developing State. Waikato J. Educ. 2022, 27 (3), 73–88. https://doi.org/10.15663/wje.v27i3.812.

40. Abdulbakioglu, M.; Kolushpayeva, A.; Balta, N.; Japashov, N.; Bae, C. L. Open Lesson as a Means of Teachers’ Learning. Educ. Sci. 2022, 12 (10), 692. https://doi.org/10.3390/educsci12100692.

41. Stollman, S.; Meirink, J.; Westenberg, M.; Driel, J. v. Teachers’ Interactive Cognitions of Differentiated Instruction: An Exploration in Regular and Talent Development Lessons. J. Educ. Gift. 2021, 44 (2), 201–222. https://doi.org/10.1177/01623532211001440.

42. Fitrianingsih, W. S.; Lestari, Y. B. Teachers’ adaptation to post-Covid-19 English language teaching and learning situation. In Proceedings of the 3rd Annual Conference of Education and Social Sciences; Springer Nature: Paris, France, Vol. 686, 2023; pp. 67–72. https://doi.org/10.2991/978-2-494069-21-3_9.

43. Bouvier, F.; Chaimani, A.; Peyrot, E.; Gueyffier, F.; Grenet, G.; Porcher, R. Estimating Individualized Treatment Effects Using an Individual Participant Data Meta-Analysis. BMC Med. Res. Methodol. 2024, 24 (1). https://doi.org/10.1186/s12874-024-02202-9.

44. Whitaker, R. C.; Dearth‐Wesley, T.; Herman, A. N.; Benz, T. L.; Saint-Hilaire, S. A.; Strup, D. D. The Association between Teacher Connection and Flourishing Among Early Adolescents in 25 Countries. J. Early Adolesc. 2023, 44 (5), 600–623. https://doi.org/10.1177/02724316231190828.

45. Yao, D.; Tang, C.; Cui, Q.; Li, L. Combining incomplete observational and randomized data for heterogeneous treatment effects. In Proceedings of the 33rd ACM International Conference on Information and Knowledge Management; Association for Computing Machinery: New York, NY, USA, 2024; pp. 2961–2970. https://doi.org/10.1145/3627673.3679593.

46. Yu, M.; Lu, W.; Song, R. A New Framework for Online Testing of Heterogeneous Treatment Effect. Proc. AAAI Conf. Artif. Intell. 2020, 34 (06), 10310–10317. https://doi.org/10.1609/aaai.v34i06.6594.

47. Caron, A.; Baio, G.; Manolopoulou, I. Estimating Individual Treatment Effects Using Non-parametric Regression Models: A Review. J. R. Stat. Soc. Ser. A (Statistics Soc.) 2022, 185 (3), 1115–1149. https://doi.org/10.1111/rssa.12824.

48. Dao, P.; Nguyen, M. X. N. C.; Chi, D. N. Reflective Learning Practice for Promoting Adolescent EFL Learners’ Attention to Form. Innovat. Lang. Learn. Teach. 2020, 15 (3), 247–262. https://doi.org/10.1080/17501229.2020.1766467.

49. Barberio, J.; Ahern, T. P.; MacLehose, R. F.; Collin, L. J.; Cronin‐Fenton, D.; Damkier, P.; Sørensen, H. T.; Lash, T. L. Assessing Techniques for Quantifying the Impact of Bias due to an Unmeasured Confounder: An Applied Example. Clin. Epidemiol. 2021, 13, 627–635. https://doi.org/10.2147/clep.s313613.

50. Nab, L.; Groenwold, R. H. H.; Smeden, M. v.; Keogh, R. H. Quantitative Bias Analysis for a Misclassified Confounder. Epidemiology 2020, 31 (6), 796–805. https://doi.org/10.1097/ede.0000000000001239.

51. Rostami, M.; Saarela, O. Targeted L1-Regularization and Joint Modeling of Neural Networks for Causal Inference. Entropy 2022, 24 (9), 1290. https://doi.org/10.3390/e24091290.

52. Greve, S.; Thumel, M.; Jastrow, F.; Krieger, C.; Schwedler, A.; Süßenbach, J. The Use of Digital Media in Primary School PE – Student Perspectives on Product-Oriented Ways of Lesson Staging. Phys. Educ. Sport Pedagog. 2020, 27 (1), 43–58. https://doi.org/10.1080/17408989.2020.1849597.

53. Li, K. Q.; Shi, X.; Miao, W.; Tchetgen Tchetgen, E. T. Doubly Robust Proximal Causal Inference under Confounded Outcome-Dependent Sampling. arXiv 2022, arXiv:2208.01237; https://doi.org/10.48550/arXiv.2208.01237.

54. Lackmann, S.; Léger, P.; Charland, P.; Aubé, C.; Talbot, J. The Influence of Video Format on Engagement and Performance in Online Learning. Brain Sci. 2021, 11 (2), 128. https://doi.org/10.3390/brainsci11020128.

55. Yulianti, Y.; Safitri, I. N.; Ladamay, I. Interactive Learning Media Based on “Scientific” Assisted by Android Studio for Elementary School Students. Kne Soc. Sci. 2023. https://doi.org/10.18502/kss.v8i8.13304.

56. Mavilidi, M. F.; Vazou, S. Classroom‐based Physical Activity and Math Performance: Integrated Physical Activity or Not? Acta Paediatr. 2021, 110 (7), 2149–2156. https://doi.org/10.1111/apa.15860.

57. Richit, A.; Ponte, J. P. d.; Tomasi, A. P. Aspects of Professional Collaboration in a Lesson Study. Int. Electron. J. Math. Educ. 2021, 16 (2), em0637. https://doi.org/10.29333/iejme/10904.

58. Permatasari, R.; Suarman, S.; Gimin, G. Examining the Impact of Using Learning Media on Students’ Learning Motivation and Learning Outcomes. Int. J. Educ. Best Pract. 2024, 8 (1), 88. https://doi.org/10.31258/ijebp.v8n1.p88-102.

59. Khan, S.; Siraj, D.; Ilyas, Z. Effect of Lesson Planning on Academic Performance: Evidence from the Elementary Level Classroom. Pakistan Soc. Sci. Rev. 2024, 8 (I). https://doi.org/10.35484/pssr.2024(8-i)15.

60. Ifenthaler, D.; Yau, J. Y. Utilising Learning Analytics to Support Study Success in Higher Education: A Systematic Review. Educ. Technol. Res. Dev. 2020, 68 (4), 1961–1990. https://doi.org/10.1007/s11423-020-09788-z.

61. Xu, S.; Esperanza, M. The Effectiveness of Case Method in Teaching Calculus Using Lesson Study Model. Int. J. Multidiscip. Res. 2024, 6 (5). https://doi.org/10.36948/ijfmr.2024.v06i05.29468.

62. Nurjanah, N.; Latif, B.; Yuliardi, R.; Tamur, M. Computer-Assisted Learning Using the Cabri 3D for Improving Spatial Ability and Self-regulated Learning. Heliyon 2020, 6 (11), e05536. https://doi.org/10.1016/j.heliyon.2020.e05536.

63. Richit, A.; Tomkelski, M. L. Secondary School Mathematics Teachers’ Professional Learning in a Lesson Study. Acta Sci. 2020, 22 (3), 2–27. https://doi.org/10.17648/acta.scientiae.5067.

64. Kanya, N.; Fathoni, A. B.; Ramdani, Z. Factors Affecting Teacher Performance. Int. J. Eval. Res. Educ. 2021, 10 (4), 1462. https://doi.org/10.11591/ijere.v10i4.21693.

65. Suyitno, S. Developing a Teacher Performance Model: The Impact of Principal Support on Teacher Performance by Mediating Organizational Commitment. Teacher Compet. Teacher Attitudes 2022, 6 (8), 876–900; https://doi.org/10.35542/osf.io/6gz3x.

66. Susanti, T.; Abidin, Y. Factors Affecting Elementary School Teacher Performance: A SEM-PLS Review. J. Ilm. Sekol. Dasar 2024, 7 (4), 658–667. https://doi.org/10.23887/jisd.v7i4.60626.

67. Putra, H. E. J.; Warsim, W.; Titirloloby, P. The Effect of Teacher Competency on Performance Appraisal. Akademika 2021, 10 (01), 235–247. https://doi.org/10.34005/akademika.v10i01.1148.

68. Sejati, F. The Effect of Burnout, Emotional Intelligence and Extrovert Personality Types on Teacher Performance in Senior High School 13 Padang, Indonesia. World J. Adv. Res. Rev. 2023, 18 (3), 1112–1122. https://doi.org/10.30574/wjarr.2023.18.3.1137.

69. Dewi, R.; Singh, P. The Effect of Academic Supervision and Teacher Professional Competence on Teacher Performance. PIJED 2022, 1 (1), 122–131. https://doi.org/10.59175/pijed.v1i1.9.

70. Lastri, S.; Sudarno, S.; Sudrajat, A. The Effect of Academic Supervision and Organizational Culture on Teacher Performance. J. Pajar (Pendidikan Dan Pengajaran) 2023, 7 (5), 1027. https://doi.org/10.33578/pjr.v7i5.9491.

71. Gaytan, J.; Kelly, S.; Brown, W. S. Writing Apprehension in the Online Classroom: The Limits of Instructor Behaviors. Bus. Prof. Commun. Q. 2021, 85 (4), 376–394. https://doi.org/10.1177/23294906211041088.

72. Ratner, K.; Xie, H.; Zhu, G.; Estevez, M.; Burrow, A. L. Trajectories and Predictors of Adolescent Purpose Development in Self‐driven Learning. Child Dev. 2024, 96 (2), 691–704. https://doi.org/10.1111/cdev.14201.

73. Rubino-Hare, L.; Whitworth, B. A.; Boateng, F. D.; Bloom, N. The Impact of Geospatial Inquiry Lessons on Student Interest in Science and Technology Careers. J. Res. Sci. Teach. 2023, 61 (2), 419–456. https://doi.org/10.1002/tea.21904.

74. D’Amour, A.; Franks, A. Deconfounding Scores: Feature Representations for Causal Effect Estimation with Weak Overlap. arXiv 2021, arXiv:2104.05762; https://doi.org/10.48550/arXiv.2104.05762.

75. Abion, L. M.; Alcantara, M.; Ching, D. A. E-learning Games Enjoyment to Pupils’ Learning Behaviors in Mathematics Classroom. Int. J. Educ. Manag. Dev. Stud. 2023, 4 (2), 170–186. https://doi.org/10.53378/352993.

76. Sharma, M. Outcome-Based Education Pyramid: A Comprehensive Framework for Enhancing Educational Outcomes. Thiagarajar Coll. Preceptors Edu Spectra 2023, 5 (S1), 67–73. https://doi.org/10.34293/eduspectra.v5is1-may23.012.

77. Bondie, R. Exploring Personalized Learning and Open Education Pedagogy in Multilingual Learner Teacher Preparation. Online Learn. 2023, 27 (4). https://doi.org/10.24059/olj.v27i4.4018.

78. Dee, T. S.; James, J.; Wyckoff, J. Is Effective Teacher Evaluation Sustainable? Evidence from District of Columbia Public Schools. Educ. Financ. Policy 2021, 16 (2), 313–346. https://doi.org/10.1162/edfp_a_00303.

79. Jin, Y. On the Current Teacher Evaluation System in China Analysis Based on Compulsory Education Stage. J. Educ. Humanit. Soc. Sci. 2023, 13, 320–325. https://doi.org/10.54097/ehss.v13i.7927.

80. Li, F.; Li, K.; Li, X.; Wang, J. Evaluation Methods for Graduation Requirements under the Background of Teacher Education Professional Certification. Int. J. New Dev. Educ. 2024, 6 (2). https://doi.org/10.25236/ijnde.2024.060222.

81. Siddique, R. Teachers’ Instructional Modification through Teachers’ Evaluation System and Its Impact on the Classroom Management. Pakistan Lang. Humanit. Rev. 2022, 6 (II). https://doi.org/10.47205/plhr.2022(6-ii)82.

82. Wei, L.; Chen, Y. A. Narrative Inquiry into an ESL Teacher’s Professional Development: Problems and Recommendations. Int. J. Engl. Lang. Educ. 2022, 10 (2), 1. https://doi.org/10.5296/ijele.v10i2.20131.

Received: 2025-05-25
Accepted: 2026-02-27
Published Online: 2026-04-17

© 2026 the author(s), published by De Gruyter, Berlin/Boston

This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
