Article Open Access

Optimization of student learning status by instructional intervention decision-making techniques incorporating reinforcement learning

  • Jifeng Gong
Published/Copyright: September 8, 2025

Abstract

This study addresses the difficulty that existing instructional intervention decision-making techniques have in delivering accurate interventions when optimizing students’ learning status. To this end, the study combines a reinforcement learning model with a quantile trace regression model to construct a theoretical model for instructional intervention decision-making and validates its effectiveness. The experimental results showed that the proposed model achieved high prediction accuracy across different student groups, and that its application in teaching practice clearly improved students’ learning effectiveness. Compared with the baseline methods, the proposed model performed better in accuracy, precision, recall, and F1 value, with an accuracy of up to 96.4%. On different educational data sets, the F1 scores of the proposed model were all above 0.89. The results show that the model can achieve accurate teaching intervention and thus optimize students’ learning conditions. The research lays the foundation for a more intelligent and adaptable educational system and promotes the intelligent development of educational technology.

1 Introduction

With the development of computer technology and the improvement of educational informatization management, various systems and platforms have come into widespread use in education, resulting in a massive growth of educational resource data. This growth has posed significant challenges to educational evaluation and instructional decision analysis [1,2]. In education, big data analytics can provide scientific support for educational decision-making and personalized services for learners, and it helps to understand the process of education and teaching. The integration of information technology and modern education can achieve accurate teaching and personalized learning, which is becoming a development trend in education [3,4]. The classroom system has a scale effect, which makes it easier to popularize education and greatly promotes the progress of science and technology, but it usually adopts the same teaching interventions for all students, which makes it difficult to pay enough attention to individual differences [5,6]. Personalized instruction, on the other hand, is tailored to individual needs but lacks economies of scale. It is important to strike a balance between the two approaches, and the use of big data to make accurate teaching intervention decisions (TIDs) is a promising solution. However, because of the complexity and dynamics of educational data, it is challenging to fully depict the rules of education. As a result, current TID technology is limited in its versatility and can only be used in certain scenarios, and its level of intelligence is insufficient [7,8,9]. Based on this, the research integrates a reinforcement learning (RL) model and a quantile trace regression (QTR) model to construct a theoretical model of TID. The purpose of this model is to enhance the generality of current TID technology and to use computer technology to raise its level of intelligence.

The research is divided into five sections. The first section summarizes and discusses current research on TID technology. The second section builds the theoretical model and application framework of precision teaching. The third section verifies the theoretical model of precision TID. The fourth section discusses the research results. The fifth section summarizes the whole article.

2 Related work

Because education is complex and dynamic, its research problems usually involve many factors and their interrelationships. How to make accurate teaching intervention decisions for students has become a focus of current educational big data research [10,11]. Usher et al. optimized teachers’ distance learning decisions by using learner data and data-driven methods to address the problems associated with distance TID during the COVID-19 pandemic, thereby improving the quality of distance education as well as student learning outcomes [12]. Carter et al. addressed teaching decision-making interventions for students with special educational needs (SEN) by comprehensively discussing teaching strategies for students with SEN in mainstream schools in Australia and surveying stakeholders, thereby providing data and theoretical support for a rationalized TID technique [13]. Yulianti et al. addressed the mediating role of parents in student TID by using multilevel regression analysis to examine the TID approach in multiple schools in Indonesia, thus providing data support for the optimization of the TID technique [14]. To address the problems with TID in terms of students’ professional development, Gesel et al. proposed a TID technique that uses big data technology to integrate teachers’ knowledge, skills, and self-efficacy. This effectively raised teaching standards and boosted student academic performance [15].

In addition, Pesce et al. addressed the problems of TID in physical education by conducting a 2-year experiment in a classroom using random selection, thereby optimizing TID techniques in physical education while promoting students’ self-control [16]. Gion et al. addressed the problems associated with TID in a multilevel classroom by using an empirical experiment to synthesize the learning of different races in the same classroom. This effectively optimized the quality of the classroom while enhancing the effectiveness of multilevel classroom interventions [17]. Jungjohann and Gebhardt constructed a questionnaire portfolio model of TID-related issues in inclusive education by optimizing classroom assessment dimensions. This could improve the quality of teachers’ teaching based on optimized teaching assessment and inform the improvement of learning outcomes for students with TID [18]. Kim and Kim addressed the problems related to degenerate Bernoulli numbers and degenerate Euler numbers by proposing to derive fully degenerate Bernoulli polynomials and degenerate Euler polynomials using moment representations of the parameters of Laplacian random variables. It allowed further study of degenerate hyperbolic functions and optimized product expansions of related functions [19]. Lysytska et al. proposed a combined online and offline model for teaching intervention technologies in response to the challenges of teaching foreign languages in the context of turbulent world events. The creation of an adaptive online learning platform, the development of a multimedia resource library, and the selection of innovative pedagogical tools have been identified as effective strategies to meet the psychological and pedagogical needs of students [20].

Because of the nature of educational data, current TID technology has poor versatility, is difficult to transfer across technical environments, and has not yet achieved TID that is both scalable and personalized. At the same time, current purely personalized TID technologies often exhibit representational and diversified characteristics, making it difficult for them to adapt to dynamic and continuous teaching processes. In education, RL has been widely used for personalized learning path recommendation and intelligent tutoring systems. In particular, Q-learning is chosen for its simplicity and effectiveness in partially observable Markov decision processes [21]. Although deep RL performs well on high-dimensional data, the dimensionality and complexity of the data in educational interventions are usually low, and Q-learning is sufficient to deal with them. In addition, the QTR model is chosen for its ability to capture the nonlinear relationship between behavioral patterns and grade point average (GPA), which has been shown to be useful for educational decision-making in previous studies [22]. The study proposes a theoretical model of precision teaching that makes teaching interventions computable. This model lays the foundation for data-driven intelligent teaching interventions, and the accompanying application framework strengthens the generality of teaching interventions and provides a reference for the practical promotion of precision teaching interventions (PTIs). Additionally, the learning effectiveness prediction method of the theoretical model achieves differentiated and dynamic predictions of learning achievement, and the integration of RL with precise TID enables intelligent and dynamic teaching intervention. These points constitute the innovation of the research as a whole.

3 Optimizing students’ learning status with a big data-based theoretical model of PTIs

The scientific basis of traditional teaching interventions in actual teaching still needs to be strengthened. Therefore, this section constructs a theoretical model of PTI, in which the learning effectiveness prediction method is the guarantee and the TID method is the key.

3.1 Precision instructional interventions modeling and application architecture building

The central concern of using big data for instructional interventions in this study is to facilitate student learning by providing students with personalized and targeted instructional interventions based on their actual status and characteristics. The PTI problem can thus be transformed into an optimization problem, i.e., finding the optimal PTI that optimizes student learning states (LSs). Based on this, the study uses big data to construct a theoretical model of PTI, as shown in Figure 1.

Figure 1: Schematic diagram of a theoretical model for PTI based on big data.

As shown in Figure 1, the PTI model constructed in the study comprises three parts: LS and feature characterization, learning effectiveness prediction, and TID. LS and feature characterization is the prerequisite for accurate instructional interventions, learning effectiveness prediction is their guarantee, and TID is their key. Learning effectiveness prediction relies on LS and feature characterization and provides important feedback on the effectiveness of instructional interventions. TID is realized based on LS and features, with the goal of optimizing learning effectiveness. LS and feature characterization describes the actual situation of students in educational activities and their relevant characteristics, which differ considerably from one student to another [23,24,25]. The study uses big data techniques to categorize students’ learning characteristics into static and dynamic characteristics. Static characteristics include basic characteristics and learning styles, while dynamic characteristics include behavioral characteristics, cognitive levels, and affective states. In essence, students’ LSs and characteristics are therefore described along five dimensions, which can be expressed as a quintuple, as shown in Eq. (1) [26].

(1) T = ( TJ , TX , TW , TR , TG ) ,

where T denotes LSs and characteristics, TJ denotes basic characteristics, TX denotes learning style, TW denotes behavioral characteristics, TR denotes cognitive level, and TG denotes affective state. TJ mainly consists of the student’s own academic information and is represented by a numerical code. TX can be measured by self-selection, learning style scales, or big data analysis techniques; the study measures students’ learning styles with big data techniques. TW is currently measured in two ways: by coding behavior according to theoretical classifications of student behavior, and by using big data analytics to extract behavioral characteristics from learner activity data. The study combines the two: guided by the theory of multidimensional characteristics of student behavior, it uses big data analytics to mine student behavioral data so as to better describe the characteristics of student behavior. TR is measured by educational assessment, which infers the cognitive structure of students from their actual responses in the assessment. TG is measured from three sources, namely textual analysis, perception of outward behaviors, and physiological signals, which are combined into an overall measure.
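The quintuple in Eq. (1) can be pictured as a simple record type. The following is a minimal sketch, not the study's data schema: the field names, types, and example values are assumptions chosen only to show how the static part (TJ, TX) and the dynamic part (TW, TR, TG) might be separated.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class LearnerState:
    """Hypothetical container for the quintuple T = (TJ, TX, TW, TR, TG)."""

    basic: Dict[str, int]       # TJ: numerically coded basic characteristics
    learning_style: str         # TX: learning style label
    behavior: List[float]       # TW: behavioral characteristics
    cognitive_level: float      # TR: cognitive level from assessment
    affective_state: float      # TG: affective state score

    def static_part(self):
        """Static characteristics: basic characteristics and learning style."""
        return (self.basic, self.learning_style)

    def dynamic_part(self):
        """Dynamic characteristics: behavior, cognition, and affect."""
        return (self.behavior, self.cognitive_level, self.affective_state)


# Illustrative values only; none of them come from the study.
student = LearnerState(
    basic={"grade": 10, "major_code": 3},
    learning_style="visual",
    behavior=[0.8, 0.6, 0.9],
    cognitive_level=0.72,
    affective_state=0.65,
)
```

Keeping the static and dynamic parts separable matters later, because only the dynamic part changes between teaching stages.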

Considering the problem of practical application of the theoretical model in reality, the study constructed the application architecture of PTI in teaching practice. Thus, the application architecture of PTI using big data is shown in Figure 2.

Figure 2: Application architecture of PTI based on big data.

In Figure 2, the architecture consists, from the bottom up, of a data layer, a method layer, a result layer, and an application layer; the four layers build on one another and are closely interlocked. The data layer provides an important source of information for the implementation of precise instructional interventions. The method layer is based on students’ LS and characteristics and provides the key technology for realizing intelligent and dynamic PTI; it is the means by which precision teaching is realized. The result layer is the precise teaching intervention strategy obtained by solving the core technology of the method layer, and it is the rule to be followed when implementing precise teaching interventions. With the support of the data layer, method layer, and result layer, intelligent and dynamic teaching interventions can be realized in the application layer.

3.2 Learning effectiveness prediction method based on QTR

Based on the theoretical model of PTI constructed in the previous subsection, the study proceeds to design the learning effectiveness prediction method. Considering the relationship between students’ behavior and performance before and after learning, as well as the similarity of relationships between adjacent pixels in images, the study proposes a learning effectiveness prediction method using QTR. Statistical modeling is a crucial step in statistical learning and a useful tool for extracting information from data [27,28,29]. Auxiliary data, typically in vector form, are frequently used as covariates to enhance learning performance. The development of big data technology has significantly improved data collection, and functional and matrix covariates have gradually appeared and are now widely used in statistical learning problems [30,31,32]. The simplest method for analyzing matrix-type covariates is to construct the trace regression (TR) model, which is shown in Eq. (2) [33].

$$y = \operatorname{tr}(\Phi^{T} Z) + \varphi, \tag{2}$$

where $y$ denotes the response variable and $\operatorname{tr}(\cdot)$ denotes the trace of a matrix. $\Phi$ is the matrix of unknown regression coefficients, and $Z$ denotes the matrix of explanatory variables with fixed dimensions $w_1$ and $w_2$. $\varphi$ denotes the zero-mean model error. It is worth noting that the ambient dimension of $\Phi$, $w_1 \times w_2$, may be very large. In that case, it is necessary to assume that $\Phi$ has a low-rank structure, meaning that its degrees of freedom are far fewer than $w_1 \times w_2$. This assumption is needed because the complexity of the model in Eq. (2) makes it impractical to control the element-wise sparsity of $\Phi$ directly. Most current studies of the TR model focus on estimating the conditional mean, and very few consider QTR, so the study analyzes the QTR model, which is shown in Eq. (3).

$$y = \operatorname{tr}(A_t^{T} Z) + \varphi, \tag{3}$$

where $A_t$ denotes the quantile regression coefficient matrix at quantile level $t$, $t$ denotes the quantile and lies in the range 0–1, and $\varphi$ denotes the random error. Eq. (3) computes the matrix inner product between the explanatory variable matrix $Z$ and the parameter matrix $A_t$, and introduces a quantile-specific error term $\varphi$ at each quantile level $t$. It therefore enables fine-grained prediction across different learning performance levels (e.g., low-, medium-, and high-performing groups). Compared to traditional regression methods, the QTR model can capture how the relationship between behavioral traits and learning outcomes varies across positions in the outcome distribution. With the support of this QTR model, the study constructs a prediction method for students’ learning effectiveness. As the indicator of learning effectiveness, the study chooses the most direct one, academic performance, and treats the prediction as a regression problem. The resulting architecture of the learning effectiveness prediction method is shown in Figure 3.
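To make the quantile trace regression of Eq. (3) more concrete, the following is a minimal pure-Python sketch. It assumes the standard quantile (pinball) loss and plain subgradient descent, and it omits the low-rank regularization discussed above, so it is not the study's estimator. The names `trace_inner`, `pinball_loss`, `fit_qtr`, and the toy samples are hypothetical; `Z` stands for the explanatory matrix and `y` for the response.

```python
def trace_inner(A, Z):
    """tr(A^T Z), which equals the sum of element-wise products."""
    return sum(a * z for row_a, row_z in zip(A, Z) for a, z in zip(row_a, row_z))


def pinball_loss(residual, t):
    """Quantile loss at level t: t*u if u >= 0, (t - 1)*u otherwise."""
    return residual * (t - (1.0 if residual < 0 else 0.0))


def mean_loss(A, samples, t):
    """Average pinball loss of the trace predictor over (Z, y) samples."""
    return sum(pinball_loss(y - trace_inner(A, Z), t) for Z, y in samples) / len(samples)


def fit_qtr(samples, t, shape, lr=0.05, epochs=500):
    """Subgradient descent on the mean pinball loss (no rank penalty)."""
    rows, cols = shape
    A = [[0.0] * cols for _ in range(rows)]
    n = len(samples)
    for _ in range(epochs):
        grad = [[0.0] * cols for _ in range(rows)]
        for Z, y in samples:
            u = y - trace_inner(A, Z)
            w = t - (1.0 if u < 0 else 0.0)  # subgradient of the loss w.r.t. u
            for i in range(rows):
                for j in range(cols):
                    grad[i][j] -= w * Z[i][j] / n
        for i in range(rows):
            for j in range(cols):
                A[i][j] -= lr * grad[i][j]
    return A


# Toy 2x2 feature matrices with made-up responses.
samples = [
    ([[1.0, 0.0], [0.0, 1.0]], 2.0),
    ([[1.0, 1.0], [0.0, 0.0]], 1.0),
    ([[0.0, 1.0], [1.0, 0.0]], 3.0),
    ([[2.0, 0.0], [0.0, 0.0]], 1.0),
]
zero = [[0.0, 0.0], [0.0, 0.0]]
loss_before = mean_loss(zero, samples, 0.5)
A_hat = fit_qtr(samples, t=0.5, shape=(2, 2))
loss_after = mean_loss(A_hat, samples, 0.5)
```

Fitting the same data at several levels, for example `t=0.05`, `0.5`, and `0.95`, gives one coefficient matrix per quantile, which is what allows separate predictions for low-, medium-, and high-performing groups.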

Figure 3: Architecture diagram of learning effectiveness prediction method based on QTR model.

In Figure 3, the dynamic data of students’ state and characteristics at earlier and later stages of the teaching process are strongly correlated, so regressing directly on these data would seriously degrade the prediction of learning effectiveness. Therefore, the study converts the dynamic data of student status and features into images, which preserves the correlation within the dynamic data while avoiding its adverse influence on the prediction of learning effectiveness. On this basis, the method combines the advantages of quantile regression and TR: the quantile component describes how the relationship between students’ LS and characteristics and their learning effectiveness differs across levels of learning effectiveness, while the trace component describes the correlations among the rows and columns of the matrix variables used as regression inputs.

3.3 RL-based approach to TIDs

Based on the proposed method for predicting learning effectiveness, the study further proposes the TID method. At present, the key issue for PTI is developing targeted PTI strategies based on the actual conditions and characteristics of different students. Furthermore, intelligence and dynamism are important features of precision intervention decision-making, and the research introduces RL to construct a TID method that addresses these two features. RL is an important branch of machine learning and a computational approach to understanding and automating goal-directed learning and decision-making; it has been widely used in sequential decision-making. In RL, a decision-making agent interacts with the environment and learns the optimal strategy through continuous trial and error [34,35,36]. From a methodological point of view, applying RL to TID fits the needs of precise teaching intervention, provides dynamic teaching intervention, facilitates the exploration of potential key factors affecting teaching intervention, and enriches the research methods of precise teaching intervention. Thus, based on the theoretical model in Figure 1, the problem of precise instructional intervention is essentially to solve for the optimal decision function that maximizes the learning effect, as indicated by Eq. (4).

$$a = \arg\max_{\alpha} E\left[L \mid B_1 = \alpha_1(G_1), B_2 = \alpha_2(G_2), \ldots, B_K = \alpha_K(G_K)\right], \tag{4}$$

where $a$ denotes the optimal decision function and $L$ denotes the learning effect. $B_k$ denotes the instructional intervention at stage $k$, and $G_k$ denotes the instructional history vector available at stage $k$. $\alpha$ denotes the decision function, $K$ denotes the number of instructional stages, and $E$ denotes the mathematical expectation. Eq. (4) is solved with the Q-learning approach to RL, which expresses the recursive relationship between the expected payoff of the current state-action pair and the payoffs of subsequent state-action pairs. The related Q-function is optimized recursively, starting from the last stage and moving backward one stage at a time until the best choice for the entire process is found. The study constructs the TID technique with Q-learning, which is widely used in RL. Figure 4 illustrates the overall method flow.

Figure 4: Process of teaching intervention decision-making method based on Q-learning.

As illustrated in Figure 4, the method consists of first building an accurate statistical model of TID using Q-learning, then defining the Q-function, estimating the Q-function, and solving the optimal decision function. The Q-table, a central component of RL, stores the expected utility, or Q-value, of each state-action pair. In the context of educational interventions, states represent a student’s current learning situation, while actions represent possible instructional interventions. A state is an abstract representation of a student’s learning situation, including multiple dimensions such as the student’s current grade, study habits, engagement, and homework completion. For example, a state can be a vector containing a student’s most recent test score, attendance, activity on an online learning platform, and so on. Actions represent pedagogical interventions that teachers can take, such as providing additional tutoring, adjusting the difficulty of the course content, adding practice sessions, etc. Actions are selected based on teacher expertise and observation of student learning. The Q-value is initialized to zero or a small random number, indicating that there is no a priori knowledge of the expected utility of each state-action pair in the absence of experience. Its update follows the Q-learning algorithm, which learns the optimal utility of each state-action pair through trial and error. Reward functions, on the other hand, are defined based on student behavior and learning outcomes, with the goal of encouraging positive learning behaviors and improving learning effectiveness. Therefore, Eq. (5) displays the corresponding value expression for the decision function α in the creation of an accurate TID statistical model.
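The Q-table mechanics described above (zero initialization, trial-and-error updates, reward-driven learning) can be sketched with the standard tabular Q-learning update. This is an illustration under assumptions, not the study's implementation: the state labels, action names, learning rate, discount factor, and reward value are all hypothetical.

```python
import random
from collections import defaultdict

# Hypothetical discrete states and interventions for illustration only.
STATES = ["low", "medium", "high"]
ACTIONS = ["no_intervention", "extra_tutoring"]


def q_update(q, state, action, reward, next_state, lr=0.1, discount=0.9):
    """Standard Q-learning update for one observed transition."""
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    q[(state, action)] += lr * (reward + discount * best_next - q[(state, action)])
    return q[(state, action)]


def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy selection: explore with probability epsilon."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[(state, a)])


q_table = defaultdict(float)  # every Q-value starts at zero
new_value = q_update(q_table, "low", "extra_tutoring", reward=1.0, next_state="medium")
greedy = choose_action(q_table, "low", epsilon=0.0, rng=random.Random(0))
```

After this single update the entry for ("low", "extra_tutoring") moves from 0 to 0.1, so the greedy choice in state "low" becomes "extra_tutoring"; repeated interaction refines the remaining entries in the same way.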

$$J_{\alpha} = E_{\alpha}\left[\sum_{k=1}^{K} X_k\right] = \int \left(\sum_{k=1}^{K} X_k\right) \mathrm{d}D_{\alpha}. \tag{5}$$

where $J_{\alpha}$ denotes the value corresponding to the decision function $\alpha$, $D_{\alpha}$ denotes the distribution of the random variables generated when instructional interventions are assigned according to $\alpha$, and $X_k$ denotes the value-added learning effect at stage $k$. The value function satisfies the recursive relationship shown in Eq. (6).

$$J_k(g_k) = \max_{b_k} E\left[X_k + J_{k+1}(G_{k+1}) \mid G_k = g_k, B_k = b_k\right], \tag{6}$$

where $g_k$ denotes a realized value of the student history vector $G_k$, and $b_k$ denotes an element of the instructional intervention $B$ at stage $k$. The value function is the foundation of the RL method; because it satisfies Eq. (6), it can be solved iteratively by backward recursion. Accordingly, the Q-function is first defined at the last instructional stage $K$, as shown in Eq. (7).

$$\begin{aligned}
F_K(g_K, b_K) &= E\left[X_K \mid G_K = g_K, B_K = b_K\right], \\
\alpha_K(g_K) &= \arg\max_{b_K} F_K(g_K, b_K), \\
J_K(g_K) &= \max_{b_K} F_K(g_K, b_K),
\end{aligned} \tag{7}$$

where $F_K(g_K, b_K)$ represents the expected learning outcome obtained by applying instructional intervention $B_K$ given the full history $G_K$ prior to stage $K$. The corresponding definitions at an earlier stage $k$ are shown in Eq. (8).

$$\begin{aligned}
F_k(g_k, b_k) &= E\left[X_k + J_{k+1}(G_{k+1}) \mid G_k = g_k, B_k = b_k\right] \\
&= E\left[X_k + J_{k+1}(g_k, b_k, T_{k+1}) \mid G_k = g_k, B_k = b_k\right], \\
\alpha_k(g_k) &= \arg\max_{b_k} F_k(g_k, b_k), \\
J_k(g_k) &= \max_{b_k} F_k(g_k, b_k),
\end{aligned} \tag{8}$$

where T denotes the characteristics. F K ( g K , b K ) in Eqs. (7) and (8) is called the Q-function. If F K + 1 is defined to be zero, the Q-function definition for stage k is expressed as in Eq. (9).

$$F_k(g_k, b_k) = E\left[X_k + \max_{b_{k+1}} F_{k+1}(g_{k+1}, b_{k+1}) \mid G_k = g_k, B_k = b_k\right]. \tag{9}$$
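The recursion in Eq. (9), with the Q-function beyond the last stage set to zero, is a backward induction. The sketch below works it through on a tiny two-stage problem. It assumes a finite set of states with known transition probabilities, which the study does not specify; the states, rewards, and probabilities are invented purely to make the arithmetic visible.

```python
def backward_induction(stages, states, actions, reward, transition):
    """Solve the stage-wise recursion from the last stage back to the first.

    reward(k, s, b) returns the expected gain X_k; transition(k, s, b)
    returns a dict mapping next states to probabilities.
    """
    value_next = {s: 0.0 for s in states}  # value beyond the last stage is zero
    policy = {}
    for k in range(stages, 0, -1):
        value_now = {}
        for s in states:
            q = {}
            for b in actions:
                future = sum(p * value_next[s2] for s2, p in transition(k, s, b).items())
                q[b] = reward(k, s, b) + future
            best = max(q, key=q.get)
            policy[(k, s)] = best
            value_now[s] = q[best]
        value_next = value_now
    return policy, value_next  # value_next now holds the stage-1 values


def toy_reward(k, s, b):
    """Made-up gain: strong students gain 2; an intervention costs 0.5."""
    base = 2.0 if s == "strong" else 0.0
    return base - 0.5 * b


def toy_transition(k, s, b):
    """Made-up dynamics: intervening helps weak students become strong."""
    if s == "strong":
        return {"strong": 1.0} if b == 1 else {"strong": 0.9, "weak": 0.1}
    return {"strong": 0.8, "weak": 0.2} if b == 1 else {"weak": 1.0}


policy, stage1_value = backward_induction(
    2, ["weak", "strong"], [0, 1], toy_reward, toy_transition
)
```

In this toy problem the intervention is worthwhile for a weak student at stage 1 (value 1.1 versus 0) but not at stage 2, where no later stage remains to benefit from it. This is the kind of stage-dependent decision that the multi-stage formulation is meant to capture.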

The multi-stage optimal decision function is then defined from the Q-function, as shown in Eq. (10).

$$\begin{aligned}
a &= \left(a_1(G_1), a_2(G_2), \ldots, a_K(G_K)\right), \\
a_k(g_k) &= \arg\max_{b_k} F_k(g_k, b_k),
\end{aligned} \tag{10}$$

where a denotes the vector consisting of multi-stage optimal decision functions. In Q-function estimation, the study utilizes a linear model for estimation, which is expressed as shown in Eq. (11).

$$F_k(g_k, b_k; \zeta_k, \gamma_k) = \zeta_k^{T} G_k + B_k\left(\gamma_k^{T} G_k\right), \tag{11}$$

where $\zeta_k$ and $\gamma_k$ denote the regression parameters to be estimated. Their estimates can be obtained by backward recursion, stage by stage, starting from the last stage. Specifically, the least absolute shrinkage and selection operator (LASSO) can be used to obtain the two parameter estimates when the observed data of multiple students are given. The observed data are expressed as shown in Eq. (12).

$$\{T_{1j}, B_{1j}, L_{1j}, \ldots, T_{kj}, B_{kj}, L_{kj}, \ldots, T_{Kj}, B_{Kj}, L_{Kj}\}, \tag{12}$$

where $j$ indexes the students. Substituting the estimated Q-function into the optimal decision function yields the optimal decision function for each stage, as given in Eq. (13).

$$a_k(g_k) = \arg\max_{b_k} F_k(g_k, b_k; \zeta_k, \gamma_k). \tag{13}$$

Since $B_k \in \{0, 1\}$, it follows from Eq. (11) that $F_k(g_k, b_k; \zeta_k, \gamma_k)$ is maximized by choosing $b_k = 1$ exactly when the coefficient of $B_k$ is positive. The resulting optimal decision function is shown in Eq. (14).

$$a_k(g_k) = U\left(\gamma_k^{T} G_k > 0\right), \tag{14}$$

where $U(\cdot)$ denotes the indicator function, which takes the value 1 when its argument holds and 0 otherwise. Based on this, the optimal decision function for all $K$ stages is obtained, as shown in Eq. (15).

$$a = \left(a_1(g_1), \ldots, a_k(g_k), \ldots, a_K(g_K)\right). \tag{15}$$

Eq. (15) allows for the precise targeting of instructional interventions at any point in the learning process, based on behavioral performance and student attributes.
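The linear Q-function of Eq. (11) and the indicator rule of Eq. (14) can be checked against each other on a small example. This is a sketch under assumptions: the parameter values and the two-element history vector (an intercept plus one standardized score) are hypothetical, and in practice the parameters would come from the LASSO fit described above.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def q_linear(history, action, zeta, gamma):
    """Linear Q-function: zeta^T G + B * (gamma^T G), with B in {0, 1}."""
    return dot(zeta, history) + action * dot(gamma, history)


def decide(history, gamma):
    """Indicator rule: intervene (1) exactly when gamma^T G > 0."""
    return 1 if dot(gamma, history) > 0 else 0


# Hypothetical stage parameters.
zeta = [0.5, 1.0]
gamma = [0.6, -1.0]

struggling = [1.0, 0.4]  # gamma^T G = 0.2, so intervening raises the Q-value
thriving = [1.0, 0.9]    # gamma^T G = -0.3, so intervening lowers the Q-value

decisions = [decide(struggling, gamma), decide(thriving, gamma)]
```

Because the intervention enters the Q-function only through the term multiplied by `action`, comparing the two Q-values reduces to checking the sign of that term, which is why the rule needs `gamma` but not `zeta`.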

4 Performance analysis of theoretical models of precision instructional interventions

To verify the validity of the PTI model, the study first uses simulation to validate the learning effectiveness prediction and teaching intervention methods of the model content, followed by its actual validation in high school mathematics and university linear algebra teaching practice.

4.1 Validation of learning effectiveness prediction methods and instructional interventions

To validate the effectiveness of the learning effectiveness prediction method, the study conducts experimental verification and analysis using a simulated dataset. Based on the behavioral patterns of students with different academic levels (high, medium, and low achievement), three types of student learning trajectories are manually designed. Specifically, four behavioral characteristics are defined: daily study time, task completion rate, post-class practice accuracy, and resource use frequency. Each feature is sampled according to different distribution patterns (e.g., normal distribution, skewed distribution) corresponding to the learning behaviors of high, average, and low performing students, respectively. During data processing, the behavioral features of each group of students are encoded into matrix-structured time series data, which are further transformed into input feature maps through image processing to serve as input samples for the QTR model.
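The construction of the simulated dataset can be sketched as follows. This is only an illustration of the idea: the group means and standard deviations are hypothetical rather than taken from the study, and every feature is drawn from a normal distribution, whereas the study also uses skewed distributions.

```python
import random

FEATURES = ["study_hours", "completion_rate", "practice_accuracy", "resource_use"]

# Hypothetical (mean, std) per feature for each achievement group.
GROUP_PARAMS = {
    "high": [(4.0, 0.5), (0.90, 0.05), (0.85, 0.05), (6.0, 1.0)],
    "medium": [(2.5, 0.6), (0.70, 0.08), (0.65, 0.08), (3.5, 1.0)],
    "low": [(1.0, 0.5), (0.40, 0.10), (0.40, 0.10), (1.5, 0.8)],
}


def simulate_trajectory(group, days, rng):
    """Return a days x 4 matrix of daily behavioral features for one student."""
    rows = []
    for _ in range(days):
        row = []
        for name, (mean, std) in zip(FEATURES, GROUP_PARAMS[group]):
            value = max(0.0, rng.gauss(mean, std))  # no negative values
            if name in ("completion_rate", "practice_accuracy"):
                value = min(1.0, value)  # rates are capped at 1
            row.append(value)
        rows.append(row)
    return rows


rng = random.Random(0)  # fixed seed so the sketch is reproducible
matrix = simulate_trajectory("high", days=14, rng=rng)
```

Each returned matrix has one row per day and one column per feature, which is the matrix-structured time series form that can then be turned into an input feature map.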

The hardware configuration includes an Intel Core i7-12700 processor, 16 GB of RAM, and an NVIDIA GeForce RTX 3060 graphics card running Windows 11. The programming environment is based on Python 3.9, with major libraries including NumPy (v1.23.5), Pandas (v1.5.3), and Matplotlib (v3.7.1) for data processing and visualization, Statsmodels (v0.13.5) for regression modeling, and Scikit-learn (v1.2.2) for model evaluation and auxiliary processing. All experiments are conducted in a Jupyter Notebook environment.

The study first compares the three patterns of behavior corresponding to different levels of academic achievement, using GPA as the metric of evaluation, specifically referring to the 0.05 quantile for low achievement, the 0.5 quantile for moderate achievement, and the 0.95 quantile for high achievement. The results are shown in Figure 5.

Figure 5: Comparison of student behavior patterns with different learning outcomes. (a) Behavioral patterns of students with poor academic performance. (b) Behavioral patterns of students with average academic performance. (c) Behavioral patterns of students with excellent academic performance.

In Figure 5, different colors indicate the frequency of positive behaviors such as eating on time, studying in the library, and the number of times students enter the library. In Figure 5(a), the frequency of positive behaviors of students with poor academic performance stays between 0.0 and 0.5. In Figure 5(b), the frequency for most students with moderate academic performance stays between 0.6 and 0.8. In Figure 5(c), the frequency for most students with good academic performance is higher than 0.8. Taken together, students with better academic performance show a higher frequency of positive behaviors, a result that is very important for personalized teaching. Therefore, Figure 5 is used as the basis for analysis with the QTR model proposed by the study. The results are shown in Figure 6.

Figure 6: Comparison of parameter estimation between TR and QTR. (a) QTR model (0.95) estimation results. (b) QTR model (0.50) estimation results. (c) QTR model (0.05) estimation results. (d) Estimated results of the TR model.

Compared with the TR parameter estimates in Figure 6(d), the QTR estimates at the different quantiles in Figure 6(a) to (c) are better. The QTR model is more capable of capturing the implicit relationship between behavioral patterns and different GPAs, and it can differentiate its estimates according to students’ GPA levels. This side-by-side comparison confirms the validity of the study’s proposed method of predicting students’ learning effectiveness using the QTR model. To further validate the results, the study selected 11 students and predicted their grades using TR and QTR, with root mean squared error (RMSE) as the assessment index. Table 1 presents the findings.

Table 1: Comparison of results in predicting academic performance of 11 students using different methods

| Real results | Predicted, QTR (0.95) | Predicted, QTR (0.5) | Predicted, QTR (0.05) | Predicted, TR |
|---|---|---|---|---|
| 85.33 | 94.7 | 58.72 | 67.47 | 68.45 |
| 88.94 | 97.42 | 55.08 | 65.92 | 68.45 |
| 81.46 | 82.18 | 59.01 | 85.09 | 74.90 |
| 81.02 | 83.57 | 47.86 | 68.22 | 61.99 |
| 82.40 | 97.98 | 57.22 | 69.94 | 71.02 |
| 80.60 | 90.82 | 60.08 | 72.07 | 72.53 |
| 79.05 | 89.51 | 42.64 | 47.85 | 55.00 |
| 79.55 | 89.73 | 52.13 | 64.19 | 65.32 |
| 76.58 | 80.58 | 53.46 | 67.47 | 66.27 |
| 69.73 | 82.11 | 42.02 | 60.97 | 58.84 |
| 66.93 | 84.10 | 49.86 | 54.56 | 56.09 |
| RMSE | 25.00 | 20.56 | 20.68 | 35.81 |

In Table 1, TR can only predict the approximate grades of the 11 students and cannot make predictions based on the students’ own learning conditions. QTR, on the other hand, can not only predict the students’ own LS and characteristics but also has superior predictive validity. The RMSE values of the QTR predictions are 25.00, 20.55, and 20.68%, which are lower than TR’s 35.81%, supporting the validity of the study’s proposed prediction of learning effectiveness using QTR. On this basis, the study verifies the superiority of the proposed TID method using Q-learning, with the number of teaching stages set to 2. The two-stage separate method (A) and the single-stage method (B) are compared with the research method (C) in solving the optimal decision function, with the number of students set to 200, 400, and 800. The comparison metrics are the estimated error rate (ER) and the value ratio (VR) between the set optimal decision function and the derived estimated optimal decision function (with VR1 and VR2 defined for the two stages). Figure 7 displays the simulation comparison results of the three methods.
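As a hedged illustration of the evaluation described above, the sketch below computes RMSE together with the check (pinball) loss that standard quantile regression minimizes at a quantile level tau. The function names and the toy scores are hypothetical and are not the study's data; whether the study's QTR implementation uses exactly this loss form is an assumption.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def pinball_loss(actual, predicted, tau):
    """Mean check (pinball) loss at quantile level tau, with 0 < tau < 1."""
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie strictly between 0 and 1")
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for a, p in zip(actual, predicted):
        residual = a - p
        # Under-prediction is weighted by tau, over-prediction by (1 - tau).
        total += tau * residual if residual >= 0 else (tau - 1.0) * residual
    return total / len(actual)


# Toy values for illustration only (not taken from Table 1).
toy_actual = [80.0, 70.0, 90.0, 60.0]
toy_predicted = [79.0, 71.0, 83.0, 67.0]
print(rmse(toy_actual, toy_predicted))
print(pinball_loss(toy_actual, toy_predicted, 0.5))
```

A high quantile level such as 0.95 penalizes under-prediction more heavily, which is why separate quantile fits can describe high-, median-, and low-performing students differently.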

Figure 7: Simulation comparison results of different methods. (a) Comparison results when the number of students is 200. (b) Comparison results when the number of students is 400. (c) Comparison results when the number of students is 800.

In Figure 7(a), when the number of students is 200, the ER1 value of Method C is 0.023 ± 0.015, which is lower than Method A’s 0.031 ± 0.020, while the simulation of Method B fails and yields no result. Moreover, the VR1 value of Method C is 1.000 ± 0.023, which is higher than the 0.844 ± 0.013 and 0.720 ± 0.018 of Methods A and B, respectively. In Figure 7(b), when the number of students is 400, the ER1 value of Method C is 0.018 ± 0.009, which is lower than Method A’s 0.026 ± 0.013, and the simulation of Method B again yields no result. The VR1 value of Method C, 1.000 ± 0.016, is higher than those of the comparison methods, and the same pattern holds for 800 students in Figure 7(c). In stage 2, the VR2 values of Method C are all higher than those of the comparison methods, while ER2 remains the same. Taken together, the research method is preferable: its estimated optimal decision function has a lower ER, and its value is closer to that of the set optimal decision function.
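To make the two comparison metrics concrete, the following is a minimal sketch under stated assumptions: ER is taken to be the fraction of students for whom the estimated decision differs from the set optimal decision, and VR is taken to be the mean outcome under the estimated decision function divided by the mean outcome under the set optimal one. The text does not spell out these formulas, so both definitions, the function names, and the toy numbers are hypothetical.

```python
def error_rate(optimal_actions, estimated_actions):
    """Fraction of students whose estimated action differs from the set optimal action."""
    if len(optimal_actions) != len(estimated_actions) or not optimal_actions:
        raise ValueError("inputs must be non-empty and of equal length")
    mismatches = sum(o != e for o, e in zip(optimal_actions, estimated_actions))
    return mismatches / len(optimal_actions)


def value_ratio(optimal_values, estimated_values):
    """Mean value under the estimated rule divided by mean value under the optimal rule."""
    if len(optimal_values) != len(estimated_values) or not optimal_values:
        raise ValueError("inputs must be non-empty and of equal length")
    optimal_mean = sum(optimal_values) / len(optimal_values)
    estimated_mean = sum(estimated_values) / len(estimated_values)
    return estimated_mean / optimal_mean


# Toy example: four students, binary intervention (1 = intervene, 0 = do not).
optimal = [1, 0, 1, 1]
estimated = [1, 0, 0, 1]
print(error_rate(optimal, estimated))

# Hypothetical outcomes obtained by following each rule.
print(value_ratio([10.0, 8.0, 12.0, 10.0], [10.0, 8.0, 9.0, 10.0]))
```

Under these assumed definitions, an ER near 0 and a VR near 1 indicate that the estimated decision function closely reproduces the set optimal one.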

4.2 Empirical analysis of the precision instructional interventions model

Having validated the two modules of the theoretical model, the study then verified the validity of the PTI model. An intelligent teaching and tutoring network platform was chosen as an auxiliary learning tool, and the study involved a class of 60 first-year students from a high school in the capital city of an eastern Chinese province. The teacher teaches the lessons, and the students complete the exercises on the platform after class. The study focuses on the teaching practice of high school mathematics, and the practice lasts for the entire first year of high school. The platform’s learning record data cover 13 exercises, for a total of 676 records, while the classroom data comprise the grades and the outcomes of three tests (the entrance exam, the first semester’s final exam, and the second semester’s final exam). The dynamic changes in students’ status and characteristics during the 13 exercises are shown in Figure 8.

In Figure 8, longer boxes indicate more dispersed data and shorter boxes indicate more concentrated data. In Figure 8(a), the overall changes in the total questions and the total correct rate over the 13 exercises are not significant, with the former remaining at roughly 0.4 and the latter at roughly 0.7. The average duration, by contrast, fluctuates somewhat, remaining roughly between 0.2 and 0.6. In Figure 8(b), there are large variations in the correct rates of the three types of questions: positioning questions, easy-to-learn and error-prone questions, and learning questions. Taken together, different students differ in their actual mastery of knowledge points and therefore in their actual need for practice on learning questions, so the study takes the total number of learning questions as the intervention. Figure 9 displays the students’ math test results on the three exams taken before and after the intervention.

Figure 8: Dynamic changes in student status and characteristics during 13 exercises. (a) The results of total number of questions, total accuracy, and average duration. (b) The results of positioning accuracy, easy to learn and error prone question accuracy, and learning question accuracy.

Figure 9: Mathematics scores of students in three exams before and after intervention. (a) Comparison between entrance exam and first semester final exam. (b) Comparison between entrance exam and second semester final exam. (c) Comparison of grade ranking in three math exams.

In Figure 9, the numbers 1 to 3 denote the entrance exam, the first-semester final exam, and the second-semester final exam, respectively. In Figure 9(a), the students’ overall exam scores are approximately normally distributed, and there is only a slight difference between the entrance exam and the first-semester final exam. Furthermore, there are more scores between 100 and 120, and the overall scores remain between 45 and 99. Figure 9(b) shows that by the second-semester final exam, students’ overall performance has improved markedly, with scores ranging between 60 and 129 and the pass rate increasing from 77 to 81%. In Figure 9(c), the grade rank for the second-semester final exam improves greatly, and the grade rank across the three exams rises sequentially. After the research method is used for precise exercise intervention over one academic year, the overall learning effect of the students in the class improves considerably, supporting the usefulness of the research method.

To further analyze the specific effects of the PTI model on students’ LSs and characteristics, the study set up four exercise strategies for learning-question practice: (g) overload in both semesters; (h) no overload in the first semester and overload in the second semester; (i) overload in the first semester and no overload in the second semester; and (j) no overload in either semester. Some students are randomly selected from the class to compare their learning images across the two semesters. The comparison results of the learning images for strategies g and h are shown in Figure 10.

Figure 10: Comparison results of learning images using strategies g and h. (a) Under strategy g, students learn images in the first semester. (b) Under strategy h, students learn images in the first semester. (c) Student's second semester learning images under strategy g. (d) Student's second semester learning images under strategy h.

In Figure 10, one student is randomly selected from each of strategies g and h. The horizontal labels D–I denote the total questions, total correct rate, average duration, correct rate of positioning questions, correct rate of easy-to-learn and error-prone questions, and correct rate of learning questions, and the vertical numbers denote the individual exercises. In Figure 10(a), the student scored 98 in math in the first semester and ranked only 1,004th in the grade. In Figure 10(b), the student had relatively poor correct rates on positioning questions and learning questions in the first semester and a grade rank of 403. In Figure 10(c), the student’s final math score in the second semester under strategy g rose to 102, with a grade rank of 842, indicating that overloaded practice on learning questions improved the student’s math performance. In Figure 10(d), after overloaded practice, the student’s correct rates on positioning questions and learning questions improve significantly, and the grade rank rises to 78. The comparison results of the learning images for strategies i and j are shown in Figure 11.

Figure 11: Comparison results of learning images for strategy i and j. (a) Image of the first student in strategy i for two semesters. (b) Image of the second student in strategy i for two semesters. (c) Two semester images of students under strategy j.

In Figure 11, two students are randomly selected under strategy i and one student under strategy j. In Figure 11(a), the first student maintains high correct rates on learning questions, positioning questions, and other items, with overloaded practice in the first semester. However, without overloaded practice in the second semester, the math scores are significantly lower, and the correct rate on learning questions drops from above 0.8 to between 0 and 0.2. In Figure 11(b), the second student has fair correct rates on each question type in the first semester. All correct rates increase in the second semester, during which the average practice duration increases significantly, suggesting that overloaded practice needs to be considered together with the average practice duration. This result is consistent with the optimal intervention decision function given by the theoretical model of the study. In Figure 11(c), the student has very high correct rates in homework practice in both semesters and a shorter average study time, so additional practice on the learning questions used to check for gaps is unnecessary. The student scores 129 on the math final exam in both semesters, yet improves in grade ranking by 50 places.

In conclusion, appropriate practice for these five randomly selected students can effectively increase accuracy and improve the quality of practice. Accurate practice interventions tailored to different students can therefore improve their academic performance, supporting the validity of the PTI model proposed in the study. To further verify the validity and generalization of the theoretical model, the study applies it to a linear algebra course at a teacher training university. A total of 280 students took part in the course, and the main data are obtained from students’ campus card records and smart classroom linear algebra classes. The campus card data span 75 days, with a total of 59,445 records, and the smart classroom experimental data contain early warning interventions and classroom grades. The study begins by analyzing the 12-week breakfast behaviors of three categories of students together with the 12-week library study rate of all students. Figure 12 displays the results.

Figure 12: The 12-week breakfast behavior of three types of students and the 12-week library learning rate of all students. (a) Frequency of breakfast behaviors among three types of students. (b) Entropy of breakfast behavior among three types of students for 1–12 weeks. (c) The library learning rate of all students for 12 weeks.

In Figure 12(a) and (b), the top performers overall eat breakfast more frequently and more regularly, and the differences among the three categories are pronounced. In Figure 12(c), library study varies little over the 12 weeks, with only a small number of students going to the library regularly and sustaining the habit over a longer period. Therefore, all three can be used as features in the subsequent analysis. In the experiment, the students are divided equally into two groups: the experimental group receives the early warning intervention, while the control group does not. The performance of the students in the two groups is shown in Figure 13.
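The regularity of breakfast behavior in Figure 12(b) is reported as an entropy. As a minimal sketch, assuming this is the Shannon entropy of the distribution of discretized breakfast times (the text does not give the exact definition), lower values correspond to more regular behavior. The function name, the time bins, and the toy records below are hypothetical.

```python
import math
from collections import Counter


def behavior_entropy(time_bins):
    """Shannon entropy (in bits) of the empirical distribution over time bins."""
    if not time_bins:
        raise ValueError("at least one observation is required")
    counts = Counter(time_bins)
    total = len(time_bins)
    # Sum p * log2(1 / p) over the observed bins.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Toy card-swipe records: one student always eats in the same half-hour slot,
# another spreads breakfast evenly over four slots.
regular_student = ["07:00", "07:00", "07:00", "07:00"]
irregular_student = ["07:00", "07:30", "08:00", "08:30"]
print(behavior_entropy(regular_student))
print(behavior_entropy(irregular_student))
```

With this definition, a student who always eats in the same slot has zero entropy, while a student whose breakfast times are spread evenly over four slots has an entropy of two bits.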

Figure 13: Comparison of results between two groups of students. (a) Student grades in the experimental group. (b) Student grades in the control group.

In Figure 13(a), the average score of the experimental group after the early warning intervention is 80, with a more pronounced increase among the poorer students and a smaller change among the good students, who did not need the early warning intervention. Without the early warning intervention, the control group’s average score in Figure 13(b) is 71. Overall, the experimental group outperforms the control group by 9 points, supporting the efficacy of the early warning intervention. To further validate the superiority of the proposed theoretical model in precise instructional intervention decision-making, the study compares the research model with three baselines: the advanced synthetic minority oversampling technique with random forest (K), decision tree (L), and extreme gradient boosting with students’ behavioral data (M). In addition to the class analyzed above, the comparison introduces another class of students for a comprehensive analysis, and the two classes are denoted Class 1 and Class 2. The results of the precise intervention are shown in Table 2.

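Table 2: Comparison of results of precision interventions in teaching with different methods

| Class | Method | Accuracy | Precision | Recall | F1-score |
|---|---|---|---|---|---|
| Class 1 | K | 0.674 | 0.673 | 0.673 | 0.677 |
| Class 1 | L | 0.785 | 0.784 | 0.776 | 0.712 |
| Class 1 | M | 0.841 | 0.812 | 0.796 | 0.741 |
| Class 1 | Research model | 0.915 | 0.901 | 0.894 | 0.854 |
| Class 2 | K | 0.663 | 0.622 | 0.602 | 0.612 |
| Class 2 | L | 0.701 | 0.700 | 0.694 | 0.677 |
| Class 2 | M | 0.812 | 0.810 | 0.800 | 0.745 |
| Class 2 | Research model | 0.964 | 0.915 | 0.905 | 0.900 |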

In Table 2, for Class 1, the accuracy, precision, recall, and F1 values of the research model for precise teaching intervention decisions are 91.5, 90.1, 89.4, and 85.4%, respectively, which are higher than those of the comparison models. For Class 2, the four values of the research model are 96.4, 91.5, 90.5, and 90.0%, which are also higher than those of the comparison models. Taken together, the research model has a high degree of accuracy in TID and can effectively optimize students’ LS.
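For reference, the four metrics reported in Table 2 can be derived from a confusion matrix. The sketch below assumes a binary framing (for example, "needs intervention" versus "does not"), which is an assumption made here for illustration; the function name and the counts are hypothetical and are not the study's data.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1 from binary confusion-matrix counts."""
    total = tp + fp + fn + tn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    accuracy = (tp + tn) / total
    # Guard against empty denominators when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy counts for illustration only.
print(classification_metrics(tp=8, fp=2, fn=8, tn=82))
```

Because F1 is a harmonic mean, it always lies between precision and recall, which makes it a useful consistency check when reading a results table.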

4.3 Complexity analysis of the proposed method

The core of the proposed QTR model lies in solving the quantile regression problem. For each quantile, the time complexity of QTR is mainly determined by the dimensions of the matrix and the number of samples. Assuming the matrix dimensions are m × n and the number of samples is k, the time complexity of QTR is O(k × m × n). The time complexity of the RL model mainly depends on the sizes of the state space S and the action space A, giving O(S × A). The overall complexity of the proposed method combines the two. In the worst-case scenario, the model’s time complexity is O(k × m × n × S × A). Consequently, with high-dimensional data and large state-action spaces, the computational cost of the model can be very high.

The QTR model requires storage for the matrix data for each sample as well as the regression coefficient matrix. Consequently, its space complexity is primarily determined by the dimensions of the matrix and the number of samples. In the worst-case scenario, the space complexity of the QTR model is O(k × m × n). The space complexity of the RL model is O(S × A). The method proposed in the study needs to store both the matrix data of the QTR model and the Q-table of the RL model. Therefore, in the worst-case scenario, the model’s space complexity is O(k × m × n + S × A).
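As a rough worked example of the space bound above, the sketch below counts the stored values: k sample matrices of size m × n, one m × n coefficient matrix, and a Q-table with one entry per state-action pair. All sizes are hypothetical placeholders chosen for illustration, and the 8-byte figure assumes double-precision floats.

```python
def storage_cells(k, m, n, num_states, num_actions):
    """Count stored values for the QTR data and the Q-table.

    Returns (qtr_cells, q_table_cells, total_cells).
    """
    # k sample matrices plus one regression coefficient matrix, each m x n.
    qtr_cells = k * m * n + m * n
    # One Q-value per (state, action) pair.
    q_table_cells = num_states * num_actions
    return qtr_cells, q_table_cells, qtr_cells + q_table_cells


# Hypothetical sizes: 280 students, 12 x 7 behavior matrices,
# 100 discretized states, and 4 candidate interventions.
qtr, q_table, total = storage_cells(k=280, m=12, n=7, num_states=100, num_actions=4)
print(qtr, q_table, total)
print(total * 8, "bytes at 8 bytes per value")
```

In this toy setting the QTR data dominate the Q-table, so dimensionality reduction of the sample matrices would be the more effective way to cut memory; with a much larger state-action space the balance would reverse.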

From the above analysis, the model has a large overhead in both time and space complexity. In future work, dimensionality reduction techniques will be considered to reduce the matrix dimensions of the QTR model, and approximate dynamic programming will be used to handle the large state-action spaces of the RL model. At the same time, incremental learning methods will be employed to gradually update the model parameters, thus reducing memory usage and computational time.

4.4 Selection of evaluation indicators and sensitivity analysis

Finally, the reasons for selecting the metrics are discussed and a sensitivity analysis is performed. Accuracy, precision, recall, and F1 score are selected as the main evaluation metrics. Accuracy measures the overall classification ability of the model, while precision and recall measure, respectively, how many predicted positives are correct and how many actual positives are identified. The F1 score is the harmonic mean of precision and recall, providing a comprehensive measure. In the sensitivity analysis, the weight of each index is adjusted to observe the change in model performance: the weight of one indicator is raised from 0.25 to 0.5, and the weights of the other indicators are reduced accordingly. The specific results are shown in Table 3.

Table 3: Model sensitivity analysis

| Metric weight adjustment | Accuracy | Precision | Recall | F1-score | Model performance change |
|---|---|---|---|---|---|
| Original weights | 0.915 | 0.901 | 0.894 | 0.854 | None |
| Accuracy = 0.5 | 0.908 | 0.895 | 0.888 | 0.847 | Slight decrease |
| Precision = 0.5 | 0.902 | 0.905 | 0.890 | 0.851 | Slight variation |
| Recall = 0.5 | 0.898 | 0.889 | 0.902 | 0.845 | Slight variation |
| F1-score = 0.5 | 0.903 | 0.897 | 0.891 | 0.894 | Slight variation |

In Table 3, when the weight of each index is adjusted from 0.25 to 0.5, the overall performance of the model (as measured by the F1 score) decreases slightly, with little overall change. This shows that the model is robust to increases in the weight of each index. The slight decline in accuracy and precision may be attributable to the model’s prediction tendency within specific categories. Conversely, the increase in recall indicates that the model can maintain a high recall rate while enhancing the recognition of a limited number of categories. This indicates that the proposed model has good stability and reliability, and can effectively capture the nuances of educational intervention outcomes.
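The reweighting rule used in the sensitivity analysis can be sketched as follows. This is an illustration under assumptions: the remaining weights are rescaled proportionally so that all weights still sum to 1, and a weighted composite score is used only to show the effect of the new weights. How the study feeds the weights back into the model is not specified, and the function names are hypothetical. The metric values are the "Original weights" row of Table 3.

```python
def reweight(weights, target, new_weight):
    """Set one metric's weight and rescale the others so the total stays 1."""
    if target not in weights:
        raise KeyError(target)
    if not 0.0 <= new_weight <= 1.0:
        raise ValueError("new_weight must lie in [0, 1]")
    others_total = sum(w for name, w in weights.items() if name != target)
    if others_total == 0:
        raise ValueError("the other weights must not all be zero")
    scale = (1.0 - new_weight) / others_total
    return {
        name: (new_weight if name == target else w * scale)
        for name, w in weights.items()
    }


def composite(metrics, weights):
    """Weighted sum of metric values (an assumed summary score)."""
    return sum(metrics[name] * weights[name] for name in weights)


base_weights = {"accuracy": 0.25, "precision": 0.25, "recall": 0.25, "f1": 0.25}
metrics = {"accuracy": 0.915, "precision": 0.901, "recall": 0.894, "f1": 0.854}

accuracy_heavy = reweight(base_weights, "accuracy", 0.5)
print(accuracy_heavy)
print(composite(metrics, base_weights), composite(metrics, accuracy_heavy))
```

Proportional rescaling keeps the relative importance of the untouched metrics unchanged, which isolates the effect of the single weight being varied.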

On this basis, the study selects three different educational datasets to test the performance of the model. The K-12 Mathematics Education Dataset consists of mathematics course performance data from a school district in California, including 1,200 students, with assessment standards including midterm exams, final exams, and regular homework scores. The K-12 Science Education Dataset is a set of data concerning the achievements of 1,500 students enrolled in a school district in Texas. The data comprise information regarding science courses, with an emphasis on laboratory reports, theoretical exams, and course participation. The Higher Education Computer Science Dataset contains course performance data from the computer science program at a Massachusetts university, including 800 students, with assessment standards that include programming assignments, project design, and final exams. The validation results of the three datasets are shown in Figure 14.

Figure 14: Verification results of the model in different datasets.

In Figure 14, the accuracy of the model proposed in the study is above 0.87 in all three datasets, and the F1 scores are all above 0.85. Among them, the results of the K-12 Science Education dataset are marginally lower than those of the Mathematics dataset, possibly due to the heterogeneity of science courses and the subjective nature of experimental operations, which increases the complexity of model predictions. In contrast, the model performs best on the Higher Education Computer Science dataset, which may be related to the self-learning ability of college students and the systematic design of the curriculum. Overall, the model proposed in the study shows good performance on different educational datasets, demonstrating the generalizability and robustness of the model.

5 Discussion

The study proposed the integration of RL and QTR to achieve precise instructional interventions. A substantial body of research demonstrated the efficacy of RL-based instructional intervention methods in enhancing student learning outcomes [5]. Wang et al. further validated the effectiveness of RL approaches in optimizing adaptive instructional strategies within online learning environments [26]. Building on this foundation, the present study incorporated quantile regression techniques and developed an instructional intervention decision framework based on the integration of RL and QTR [17]. Compared to traditional TR methods, the proposed approach exhibited superior stability and robustness in predicting learning outcomes across groups of students and in guiding personalized instructional interventions.

The results confirmed the effectiveness and superiority of the proposed method. However, it is important to acknowledge and discuss the limitations and challenges associated with this integration. Educational data typically have characteristics that change over time, as student performance and behavioral patterns evolve. If the model does not adequately account for these changes, it may introduce bias. Therefore, future work will explore adaptive models that can dynamically adapt to changes in data distribution to ensure that intervention decisions remain relevant and effective in a fluctuating educational environment. In the educational context, the criteria for success and optimal outcomes may also change as educational goals and policies evolve. Consequently, the model may need to be updated to reflect these changes, requiring an adaptive reward system.

In the previous text, the study conducted a comprehensive review of related work in the areas of RL and educational intervention strategies. A deeper exploration of prior research revealed that while RL was applied in educational settings, there was limited exploration of its combination with QTR to deal with non-stationary data and evolving reward structures. The contribution of the proposed method to educational intervention strategies lies in its ability to provide personalized and dynamic instructional interventions, consistent with the growing body of research advocating personalized learning and adaptive educational practices. By addressing the limitations of current TID technologies and incorporating advanced computational methods, the study contributes to the development of more effective and intelligent educational decision-making tools. On this basis, future research can focus on extending the theoretical model to a wider range of educational contexts and scenarios. This includes testing the model’s effectiveness across different levels of education, types of courses, and diverse student populations, as well as exploring the model’s potential to handle more complex educational data and achieve finer-grained, personalized interventions.

6 Conclusion

The study focused on optimizing personalized instructional interventions in large-scale educational environments by proposing a TID method based on the integration of RL and QTR models, validated by simulation experiments and real classroom data. The main contributions of this research are reflected in the following three aspects: (1) the construction of a PTI theoretical framework based on the integration of RL and QTR, which enhances the personalization and dynamic adaptability of TID; (2) the design of a learning effectiveness prediction method based on feature imaging and quantile modeling, which enables fine-grained modeling of the relationships between different student groups’ learning behaviors and academic outcomes; and (3) comprehensive validation across multiple scenarios and datasets, which demonstrates the superiority and generalizability of the proposed method in improving the accuracy of learning outcome predictions and the effectiveness of instructional interventions.

Based on the results, the following practical implications are suggested: (1) in large-scale educational applications, dynamic student learning features should be used with the RL-QTR model to achieve precise instructional interventions; (2) for student groups with significant heterogeneity in learning behavior, it is recommended to use quantile modeling approaches to develop differentiated learning paths; and (3) in designing intelligent learning platforms, emphasis should be placed on strengthening data collection and feature processing modules to support continuous optimization of intelligent decision models.

  1. Funding information: Author states no funding involved.

  2. Author contributions: Jifeng Gong: conceptualization, data curation, formal analysis, funding acquisition, investigation, methodology, project administration, resources, software, supervision, validation, visualization, and writing – review and editing. Author has accepted responsibility for the entire content of this manuscript and approved its submission.

  3. Conflict of interest: Author states no conflict of interest.

  4. Data availability statement: All data generated or analyzed during this study are included in this published article.

References

[1] Hebbi C, Mamatha H. Comprehensive dataset building and recognition of isolated handwritten Kannada characters using machine learning models. Artif Intell Appl. 2023;1(3):179–90. 10.47852/bonviewAIA3202624.Search in Google Scholar

[2] Fan W, Li Z, Zhang J. Construction and practice of digital transformation of education under the background of big data. Adv Ind Eng Manag. 2023;12(1):28–33.Search in Google Scholar

[3] Yu J, Couldry N. Education as a domain of natural data extraction: Analysing corporate discourse about educational tracking. Inf Commun Soc. 2022;25(1):127–44. 10.1080/1369118X.2020.1764604.Search in Google Scholar

[4] Qin M. Application of efficient recognition algorithm based on deep neural network in English teaching scene. Connect Sci. 2022;34(1):1913–28. 10.1080/09540091.2022.123456.Search in Google Scholar

[5] Shin J, Chen F, Lu C, Bulut O. Analyzing students’ performance in computerized formative assessments to optimize teachers’ test administration decisions using deep learning frameworks. J Comput High Educ. 2022;9(1):71–91. 10.1007/s40692-021-00196-7.Search in Google Scholar

[6] Abualadas HM, Xu L. Achievement of learning outcomes in non-traditional versus traditional anatomy teaching in medical schools: a mixed method systematic review. Clin Anat. 2023;36(1):50–76. 10.1002/ca.23942.Search in Google Scholar PubMed PubMed Central

[7] Kittur J, Bekki J, Brunhaver S. Development of a student engagement score for online undergraduate engineering courses using learning management system interaction data. Comput Appl Eng Educ. 2022;30(3):661–77. 10.1002/cae.22479.Search in Google Scholar

[8] Trakunphutthirak R, Lee VCS. Application of educational data mining approach for student academic performance prediction using progressive temporal data. J Educ Comput Res. 2022;60(3):742–76. 10.1177/07356331211048777.Search in Google Scholar

[9] Teng MF, Qin C, Wang C. Validation of metacognitive academic writing strategies and the predictive effects on academic writing performance in a foreign language context. Metacogn Learn. 2022;17(1):167–90. 10.1007/s11409-021-09278-4.Search in Google Scholar PubMed PubMed Central

[10] AdrianChin YK, JosephNg PS, Eaw HC, Loh YF, Shibghatullah AS. JomDataMining: Academic performance and learning behaviour dubious relationship. Int J Bus Inf Syst. 2022;41(4):548–68. 10.1504/IJBIS.2022.127555.Search in Google Scholar

[11] Lee M, Kim H, Wright E. The influx of International Baccalaureate programmes into local education systems in Hong Kong, Singapore, and South Korea. Educ Rev. 2022;74(1):131–50. 10.1080/00131911.2021.1891023.Search in Google Scholar

[12] Usher M, Hershkovitz A, Forkosh-Baruch A. From data to actions: instructors’ decision making based on learners’ data in online emergency remote teaching. Br J Educ Technol. 2021;52(4):1338–56. 10.1111/bjet.13108.Search in Google Scholar

[13] Carter M, Webster A, Stephenson J, Waddy N, Stevens R, Clements M, et al. Decision-making regarding adjustments for students with special educational needs in mainstream classrooms. Res Pap Educ. 2022;37(5):729–55. 10.1080/02671522.2020.1864768.Search in Google Scholar

[14] Yulianti K, Denessen E, Droop M, Veerman GJ. School efforts to promote parental involvement: The contributions of school leaders and teachers. Educ Stud. 2022;48(1):98–113. 10.1080/03055698.2020.1740978.Search in Google Scholar

[15] Gesel SA, LeJeune LM, Chow JC, Sinclair AC, Lemons CJ. A meta-analysis of the impact of professional development on teachers’ knowledge, skill, and self-efficacy in data-based decision-making. J Learn Disabil. 2021;54(4):269–83. 10.1177/0022219420970196.Search in Google Scholar PubMed

[16] Pesce C, Lakes KD, Stodden DF, Marchetti R. Fostering self‐control development with a designed intervention in physical education: A two‐year class‐randomized trial. Child Dev. 2021;92(3):937–58. 10.1111/cdev.13445.Search in Google Scholar PubMed

[17] Gion C, McIntosh K, Falcon S. Effects of a multifaceted classroom intervention on racial disproportionality. Sch Psychol Rev. 2022;51(1):67–83. 10.1080/2372966X.2020.1788906.Search in Google Scholar

[18] Jungjohann J, Gebhardt M. Dimensions of classroom-based assessments in inclusive education: A teachers’ questionnaire for instructional decision-making, educational assessments, identification of special educational needs, and progress monitoring. J Spec Educ. 2023;38(1):131–44. 10.52291/ijse.2023.38.12.Search in Google Scholar

[19] Kim DS, Kim T. Moment representations of fully degenerate Bernoulli and degenerate Euler polynomials. Russ J Math Phys. 2024;31(4):682–90. 10.1134/S1061920824040071.

[20] Lysytska O, Mykytiuk S, Chastnyk O, Mykytiuk S. Foreign language teaching modes and adaptive methods in emergency education: Evaluation of first-hand experience. Multisci J. 2025;7(2):2025069. 10.31893/multiscience.2025069.

[21] Haas C. Social origin and students’ trajectory patterns at German universities: a sequence-analytical approach. SozW Soz Welt. 2023;74(3):431–65. 10.5771/0038-6073-2023-3.

[22] Womack B, Shi J. Socio-economic status, educational debt, and career choices of social work students in the Southeast United States. Soc Work Educ. 2023;42(1):127–44. 10.1080/02615479.2022.2053098.

[23] Nunes L, Marcuzzi R, Chen X, Behley J, Stachniss C. SegContrast: 3D point cloud feature representation learning through self-supervised segment discrimination. IEEE Robot Autom Lett. 2022;7(2):2116–23. 10.1109/LRA.2022.3142440.

[24] Liu H, Liu T, Zhang Z, Sangaiah AK, Yang B, Li Y. ARHPE: asymmetric relation-aware representation learning for head pose estimation in industrial human–computer interaction. IEEE Trans Ind Inf. 2022;18(10):7107–17. 10.1109/TII.2022.3143605.

[25] Zamir SW, Arora A, Khan S, Hayat M, Khan FS, Yang MH, et al. Learning enriched features for fast image restoration and enhancement. IEEE Trans Pattern Anal Mach Intell. 2023;45(2):1934–48. 10.1109/TPAMI.2022.3167175.

[26] Carayannis EG, Campbell DF, Grigoroudis E. Helix trilogy: The triple, quadruple, and quintuple innovation helices from a theory, policy, and practice set of perspectives. J Knowl Econ. 2022;13(3):2272–301. 10.1007/s13132-021-00813-x.

[27] Ren Z, Wan J, Deng P. Machine-learning-driven digital twin for lifecycle management of complex equipment. IEEE Trans Emerg Top Comput. 2022;10(1):9–22. 10.1109/TETC.2022.3143346.

[28] Kovačević J, Mujkić A, Kapo A. Examining school leadership in a transitional context: A mixed-methods study of leadership practices and school cultures as mechanisms of educational change. Educ Manag Adm Leadersh. 2023;51(1):219–44. 10.1177/1741143220971286.

[29] Cunningham JE, Chow JC, Meeker KA, Taylor A, Hemmeter ML, Kaiser AR. A conceptual model for a blended intervention approach to support early language and social-emotional development in toddler classrooms. Infant Young Child. 2023;36(1):53–73. 10.1097/IYC.0000000000000232.

[30] Zhang S, Xia Y, Xia Y, Wang J. Matrix-form neural networks for complex-variable basis pursuit problem with application to sparse signal reconstruction. IEEE Trans Cybern. 2022;52(7):7049–59. 10.1109/TCYB.2020.3042519.

[31] Yuan J, Weng Y. Support matrix regression for learning power flow in distribution grid with unobservability. IEEE Trans Power Syst. 2022;37(2):1151–61. 10.1109/TPWRS.2021.3107551.

[32] Wang Z, Ma D, Gong G, Xue E. New construction of complementary sequence sets and complete complementary codes. IEEE Trans Inf Theory. 2021;67(7):4902–28. 10.1109/TIT.2021.3079124.

[33] Peng L, Tan XY, Xiao PW, Rizk Z, Liu XH. Oracle inequality for sparse trace regression models with exponential β-mixing errors. Adv Math Sci Eng. 2023;39(10):2031–53. 10.1007/s10114-023-2153-3.

[34] Zhu Z, Lin K, Jain AK, Zhou J. Transfer learning in deep reinforcement learning: A survey. IEEE Trans Pattern Anal Mach Intell. 2023;45(11):13344–62. 10.1109/TPAMI.2023.3292075.

[35] Zhang Z, Zhang D, Qiu RC. Deep reinforcement learning for power system applications: An overview. CSEE J Power Energy Syst. 2020;6(1):213–25. 10.17775/CSEEJPES.2019.00920.

[36] Haydari A, Yılmaz Y. Deep reinforcement learning for intelligent transportation systems: A survey. IEEE Trans Intell Transp Syst. 2022;23(1):11–32. 10.1109/TITS.2020.3008612.

Received: 2025-02-08
Revised: 2025-05-07
Accepted: 2025-05-12
Published Online: 2025-09-08

© 2025 the author(s), published by De Gruyter

This work is licensed under the Creative Commons Attribution 4.0 International License.

Downloaded on 3.10.2025 from https://www.degruyterbrill.com/document/doi/10.1515/nleng-2025-0155/html