Article Open Access

Reinforcement learning with Gaussian process regression using variational free energy

  • Kiseki Kameda and Fuyuhiko Tanaka
Published/Copyright: March 21, 2023

Abstract

The essential part of existing reinforcement learning algorithms that use Gaussian process regression involves a complicated online Gaussian process regression algorithm. Our study proposes online and mini-batch Gaussian process regression algorithms that are easier to implement and faster to estimate for reinforcement learning. In our algorithm, the Gaussian process regression updates the value function through only the computation of two equations, which we then use to construct reinforcement learning algorithms. Our numerical experiments show that the proposed algorithm works as well as those from previous studies.

MSC 2010: 68

1 Introduction

Reinforcement learning is learning how to behave in order to maximize reward. One of the most striking examples of the application of reinforcement learning is AlphaGo [1,2], which defeated a professional Go player. Various function approximations have been used in reinforcement learning algorithms, one of which is based on Gaussian process regression [3]. Gaussian process regression is a Bayesian nonparametric method often used as a standard nonlinear regression model, and it has desirable properties such as a low risk of overfitting and the ability to express estimation uncertainty.

Reinforcement learning algorithms are often based on estimating a value function [4]. In classical reinforcement learning, the value function is represented in table form, i.e., as a matrix indexed by pairs of states and actions. However, as the sets of states and actions become larger, estimating the value function becomes harder. This problem is addressed by representing the value function with a function approximation [4,5]. One of the function approximations used in reinforcement learning is Gaussian process regression, whose features are expected to be helpful: the principal advantage is the high degree of freedom of the model obtained through the kernel function, and, through Bayesian learning, estimation uncertainty is expressed naturally.

However, due to its high computational cost, Gaussian process regression has not been suitable for reinforcement learning. In addition, the existing online algorithms in Gaussian process regression have been very complicated to implement. Thus, we need to develop a simple algorithm for online Gaussian process regression that is easier to implement for reinforcement learning.

Prior research on model-free reinforcement learning using Gaussian process regression includes state-action-reward-state-action (SARSA)-based methods and Q-learning-based methods. There are several methods based on SARSA, such as GP-SARSA [6,7] and iGP-SARSA [8]. However, these methods have problems, such as high computational costs. The GPQ [9] is an algorithm based on Q-learning and uses the sparse online Gaussian processes method [10]. These algorithms are based on complex methods to reduce the computational complexity of Gaussian process regression.

In this article, to overcome the above shortcomings, we propose a mini-batch-learnable variational free energy (VFE) method and a reinforcement learning algorithm based on the VFE method and Q-learning. The VFE method approximates the posterior distribution by variational inference. The offline VFE method [11] is widely used to reduce the computational complexity of Gaussian process regression.

The VFE method is one of the methods based on inducing points and is expressed by a simple formula. The inducing points must be provided before estimation with this method. Choosing the inducing points may be difficult in some environments, for example, higher-dimensional ones, but once they are given, the estimation itself becomes easier. The computational complexity of our method is the same as that of the offline VFE method, namely $O(N)$, where $N$ is the number of data. Moreover, mini-batch learning is more efficient than online learning in reinforcement learning.

Our main contributions are as follows:

  1. We extend the VFE method to allow online and mini-batch learning.

  2. We propose a reinforcement learning algorithm using the mini-batch-learnable VFE method.

In our experiments, we confirm that the proposed method learns just as well as the existing methods. We compare the proposed reinforcement learning algorithm with the algorithms in previous studies by considering a two-dimensional grid world environment as an example. Numerical results show that the proposed algorithm works as well as the current GPQ but is simpler and faster.

In Section 2, we explain the basics of reinforcement learning and Gaussian process regression in some detail. We then propose our reinforcement learning algorithm using the online Gaussian process regression with a mini-batch-learnable VFE method in Section 3. Section 4 shows the numerical results of experiments in a two-dimensional grid world. Finally, concluding remarks are given in Section 5.

2 Background

2.1 Reinforcement learning

In reinforcement learning, the environment is often modeled as a Markov decision process (MDP) [12]. The MDP is represented by $\mathcal{M} = (\mathcal{S}, \mathcal{A}, R, p, \gamma)$. Here, $\mathcal{S}$ and $\mathcal{A}$ denote the sets of states and actions, $R : \mathcal{S} \to \mathbb{R}$ is the immediate reward function, $p : \mathcal{S} \times \mathcal{A} \times \mathcal{S} \to [0,1]$ is the transition probability, and $\gamma \in [0,1)$ is the discount rate. Let $\pi : \mathcal{S} \times \mathcal{A} \to [0,1]$ be the action rule of the agent, called the policy. We have to estimate the value function from the trajectory $\{s_0, a_0, r_0, s_1, a_1, r_1, \ldots\}$ generated by the MDP and the policy. We define the discounted reward sum as $R_t = r_t + \gamma r_{t+1} + \cdots$. In reinforcement learning, the goal is to find the action rule that maximizes the reward, i.e., to find $\pi^* = \arg\max_\pi E_\pi[R_0]$, where $E_\pi[\cdot]$ represents the expected value under the given policy $\pi$. To find $\pi^*$, we introduce the value function $Q^\pi(s,a) = E_\pi[R_0 \mid s_0 = s, a_0 = a]$. Since $\pi^*$ can be obtained from the optimal value function $Q^*(s,a) = \max_\pi Q^\pi(s,a)$, finding $\pi^*$ reduces to finding $Q^*$. Q-learning [13] is often used as a method to find $Q^*$. If the value function is in table form, we can update it with $Q_{t+1}(s_t, a_t) = Q_t(s_t, a_t) + \alpha_t [r_t + \gamma \max_a Q_t(s_{t+1}, a) - Q_t(s_t, a_t)]$. In the case of function approximation, the value of $Q(s_t, a_t)$ is updated to be closer to the value on the right-hand side.
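For concreteness, the tabular Q-learning update can be written in a few lines of Python. This is a minimal sketch; the number of states and actions, the learning rate, and the discount rate below are illustrative assumptions, not values from this article.

import numpy as np

# Minimal sketch of the tabular Q-learning update above. The number of
# states and actions, the learning rate alpha, and the discount rate gamma
# are illustrative assumptions, not values from this article.
n_states, n_actions = 25, 4
alpha, gamma = 0.5, 0.99
Q = np.zeros((n_states, n_actions))

def q_update(s, a, r, s_next):
    # Move Q(s_t, a_t) toward r_t + gamma * max_a Q(s_{t+1}, a).
    td_target = r + gamma * Q[s_next].max()
    Q[s, a] += alpha * (td_target - Q[s, a])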

2.2 Gaussian process regression

Gaussian process regression [3] is regarded as a Bayesian nonparametric regression method in which we estimate the function $y = f(x)$ from the input variable $x$ to the output variable $y$. Let $\mathcal{D} = \{X, Y\}$, $X = \{x_n \in \mathbb{R}^p\}_{n=1}^N$, and $Y = \{y_n \in \mathbb{R}\}_{n=1}^N$ be the data of $N$ pairs of input and output variables. Assume that the output variable is accompanied by normally distributed noise, so that the model is $y_i = f(x_i) + \varepsilon_i$ $(i = 1, \ldots, N)$ with $\varepsilon_i \overset{\mathrm{iid}}{\sim} N(0, \sigma^2)$. The goal is to construct the predictive distribution of the output $y_*$ for a new input $x_*$. Let $f \sim N(0, K_{NN})$ be the prior distribution, where the covariance matrix is defined as $[K_{NN}]_{i,j} = k(x_i, x_j)$ and $k(\cdot,\cdot)$ denotes a kernel function. In this article, we use the radial basis function (RBF) kernel $k(x_1, x_2) = \alpha^2 \exp[-\|x_1 - x_2\|^2 / \beta^2]$. The predictive distribution is obtained from the formula for the posterior distribution of the normal distribution. The predictive distribution of $y_*$ is

(1) $p(y_* \mid Y, x_*, X) = N(y_* \mid \mu_*, \Sigma_*), \quad \mu_* = k_N (K_{NN} + \sigma^2 I)^{-1} Y, \quad \Sigma_* = k(x_*, x_*) + \sigma^2 - k_N (K_{NN} + \sigma^2 I)^{-1} k_N^{\mathsf T},$

with $k_N = (k(x_*, x_1), \ldots, k(x_*, x_N))$. Calculating the predictive distribution requires only matrix operations, including the inversion of an $N \times N$ matrix, so the computational cost is $O(N^3)$. One way to reduce the computational cost is to use inducing points. The partially independent training conditional (PITC) [14], the fully independent training conditional (FITC) [15], and the VFE [11] are well-known methods that use inducing points. In methods using inducing points, the computational cost is $O(NM^2)$, where $M$ is the number of inducing points, and, in general, $N > M$.
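As a reference point for the approximations that follow, the exact predictive distribution (1) can be written as a short NumPy sketch. The kernel hyperparameters and noise variance below are illustrative assumptions, not values taken from the article.

import numpy as np

# Sketch of the exact Gaussian process predictive distribution (1), assuming
# an RBF kernel with hyperparameters alpha, beta and noise variance sigma2
# (illustrative values, not taken from the article).
alpha, beta, sigma2 = 1.0, 1.0, 0.1

def rbf(a, b):
    # k(x1, x2) = alpha^2 * exp(-||x1 - x2||^2 / beta^2); a is (n, p), b is (m, p).
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return alpha ** 2 * np.exp(-d2 / beta ** 2)

def gp_predict(X, Y, x_star):
    # Predictive mean and variance of y_* at a single test input x_star.
    K = rbf(X, X)                                    # K_NN
    k_n = rbf(x_star[None, :], X)                    # k_N, shape (1, N)
    A = np.linalg.inv(K + sigma2 * np.eye(len(X)))   # (K_NN + sigma^2 I)^{-1}, O(N^3)
    mean = (k_n @ A @ Y).item()
    var = (alpha ** 2 + sigma2 - k_n @ A @ k_n.T).item()
    return mean, var

The $O(N^3)$ matrix inversion in this sketch is exactly the cost that the inducing-point methods below avoid.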

We now introduce the VFE method and explain it in detail. Let $Z = \{z_j \in \mathbb{R}^p\}_{j=1}^M$ be the set of inducing points. The model for Gaussian process regression using inducing points is

(2) $p(Y \mid f) = N(Y \mid f, \sigma^2 I),$

(3) $p(f \mid u) = N(f \mid K_{MN}^{\mathsf T} K_{MM}^{-1} u,\; K_{NN} - K_{MN}^{\mathsf T} K_{MM}^{-1} K_{MN}),$

(4) $p(u) = N(u \mid 0, K_{MM}),$

with $u = (f(z_1), \ldots, f(z_M))$, $[K_{MM}]_{i,j} = k(z_i, z_j)$, and $[K_{MN}]_{i,j} = k(z_i, x_j)$. For details, see [11]. Then, the predictive distribution of $y_*$ is

(5) $p(y_* \mid Y) = N(\mu(x_*), k(x_*)),$

(6) $\mu(x_*) = k_M K_{MM}^{-1} \mu,$

(7) $k(x_*) = k(x_*, x_*) - k_M K_{MM}^{-1} (I - \Sigma K_{MM}^{-1}) k_M^{\mathsf T},$

(8) $\mu = \frac{1}{\sigma^2} K_{MM} \left(K_{MM} + \frac{1}{\sigma^2} K_{MN} K_{MN}^{\mathsf T}\right)^{-1} K_{MN} Y,$

(9) $\Sigma = K_{MM} \left(K_{MM} + \frac{1}{\sigma^2} K_{MN} K_{MN}^{\mathsf T}\right)^{-1} K_{MM},$

with $k_M = (k(x_*, z_1), \ldots, k(x_*, z_M))$. In the VFE method, we can estimate the predictive distribution by computing these equations. If we used all of the data at every update for mini-batch learning, the computational cost would be high. Therefore, in the next section, we extend the method so that it can be estimated online without changing the computational cost.
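As a minimal sketch of the offline VFE estimates, the following Python code computes $\mu$ and $\Sigma$ from (8)-(9) and the predictive mean (6). It reuses the rbf kernel and the assumed noise variance sigma2 from the previous sketch; it is an illustration, not the authors' implementation.

# Minimal sketch of the offline VFE estimates (8)-(9) and the predictive
# mean (6). Z is the (M, p) array of inducing points; Y is an (N, 1) column.
def vfe_fit(X, Y, Z):
    K_MM = rbf(Z, Z)
    K_MN = rbf(Z, X)
    A_inv = np.linalg.inv(K_MM + (1.0 / sigma2) * K_MN @ K_MN.T)
    mu = (1.0 / sigma2) * K_MM @ A_inv @ K_MN @ Y    # equation (8); (M, 1) if Y is (N, 1)
    Sigma = K_MM @ A_inv @ K_MM                      # equation (9), shape (M, M)
    return mu, Sigma

def vfe_predict_mean(x_star, Z, mu):
    k_m = rbf(x_star[None, :], Z)                    # k_M, shape (1, M)
    return (k_m @ np.linalg.solve(rbf(Z, Z), mu)).item()   # equation (6)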

2.3 Related work

Here, we introduce previous research on reinforcement learning using Gaussian process regression. Reinforcement learning algorithms can be divided into two categories: model-based and model-free. In model-based reinforcement learning algorithms, Gaussian process regression can be used not only for the value function but also for the environment model, as in [16-18]. Since the model of the environment is also estimated, the computational cost is higher than in model-free methods, which do not need to learn a model of the environment.

Next, we discuss model-free reinforcement learning algorithms. Reinforcement learning algorithms can be categorized into two types: on-policy learning and off-policy learning. SARSA [4] is a typical algorithm for on-policy learning. GP-SARSA [6,7] is an algorithm based on SARSA that learns a value function using a Gaussian process. iGP-SARSA [8] is a method that improves the exploration of GP-SARSA. On the other hand, Q-learning is a typical off-policy learning algorithm. GPQ [9] is a learning method based on Q-learning in which the value function is represented by a Gaussian process.

Next, we describe online learning for Gaussian process regression. GPQ uses the sparse online Gaussian processes method [10] to construct its algorithm. While the Gaussian process regression algorithms used by these methods are complex, this article proposes an algorithm that can be updated with only two formulas. The difference between these Gaussian process regression methods and the proposed method is the use of inducing points. Offline learning methods for Gaussian process regression with inducing points include VFE [11], FITC [15], and PITC [14]. However, reinforcement learning algorithms require online or mini-batch training. Online learning methods without a change in computational cost have been proposed for FITC and PITC [19].

In this article, we use the VFE method, which is widely used and can be computed with a simple formula. We also propose mini-batch learning, which has not been considered in previous studies. The relationship between previous studies and the proposed method is summarized in Table 1. We construct a Q-learning algorithm using our proposed mini-batch VFE. The difference from previous studies is the use of our mini-batch VFE; the overall structure is the same as that of Q-learning.

Table 1

The relationship between previous studies and the proposed method

         Online learning            Mini-batch learning
FITC     Bijl et al. (2015) [19]    —
PITC     Bijl et al. (2015) [19]    —
VFE      Proposed method            Proposed method

3 Reinforcement learning with Gaussian process regression using VFE

We extend the VFE formulas (5)-(9) so that they can be updated online, using the method of the previous study [19] as a guide. We first rewrite the covariance matrix for an online update, where $\Sigma_N$ for $N$ pairs of data is given as follows:

(10) $\Sigma_N = K_{MM}\left(K_{MM} + \frac{1}{\sigma^2} K_{MN} K_{MN}^{\mathsf T}\right)^{-1} K_{MM}.$

Let $x_+$ and $y_+$ be the new input and output data. We transform the covariance matrix $\Sigma_{N+1}$ for $N+1$ pairs of data as follows:

(11) $\Sigma_{N+1} = K_{MM}\left(K_{MM} + \frac{1}{\sigma^2} K_{M,N+1} K_{M,N+1}^{\mathsf T}\right)^{-1} K_{MM}$

(12) $\phantom{\Sigma_{N+1}} = K_{MM}\left(K_{MM} + \frac{1}{\sigma^2} K_{MN} K_{MN}^{\mathsf T} + \frac{1}{\sigma^2} K_{M+} K_{M+}^{\mathsf T}\right)^{-1} K_{MM},$

where $K_{M+} = (k(x_+, z_1), \ldots, k(x_+, z_M))^{\mathsf T}$ is the column vector of kernel values between the new input and the inducing points.

By using the formula for the inverse of the sum of two matrices (the Sherman-Morrison formula), we have

(13) $\Sigma_{N+1} = \Sigma_N - \frac{\Sigma_N P \Sigma_N}{1 + \operatorname{tr}(\Sigma_N P)},$

(14) $P = \frac{1}{\sigma^2} K_{MM}^{-1} K_{M+} K_{M+}^{\mathsf T} K_{MM}^{-1}.$

Using this formula, $\Sigma$ can be updated online. Then, $\mu$ can also be transformed as follows:

(15) $\mu_{N+1} = \frac{1}{\sigma^2} K_{MM}\left(K_{MM} + \frac{1}{\sigma^2} K_{M,N+1} K_{M,N+1}^{\mathsf T}\right)^{-1} K_{M,N+1} Y_{N+1}$

(16) $\phantom{\mu_{N+1}} = \Sigma_{N+1}\Sigma_N^{-1}\mu_N + \frac{1}{\sigma^2}\Sigma_{N+1} K_{MM}^{-1} K_{M+} y_+$

(17) $\phantom{\mu_{N+1}} = \left(I - \frac{\Sigma_N P}{1 + \operatorname{tr}(\Sigma_N P)}\right)\mu_N + \frac{1}{\sigma^2}\Sigma_{N+1} K_{MM}^{-1} K_{M+} y_+,$

where $Y_{N+1}$ denotes the vector of all $N+1$ outputs. From this equation, we can update $\mu_{N+1}$ from $\Sigma_N$, $\Sigma_{N+1}$, and $\mu_N$. This update method allows us to learn with computational cost $O(NM^2)$, the same as that of offline learning. Furthermore, this method requires less memory because it does not store the input data but only keeps $\Sigma_N$ and $\mu_N$.
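A minimal sketch of this online update is given below, assuming $\Sigma_N$, $\mu_N$ (stored as an M x 1 column vector), and the precomputed inverse of $K_{MM}$ are available from an initial offline fit; function and variable names are illustrative, not the authors' implementation.

import numpy as np

sigma2 = 0.1  # assumed noise variance, as in the earlier sketches

# Minimal sketch of the online update (13)-(17) for one new pair (x_+, y_+).
# Sigma and mu (an M x 1 column vector) come from an initial offline VFE fit,
# K_MM_inv is the precomputed inverse of K_MM, k_plus is the M x 1 vector
# K_{M+} = (k(x_+, z_1), ..., k(x_+, z_M)), and y_plus is a scalar.
def vfe_online_update(Sigma, mu, K_MM_inv, k_plus, y_plus):
    P = (1.0 / sigma2) * K_MM_inv @ k_plus @ k_plus.T @ K_MM_inv   # equation (14)
    denom = 1.0 + np.trace(Sigma @ P)
    Sigma_new = Sigma - Sigma @ P @ Sigma / denom                  # equation (13)
    mu_new = (np.eye(len(mu)) - Sigma @ P / denom) @ mu \
             + (1.0 / sigma2) * y_plus * Sigma_new @ K_MM_inv @ k_plus   # equation (17)
    return Sigma_new, mu_new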

Next, we extend this online VFE learning to allow mini-batch learning. In other words, we consider the case in which more than one data point arrives at each update. Let $\mathcal{D}$ be the set of all data up to the present, and let $\mathcal{D}_+$ be the set of data to be added in the update. The covariance matrix for the data set $\mathcal{D} \cup \mathcal{D}_+$ is denoted by $\Sigma_{N+}$ and is given as follows:

(18) $\Sigma_{N+} = K_{MM}\left(K_{MM} + \frac{1}{\sigma^2} K_{MN} K_{MN}^{\mathsf T} + \frac{1}{\sigma^2} K_{M+} K_{M+}^{\mathsf T}\right)^{-1} K_{MM},$

where $[K_{M+}]_{i,j} = k(z_i, x_j^+)$ with $z_i \in Z$ and $x_j^+ \in \mathcal{D}_+$. Since $K_{M+}$ is now a matrix rather than a vector, the inverse of the sum of matrices is represented in a different form (the Woodbury identity instead of the Sherman-Morrison formula):

(19) $\Sigma_{N+} = \Sigma_N - \frac{1}{\sigma^2}\Sigma_N K_{MM}^{-1} K_{M+} Q^{-1} K_{M+}^{\mathsf T} K_{MM}^{-1}\Sigma_N, \quad Q = I + \frac{1}{\sigma^2} K_{M+}^{\mathsf T} K_{MM}^{-1}\Sigma_N K_{MM}^{-1} K_{M+}.$

Using this update formula, $\Sigma_{N+}$ can be updated. We transform $\mu_{N+}$ in the same way as in online learning, which gives

(20) $\mu_{N+} = \left(I - \frac{1}{\sigma^2}\Sigma_N K_{MM}^{-1} K_{M+} Q^{-1} K_{M+}^{\mathsf T} K_{MM}^{-1}\right)\mu_N + \frac{1}{\sigma^2}\Sigma_{N+} K_{MM}^{-1} K_{M+} y_+,$

where $y_+$ is the vector of new outputs. By using (19) and (20), we can directly obtain $\Sigma_{N+}$ and $\mu_{N+}$ for the data set $\mathcal{D} \cup \mathcal{D}_+$ from $\Sigma_N$ and $\mu_N$ for the data set $\mathcal{D}$. Furthermore, the number of elements in $\mathcal{D}_+$ does not have to be fixed during the learning process. Both online learning and mini-batch learning have the same computational complexity as, and return the same estimates as, the offline VFE formulas (5)-(9).
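The mini-batch update (19)-(20) also takes only a few lines. The sketch below assumes K_plus is the M x B matrix of kernel values between the inducing points and the B new inputs; names and shapes are illustrative assumptions.

import numpy as np

sigma2 = 0.1  # assumed noise variance, as in the earlier sketches

# Minimal sketch of the mini-batch update (19)-(20). K_plus is the M x B
# matrix [K_{M+}]_{ij} = k(z_i, x_j^+), y_plus is the B x 1 vector of new
# targets, and Sigma, mu (M x 1), K_MM_inv carry over from the previous update.
def vfe_minibatch_update(Sigma, mu, K_MM_inv, K_plus, y_plus):
    B = K_plus.shape[1]
    V = Sigma @ K_MM_inv @ K_plus                                   # M x B helper term
    Q = np.eye(B) + (1.0 / sigma2) * K_plus.T @ K_MM_inv @ V        # Q in equation (19)
    Q_inv = np.linalg.inv(Q)
    Sigma_new = Sigma - (1.0 / sigma2) * V @ Q_inv @ V.T            # equation (19)
    mu_new = mu - (1.0 / sigma2) * V @ Q_inv @ (K_plus.T @ K_MM_inv @ mu) \
             + (1.0 / sigma2) * Sigma_new @ K_MM_inv @ K_plus @ y_plus   # equation (20)
    return Sigma_new, mu_new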

We now propose a Q-learning algorithm using this mini-batch-learnable VFE method. Learning a Gaussian process from the supervised data generated by Q-learning was already proposed in GPQ [9]; however, while GPQ is based on the sparse online Gaussian process regression algorithm [10], we use the update formulas derived above. The proposed algorithm is presented as Algorithm 1. Lines 2-6 of Algorithm 1 are the same as in Q-learning, and equations (19) and (20) of the proposed method are used to update the value function in lines 7-10.

Algorithm 1: Q-learning with mini-batch-learnable VFE
1: For the first data, use the offline VFE formulas (5)-(9).
2: for each time step t do
3:   Choose $a_t$ from $s_t$ using $\varepsilon$-greedy exploration
4:   Take action $a_t$, observe $r_t$, $s_{t+1}$
5:   $y_t = r_t + \gamma \max_a Q(s_{t+1}, a)$
6:   $x_t = (s_t, a_t)$
7:   Add $(y_t, x_t)$ to $\mathcal{D}_+$
8:   if $|\mathcal{D}_+| = N_{\mathrm{batch}}$ then
9:     Compute $\mu_{N+}$ and $\Sigma_{N+}$ according to (20) and (19)
10:    Reset $\mathcal{D}_+$
11:  end if
12: end for

To learn with the mini-batch-learnable VFE method, a set of inducing points must be given; they should be evenly distributed across the product set of states and actions. Kernel functions can be selected according to the environment. In our algorithm, the supervised data are generated in the same way as in standard Q-learning, and the generated data are then used to learn the value function by Gaussian process regression. The batch size should be chosen depending on the environment. Empirically, a large batch size is more efficient in a complex environment: the larger the batch size, the faster learning proceeds for a given amount of data. However, note that too large a batch size may slow the convergence of the proposed algorithm. The computational cost of Algorithm 1 is $O(NM^2)$. Since the proposed algorithm computes only two formulas in each update, we expect it to be faster than the GPQ method.
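As an illustration of how Algorithm 1 ties these pieces together, the following sketch assumes a small discrete environment with a hypothetical reset()/step() interface, an inducing-point grid Z over state-action pairs, and the rbf, vfe_predict_mean, and vfe_minibatch_update sketches given earlier; it is not the authors' implementation.

import numpy as np

# Hypothetical glue code for Algorithm 1: epsilon-greedy Q-learning in which
# Q(s, a) is the VFE predictive mean over the input x = (s, a). The env object,
# the action set, Z, gamma, eps, and N_batch are assumed to exist elsewhere.
def q_value(s, a, Z, mu):
    x = np.array(list(s) + [a], dtype=float)         # encode (state, action) as one input
    return vfe_predict_mean(x, Z, mu)

def run_episode(env, actions, Z, mu, Sigma, K_MM_inv, N_batch, gamma, eps):
    batch_x, batch_y = [], []
    s, done = env.reset(), False
    while not done:
        if np.random.rand() < eps:                    # epsilon-greedy action selection
            a = int(np.random.choice(actions))
        else:
            a = max(actions, key=lambda b: q_value(s, b, Z, mu))
        s_next, r, done = env.step(a)
        y = r + gamma * max(q_value(s_next, b, Z, mu) for b in actions)
        batch_x.append(list(s) + [a])                 # line 7: add (y_t, x_t) to D_+
        batch_y.append(y)
        if len(batch_x) == N_batch:                   # lines 8-10 of Algorithm 1
            K_plus = rbf(Z, np.array(batch_x, dtype=float))          # K_{M+}
            y_plus = np.array(batch_y).reshape(-1, 1)
            Sigma, mu = vfe_minibatch_update(Sigma, mu, K_MM_inv, K_plus, y_plus)
            batch_x, batch_y = [], []                 # reset D_+
        s = s_next
    return mu, Sigma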

4 Experiments

In this section, we present several experiments showing that the proposed algorithm performs as well as existing algorithms. We use a two-dimensional grid world [9] for our experiments. The state of this environment is represented by a $5 \times 5$ grid. The agent starts at $(1, 1)$ and can move in four directions: up, down, left, and right. The transition is noisy: with probability 0.1, the agent remains in its current state. The agent obtains a reward of 1 in state $(5, 5)$. This environment has been used in previous experiments [9]. We use the GPQ method and tabular Q-learning for comparison.

In this experiment, $\gamma$ is set to 0.99. For all methods, we use the $\varepsilon$-greedy algorithm with exploration rate $1/t^{0.3}$. The learning rate used in tabular Q-learning is set to $0.5/t^{0.1}$. We use the same RBF kernel $k(x_1, x_2) = \exp[-\|x_1 - x_2\|^2/2]$ and noise variance $\sigma^2 = 1$ for both Gaussian process-based methods. We set the GPQ parameter $\beta_{\mathrm{tol}}$ to 0.75 and the kernel budget to 25. The batch size $N_{\mathrm{batch}}$ is set to 16. The same estimation can be obtained with $N_{\mathrm{batch}} = 1$, i.e., with online learning, but the calculation time increases. The number of inducing points in the proposed method is 36, and the inducing points are arranged in a grid pattern in this experiment. These are the best parameter settings we found for each algorithm in our experiments.
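For reference, the schedules stated above can be written directly as functions of the time step (a small sketch of the stated settings; the function names are ours).

# Exploration rate 1/t^0.3 for the epsilon-greedy policy (all methods) and
# learning rate 0.5/t^0.1 for tabular Q-learning, as functions of t >= 1.
def exploration_rate(t):
    return 1.0 / t ** 0.3

def q_learning_rate(t):
    return 0.5 / t ** 0.1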

We perform ten independent runs for each method. The average number of steps required to reach the goal is shown in Figure 1. Each algorithm learns the optimal behavior. Figure 2 shows that the resulting value estimates are similar for the proposed method and GPQ. Numerically, there is no significant difference in the speed of convergence, and the algorithms perform similarly. The proposed algorithm computes only two equations without branches, and this experiment shows that, for the same data size, our method terminates more than twice as fast as the GPQ method. The computational cost of the online Gaussian process regression algorithm used by GPQ is also $O(N)$ in the number of data; since the computational cost of both methods is $O(N)$, the difference in execution time depends on the size of the hyperparameters and the number of formulas to be calculated.

Figure 1: Average number of steps required to reach the goal over ten independent trials.

Figure 2: The resulting value estimates of the proposed method (left) and of GPQ (right) after 100 episodes. The goal is located at the bottom right.

A limitation of the proposed method comes from the selection of inducing points, which must be given before learning. In a complex problem, the performance of our algorithm depends on the set of inducing points. Increasing the number of inducing points widens the range of states and actions that can be estimated, but it also lengthens the time required for learning. If the range of states and actions is not too large, this problem can be mitigated by rescaling the data.

5 Conclusion

In reinforcement learning, algorithms have been proposed in which the value function is represented by Gaussian process regression. Gaussian process regression is expected to be advantageous because of the high expressivity provided by kernel functions and by Bayesian learning. However, the algorithms proposed in previous studies use complex online Gaussian process regression methods. We proposed an online and mini-batch Gaussian process regression method based on VFE and inducing points that makes learning with Gaussian process regression easier. This Gaussian process regression method requires only the computation of two equations, from which we constructed a Q-learning algorithm. Our experiments show that the algorithm learns as well as those from previous studies.

The advantage of our algorithm is that it is easy to implement while expressing the value function in Gaussian process regression. In addition, our algorithm uses mini-batch learning, and experiments show that it can be estimated more efficiently than online learning.

Our proposed algorithm has the limitation that the inducing points must be given before learning. In environments where inducing points are difficult to specify, for example, when the state or action space is high-dimensional, our proposed algorithm becomes difficult to use.

Improving the proposed method by taking fuller advantage of Gaussian process regression is left for future work. Even though we are able to learn with Gaussian process regression, we have not yet exploited all of its properties. Exploration in reinforcement learning is important for gathering useful data to update the value function, and we believe that the ability to express uncertainty in the value function can be used for exploration. In addition, the choice of inducing points affects the estimation, so how inducing points should be selected in reinforcement learning algorithms will also be investigated in a future study. Finally, our experiments used a classical reinforcement learning environment from previous studies; experiments in other environments are a further subject for future research.

Acknowledgment

This work was supported by JST SPRING (Grant Number JPMJSP2138), and JSPS KAKENHI (Grant Number JP19K11860).

Conflict of interest: The authors declare that they have no conflict of interest.

References

[1] Silver D, Huang A, Maddison C, Guez A, Sifre L, Driessche G, et al. Mastering the game of Go with deep neural networks and tree search. Nature. 2016;529:484-9. doi:10.1038/nature16961.

[2] Silver D, Schrittwieser J, Simonyan K, Antonoglou I, Huang A, Guez A, et al. Mastering the game of Go without human knowledge. Nature. 2017;550:354-9. doi:10.1038/nature24270.

[3] Rasmussen CE, Williams CKI. Gaussian processes for machine learning. Cambridge, MA, USA: MIT Press; 2006. doi:10.7551/mitpress/3206.001.0001.

[4] Sutton RS, Barto AG. Reinforcement learning: an introduction. 2nd edition. Cambridge, MA, USA: MIT Press; 2018.

[5] Szepesvári C. Algorithms for reinforcement learning. Synthesis Lectures on Artificial Intelligence and Machine Learning. San Rafael, CA, USA: Morgan and Claypool Publishers; 2010. doi:10.1007/978-3-031-01551-9.

[6] Engel Y, Mannor S, Meir R. Bayes meets Bellman: the Gaussian process approach to temporal difference learning. In: International Conference on Machine Learning; 2003. p. 154-61.

[7] Engel Y, Mannor S, Meir R. Reinforcement learning with Gaussian processes. In: International Conference on Machine Learning; 2005. p. 201-8. doi:10.1145/1102351.1102377.

[8] Chung JJ, Lawrance RJN, Sukkarieh S. Gaussian processes for informative exploration in reinforcement learning. In: 2013 IEEE International Conference on Robotics and Automation; 2013. p. 2633-9. doi:10.1109/ICRA.2013.6630938.

[9] Chowdhary G, Liu M, Grande R, Walsh T, How J, Carin L. Off-policy reinforcement learning with Gaussian processes. IEEE/CAA Journal of Automatica Sinica. 2014;1(3):227-38. doi:10.1109/JAS.2014.7004680.

[10] Csató L, Opper M. Sparse online Gaussian processes. Neural Computation. 2002;14(3):641-68. doi:10.1162/089976602317250933.

[11] Titsias M. Variational learning of inducing variables in sparse Gaussian processes. In: Artificial Intelligence and Statistics; 2009. p. 567-74.

[12] Puterman ML. Markov decision processes: discrete stochastic dynamic programming. Hoboken, NJ, USA: John Wiley and Sons; 1994. doi:10.1002/9780470316887.

[13] Watkins CJCH, Dayan P. Q-learning. Machine Learning. 1992;8:279-92. doi:10.1023/A:1022676722315.

[14] Quinonero-Candela J, Rasmussen CE. A unifying view of sparse approximate Gaussian process regression. Journal of Machine Learning Research. 2005;6:1939-59.

[15] Snelson E, Ghahramani Z. Sparse Gaussian processes using pseudo-inputs. In: Advances in Neural Information Processing Systems. Cambridge, MA, USA: MIT Press; 2006. p. 1257-64.

[16] Rasmussen CE, Kuss M. Gaussian processes in reinforcement learning. In: Advances in Neural Information Processing Systems. Cambridge, MA, USA: MIT Press; 2003. p. 751-9.

[17] Deisenroth MP, Rasmussen CE. PILCO: a model-based and data-efficient approach to policy search. In: International Conference on Machine Learning; 2011. p. 465-72.

[18] Jung T, Stone P. Gaussian processes for sample efficient reinforcement learning with RMAX-like exploration. In: Proceedings of the European Conference on Machine Learning; 2010. p. 601-16. doi:10.1007/978-3-642-15880-3_44.

[19] Bijl H, Wingerden J, Schön TB, Verhaegen M. Online sparse Gaussian process regression using FITC and PITC approximations. IFAC-PapersOnLine. 2015;48(28):703-8. doi:10.1016/j.ifacol.2015.12.212.

Received: 2022-05-30
Revised: 2022-11-10
Accepted: 2022-12-14
Published Online: 2023-03-21

© 2023 the author(s), published by De Gruyter

This work is licensed under the Creative Commons Attribution 4.0 International License.
