Abstract
This tutorial review provides a comprehensive overview of machine learning (ML)-based model predictive control (MPC) methods, covering both theoretical and practical aspects. It presents a theoretical analysis of closed-loop stability based on the generalization error of ML models and addresses practical challenges such as data scarcity, data quality, the curse of dimensionality, model uncertainty, computational efficiency, and safety from both modeling and control perspectives. The application of these methods is demonstrated using a nonlinear chemical process example, with open-source code available on GitHub. The paper concludes with a discussion on future research directions in ML-based MPC.
1 Introduction
Model predictive control (MPC) is an advanced control technique that has seen extensive use in industrial applications since the 1980s. Unlike traditional control methods, MPC uses a dynamic model of the system of interest to predict future behavior and compute optimal control actions. Upon receiving information on the system’s current state, the MPC will generate a sequence of optimal control actions over a predefined time horizon, based on the predictions made by the dynamic model. The receding horizon feature of MPC implies that only the first control input in the optimal control action sequence is implemented, and a new sequence is calculated at the next time step with updated system information. As the MPC relies heavily on the predictions of the dynamic model for computation, a highly accurate model is vital for satisfactory control performance. Traditionally, these models are derived predominantly from fundamental theories. For example, the dynamic behaviors of chemical processes are theoretically described using mass and energy balance equations.
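To make the receding-horizon mechanism concrete, the following minimal Python sketch implements a generic MPC loop; the one-step model, stage cost, horizon length, and the use of scipy.optimize.minimize are illustrative placeholders rather than a specific formulation from the literature.

```python
import numpy as np
from scipy.optimize import minimize

# A minimal receding-horizon MPC loop (illustrative only).
def f(x, u):
    """One-step discrete-time model x_{k+1} = f(x_k, u_k); a hypothetical
    first-order system standing in for the process model."""
    return x + 0.1 * (-x + u)

def stage_cost(x, u, x_sp):
    return np.sum((x - x_sp) ** 2) + 0.01 * np.sum(u ** 2)

def mpc_step(x0, x_sp, N=10, u_bounds=(-1.0, 1.0)):
    """Solve the finite-horizon problem and return only the first input."""
    def objective(u_seq):
        x, cost = x0, 0.0
        for u in u_seq:
            cost += stage_cost(x, np.array([u]), x_sp)
            x = f(x, np.array([u]))
        return cost
    res = minimize(objective, np.zeros(N), bounds=[u_bounds] * N)
    return res.x[0]          # receding horizon: apply only the first action

# Closed-loop simulation: the problem is re-solved at every sampling time
# with the newly measured state.
x, x_sp = np.array([1.0]), np.array([0.0])
for k in range(50):
    u0 = mpc_step(x, x_sp)
    x = f(x, np.array([u0]))  # plant update (here, the same model as the predictor)
```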
However, the derivation of these first-principles models can often be tedious and costly, especially for complex nonlinear systems. Empirical methods that construct dynamical models from data offer an alternative to the theoretical approach. Although machine learning (ML) tools such as artificial neural networks (ANNs) have been applied to chemical systems over the past decades (Hoskins and Himmelblau 1988; Nascimento et al. 2000), the recent success of deep learning models has reignited interest in adopting ML models for MPC applications. Moreover, with increasing access to industrial data and enhanced computational capabilities, a paradigm shift from a theory-based modeling approach to a data-centric approach has been observed. In particular, recurrent neural networks (RNNs), a subbranch of deep learning models known for their ability to process time-series data, have shown promising results in modeling the dynamics of complex chemical systems for MPC applications (Limon et al. 2017; Su et al. 1992; Terzi et al. 2021; Wu et al. 2019d; You and Nikolaou 1993). In addition to simulation studies, ML-based MPC has been successfully applied to real-life systems such as a paper machine (Lanzetti et al. 2019), an experimental electrochemical reactor (Luo et al. 2023), a yeast fermentation bioreactor (Nagy 2007), and a continuous pharmaceutical manufacturing process (Wong et al. 2018). Despite these advancements, the implementation of ML-based MPC in industrial settings is still far from being realized. In a strengths, weaknesses, opportunities, and threats (SWOT) analysis of the use of ML tools in chemical engineering, Dobbelaere et al. (2021) identified the lack of interpretability of black-box ML models and the challenges in obtaining sufficient and reliable data as the main obstacles that prevent the application of ML models. In addition, controller robustness, operation safety, and system stability are some of the common issues that have been raised in the discussion of ML-based MPC (Bonassi et al. 2022; Brunke et al. 2022; Hewing et al. 2020; Nian et al. 2020; Schweidtmann et al. 2021). In this review article, we consolidate some of the common challenges faced in the industrial implementation of ML-based MPC and categorize them according to their theoretical and practical aspects.
The theoretical challenge of ML-based MPC lies in establishing mathematical guarantees of closed-loop stability (Berberich et al. 2020). Closed-loop stability is essential to ensure safe, efficient, and reliable operation of control systems. ML models are typically developed using a set of training data that is representative of the underlying data distribution. Even with sufficient training, some ML models may struggle to generalize to new, unseen data beyond the training set. This may result in poor controller performance and stability issues in ML-based MPC. Thus, understanding the generalization performance of ML models is a key challenge in guaranteeing closed-loop stability.
On the other hand, the practical challenges of implementing ML-based MPC in industrial settings are significantly more diverse, arising from the different stages in the development of ML-based MPC. The development of an ML-based MPC can be divided chronologically into three phases: data collection, modeling, and execution. Each phase presents a unique set of challenges that need to be addressed to ensure the feasibility of ML-based MPC. Data scarcity and data corruption are common issues that plague the data collection process in industrial settings. As ML models are highly dependent on the quantity and quality of the data used for training, the question of how to develop accurate ML models under such circumstances poses a major concern in the application of ML-based MPC (Thebelt et al. 2022). Additionally, practical challenges often arise when modeling large-scale systems in industries. The curse of dimensionality, a phenomenon in which an increase in data dimensions results in an exponential growth in data requirements and a reduction in the efficiency and effectiveness of ML algorithms, presents a significant challenge in modeling large-scale systems. Thus, how to effectively capture the dynamics of complex large-scale systems and bypass the curse of dimensionality is another key challenge in the development of ML-based MPC for industrial applications. The heavy computational burden and sluggish processing speed of ML-based MPC are well-acknowledged problems (Wu et al. 2019d). In addition, as noted by Mesbah et al. (2022), model uncertainty and process disturbances are unavoidable in controller implementation. Hence, how to speed up ML-based MPC calculations and how to improve the robustness of ML-based MPC are valid concerns that require attention during the execution phase. Overall, safety is an overarching concern that applies to all stages of the development of ML-based MPC (Brunke et al. 2022; Hewing et al. 2020). Safe data collection, safe modeling, and safe implementation are vital to ensure safe operation under ML-based MPC. Finally, the lack of interpretability of black-box ML models raises a general concern among the global community (Bonassi et al. 2022; Dobbelaere et al. 2021; Schweidtmann et al. 2021; Shang and You 2019). The lack of understanding of the decision process and internal workings of data-driven models can impede users’ trust towards these models, especially for control applications where safety is paramount. Thus, enhancing the transparency of ML models is a critical step towards gaining industrial approval of ML-based MPC.
Substantial reviews on the application of ML models to process systems engineering have been provided by Daoutidis et al. (2023), Everett (2021), Khan and Ammar Taqvi (2023), Lee et al. (2018), Mowbray et al. (2022), Pan et al. (2022), and Shang and You (2019). However, since the objective of these reviews was to provide an overview of the development of ML models and to analyze existing and potential applications of ML models in the industry, discussions on ML-based MPC were limited. On the other hand, reviews specific to ML-based MPC have focused on particular aspects of the topic (Abdullah and Christofides 2023a; Berberich and Allgöwer 2024; Bonassi et al. 2022; Brunke et al. 2022; Dev et al. 2021; Gonzalez et al. 2023; Lu et al. 2019; Mesbah et al. 2022; Nian et al. 2020; Norouzi et al. 2023; Ren et al. 2022; Tang and Daoutidis 2022). For instance, Mesbah et al. (2022) and Nian et al. (2020) examine the applications of a particular type of ML-based MPC, reinforcement learning (RL)-based MPC, while Brunke et al. (2022) explore its safety aspects. Ren et al. (2022) focus on a broader category of ML models, neural networks (NNs), and provide a tutorial review of their modeling approaches in MPC, but with limited mention of the challenges related to the implementation of ML-based MPC. Bonassi et al. (2022) consolidated recent efforts in the development of RNN-based MPC and discussed issues related to RNN-based MPC in terms of safe verification and interpretability of the RNN model, as well as stability and robustness of RNN-based MPC. Similarly, Berberich and Allgöwer (2024) focus on the closed-loop stability guarantees of ML-based MPC. However, to the best of the authors’ knowledge, a comprehensive overview of the challenges faced in the implementation of ML-based MPC has yet to be established. Hence, this review aims to complement the existing literature by providing an extensive review of the theoretical and practical challenges in ML-based MPC, specifically NN-based MPC, as well as a summary of the current efforts taken to address each of these challenges. The review also presents a dual perspective to approach some of the practical issues, namely from both modeling and control viewpoints.
This article is organized as follows: preliminary knowledge on the class of systems considered and an introduction to neural networks and ML-based MPC are provided in Section 2. In Section 3, the theoretical challenges of ML-based MPC and current advances in characterizing the generalization performance of ML models and analyzing the closed-loop stability of ML-based MPC are reviewed. In Section 4, practical challenges and potential solutions to resolve these issues are discussed. This includes topics such as data scarcity, data quality, the curse of dimensionality, model uncertainty, computational efficiency, and safety concerns of ML-based MPC. In Section 5, novel ML modeling and ML-MPC control methods mentioned in Section 4 are applied to a nonlinear chemical process to demonstrate their effectiveness. Finally, Section 6 concludes with an outlook on the future directions of ML-based MPC.
2 Preliminaries
2.1 Notation
The notation
2.2 Class of systems
The class of continuous-time nonlinear systems considered is described by the following system of first-order nonlinear ordinary differential equations (ODEs):
where
2.3 Supervised learning – neural networks
Machine learning can be divided into four learning types: supervised, unsupervised, semi-supervised, and reinforcement learning. In supervised learning, the dataset
2.3.1 FNN and RNN
The formulation of a one-hidden-layer FNN is provided below, with hidden states
where
where
RNNs are a type of neural network that uses sequential data or time-series data. While the hidden layer configuration varies among different NNs, the output layer configuration remains consistent for FNNs and RNNs. The key difference between an FNN and an RNN lies in the direction of information flow. A figure showing the network structures of an FNN and an RNN is presented in Figure 1.

Figure 1: A feedforward neural network (left) and a recurrent neural network (right).
From Figure 1, the flow of information in an FNN is observed to be unidirectional, where information is passed sequentially in the forward fashion through the input layer, hidden layer, and finally to the output layer. On the other hand, RNNs are designed for sequential data where the order of inputs is important. RNNs represent an improvement over FNNs in the sense that RNNs have connections that create loops within the networks. The recurrent structure of the RNNs allows them to retain information from previous time steps which facilitates the capturing of patterns in sequential data. Therefore, RNNs are widely utilized for modeling nonlinear dynamic processes, particularly in applications that require time series predictions where RNNs have demonstrated their efficiency to capture nonlinear behaviors over some time period. To illustrate the difference, we consider a one-hidden-layer RNN. The computation of RNN hidden states
where
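Since the governing equations are standard, a minimal NumPy sketch of the one-hidden-layer (Elman-type) RNN recursion is given below; the tanh activation and weight shapes are illustrative choices and are not meant to reproduce the exact configuration used in the cited works.

```python
import numpy as np

def rnn_forward(x_seq, W_h, W_x, W_y, b_h, b_y, h0=None):
    """Elman-type RNN: h_t = tanh(W_h h_{t-1} + W_x x_t + b_h), y_t = W_y h_t + b_y."""
    h = np.zeros(W_h.shape[0]) if h0 is None else h0
    outputs = []
    for x_t in x_seq:                              # iterate over the time dimension
        h = np.tanh(W_h @ h + W_x @ x_t + b_h)     # hidden state carries past information
        outputs.append(W_y @ h + b_y)              # output layer (same form as an FNN layer)
    return np.array(outputs), h
```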
The aforementioned configuration constitutes the standard and simplest RNN structure. Since the first introduction of RNNs in the 1980s, many variations of the conventional RNNs have emerged and demonstrated exceptional capabilities in processing time series data. Popular variants of RNNs include the long short-term memory (LSTM) network (Hochreiter and Schmidhuber 1997) and gated recurrent unit (GRU) (Cho et al. 2014). LSTMs and GRUs are gated variants of RNNs that modify the computation of RNN hidden states. A standard LSTM cell uses three gates, namely the forget, input, and output gates, to update its cell and hidden states. The update equations are listed below:
where
Compared to LSTMs, GRUs have a more simplified structure with only two gates, the update
with weight matrices

Figure 2: Network architecture of a simple recurrent neural network cell, a long short-term memory and a gated recurrent unit.
While RNNs, LSTMs, and GRUs are all types of neural networks used for processing sequential data, they have different structures and mechanisms to handle the vanishing gradient problem and maintain long-term dependencies. In general, RNNs are suitable for simpler tasks, LSTMs can be used for tasks that require long-term memory, and GRUs may achieve a balance of efficiency and performance. We will primarily focus on simple RNNs when addressing theoretical and practical challenges in the following sections. However, it should be noted that the methods discussed in this manuscript can also be readily applied to other types of RNNs, such as LSTMs and GRUs.
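For reference, the standard LSTM and GRU gate computations described above can be sketched as follows; bias conventions and gate orderings vary across implementations, so this should be read as a generic illustration rather than a particular library's definition.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_cell(x, h, c, W, U, b):
    """One LSTM step; W, U, b are dictionaries holding the parameters of the
    forget (f), input (i), output (o), and candidate (g) computations."""
    f = sigmoid(W['f'] @ x + U['f'] @ h + b['f'])   # forget gate
    i = sigmoid(W['i'] @ x + U['i'] @ h + b['i'])   # input gate
    o = sigmoid(W['o'] @ x + U['o'] @ h + b['o'])   # output gate
    g = np.tanh(W['g'] @ x + U['g'] @ h + b['g'])   # candidate cell state
    c_new = f * c + i * g                           # cell-state update
    h_new = o * np.tanh(c_new)                      # hidden-state update
    return h_new, c_new

def gru_cell(x, h, W, U, b):
    """One GRU step with update (z) and reset (r) gates."""
    z = sigmoid(W['z'] @ x + U['z'] @ h + b['z'])   # update gate
    r = sigmoid(W['r'] @ x + U['r'] @ h + b['r'])   # reset gate
    h_tilde = np.tanh(W['h'] @ x + U['h'] @ (r * h) + b['h'])
    return (1.0 - z) * h + z * h_tilde              # convex combination of old and new
```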
2.4 ML-based MPC
To simplify the notation for MPC using RNN models, we represent the RNN model in the following continuous-time form for the nominal system of Eq. (1) (i.e.,
where
A general tracking model predictive control design is given by the following optimization problem:
where the objective function seeks to minimize the integral of the cost function
Since the MPC formulation given by Eq. (8) is not always stabilizing, several approaches have been proposed in the literature to achieve closed-loop stability. One approach involves using infinite prediction horizons or carefully designed terminal penalty terms; for a comprehensive review of these methods, see Bitmead et al. (1990) and Mayne et al. (2000). Another approach is to enforce stability constraints directly within the MPC optimization problem (Chen and Allgöwer 1998; Mayne et al. 2000). MPC with terminal constraints extends standard MPC by incorporating additional constraints on the system’s state at the end of the prediction horizon to ensure closed-loop stability. Proper design of terminal constraints and terminal regions is crucial to ensure that the system reaches a desired or safe operating point within a finite time horizon.
Another important approach in stabilizing MPC is Lyapunov-based MPC (LMPC). LMPC provides an explicit characterization of the stability region and guarantees controller feasibility and closed-loop stability. In the context of predictive control for the system of Eq. (1), the LMPC is designed based on an existing explicit control law
where
In the optimization problem of Eq. (9), the objective function of Eq. (9a) is the integral of
While ML-based MPC offers promising opportunities for addressing complex control problems by using data-driven models, several challenges need to be addressed to ensure the effectiveness and reliability of ML-based MPC approaches. In this review paper, we will discuss the state-of-the-art technologies that tackle the emerging challenges in ML-MPC.
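As a schematic illustration of how a trained model enters the controller, the sketch below rolls out an assumed one-step RNN predictor inside a finite-horizon optimization and adds a simplified Lyapunov-type decrease condition in the spirit of the contraction constraint of the LMPC of Eq. (9); the function rnn_step, the quadratic Lyapunov matrix P, and the contraction factor kappa are illustrative assumptions rather than a formulation taken from the cited works.

```python
import numpy as np
from scipy.optimize import minimize

def V(x, P):
    """Quadratic Lyapunov function V(x) = x^T P x."""
    return x @ P @ x

def ml_mpc(x0, rnn_step, P, N=10, nu=1, u_lb=-1.0, u_ub=1.0, kappa=0.9):
    """One ML-MPC solve using an assumed one-step RNN predictor rnn_step(x, u)."""
    def rollout(u_flat):
        u_seq = u_flat.reshape(N, nu)
        traj, x = [], x0
        for u in u_seq:
            x = rnn_step(x, u)            # RNN surrogate replaces the ODE model
            traj.append(x)
        return np.array(traj)

    def objective(u_flat):
        traj = rollout(u_flat)
        u_seq = u_flat.reshape(N, nu)
        return np.sum(traj ** 2) + 0.01 * np.sum(u_seq ** 2)

    def v_decrease(u_flat):
        # Simplified Lyapunov-type condition: V must contract over the first
        # sampling period (a stand-in for the stability constraint of the LMPC).
        x1 = rollout(u_flat)[0]
        return kappa * V(x0, P) - V(x1, P)          # required to be >= 0

    cons = [{'type': 'ineq', 'fun': v_decrease}]
    res = minimize(objective, np.zeros(N * nu), method='SLSQP',
                   bounds=[(u_lb, u_ub)] * (N * nu), constraints=cons)
    return res.x[:nu]                                # apply only the first control action
```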
Remark 1.
In addition to the general set-point tracking MPC, machine learning models can also be used in zone-tracking MPC (Ferramosca et al. 2010; González and Odloak 2009). Unlike MPC that tracks the set-point, zone-tracking MPC allows the system to operate within a predefined region. It has been reported that a single NN model may not be sufficient to fully capture the system dynamics over the entire operating region consisting of multiple operating points, especially for complex nonlinear processes (Huang et al. 2023; Wu et al. 2019d). The recent work of Huang et al. (2023) divided the operating region into three overlapping sub-regions and trained separate LSTM models, which were combined in the first layer of a neural network. This two-layer NN, integrated into a zone-tracking MPC system, improved open-loop prediction and provided feedback control for irrigation.
3 Theoretical challenges in ML-MPC
The stability of MPC is a fundamental consideration in its application. ML models trained on historical data may struggle to generalize to new or unseen operating conditions, leading to poor performance and stability issues in MPC. Developing a better understanding and techniques for improving the generalization of ML models is a key challenge in ML-based MPC.
3.1 Generalization performance
Early work on ML-MPC often assumes that the model-plant mismatch is bounded, and therefore, that the closed-loop stability of MPC holds through the robust design of MPC. However, this assumption may not be true for practical ML models. The generalization error in machine learning refers to the expected error of a model on unseen data drawn from the same distribution as the training data. In other words, it measures how well a trained model performs on new, unseen data points. Generalization error is a critical concept in machine learning because the ultimate goal is to build models that can make accurate predictions on data that they have not seen during training. Initial research on the generalizability of ML models was based on the Vapnik–Chervonenkis (VC) dimension, a measure that characterizes the capacity and complexity of models (Sontag 1998b; Vapnik et al. 1994). However, due to the simplified assumptions underlying the VC dimension approach, the derived generalization error bounds can be overly conservative (Chen et al. 2019). Thus, alternatives such as the probably approximately correct (PAC)-Bayesian method (Neyshabur et al. 2017) and the empirical Rademacher complexity approach under the PAC learning framework (Bartlett et al. 2017) were introduced and adopted in recent years. Numerous studies have analyzed the generalizability of RNNs, predominantly focusing on their performance in classification tasks (Chen et al. 2019; Sontag 1998a; Wei and Ma 2019; Zhang et al. 2018). Recent works by Akpinar et al. (2019) and Wu et al. (2021b) have proposed PAC analysis frameworks to derive generalization error bounds for RNNs in regression problems. For demonstration purposes, this review will follow the empirical Rademacher complexity approach presented in Wu et al. (2021b) to derive the generalization error bound of RNNs. Interested readers may refer to works by Koiran and Sontag (1998) and Sontag (1998a) for the VC dimension method, Zhang et al. (2018) for the PAC-Bayesian approach, and Akpinar et al. (2019) for the traditional PAC framework in the derivation of the generalization error bound for RNNs.
3.1.1 Generalization error
Given a single-layer RNN model trained with
The loss function is denoted as
Assumption 1.
All inputs into the RNN model are bounded, that is, for all
Assumption 2.
The Frobenius norms of all the weight matrices are bounded in the following manner:
Assumption 3.
The nonlinear activation
Assumption 4.
Data samples used for training, validation, and testing purposes are drawn from the same distribution.
All the assumptions made adhere to common practice in machine learning theory. Specifically, the first two assumptions assume the boundedness of RNN inputs and weights, a condition typically satisfied in many modeling tasks where a finite class of neural network hypotheses is used to model nonlinear systems based on data collected from a finite set. The third assumption can be satisfied by activation functions such as ReLU and its variants. It is used for the derivation of generalization error in this section, and can be omitted when using other proof techniques, as demonstrated in Golowich et al. (2018). The last assumption is a fundamental and necessary condition for analyzing generalization performance, which is adopted in many machine learning works that consider the application of machine learning models to the same process without disturbances or model uncertainties. However, in the presence of disturbances that cause variations in process dynamics over time, the generalization error can still be derived for machine learning models by accounting for the drift in distribution. We will discuss this further when introducing online machine learning in Section 4.4.1.
The following text presents the essential definitions and lemmas widely used within the theoretical framework of machine learning.
Definition 1.
The expected loss/error or generalization error of a function
where the vector spaces of all possible inputs and outputs are denoted by
However, since the joint probability distribution
Definition 2.
The empirical error or risk of a dataset with
In order to ensure that the RNN model can capture the nonlinear dynamics of the system of Eq. (1) and generalize well to unseen operating conditions, it is essential to show that the generalization error
In this study, the loss function used is the mean squared error (MSE). While it can be readily demonstrated that the MSE loss function
where
Further analysis shows that the generalization error can be perceived as a combination of the approximation and the estimation error. The breakdown of the generalization error of a given neural network function
where
The approximation error can be thought of as how far the local optimal
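For completeness, the standard decomposition can be written in generic notation (the symbols below are ours and need not coincide with those of the omitted equations) as
\[
\mathcal{E}(\hat{f}) - \mathcal{E}(f^{*}) \;=\; \underbrace{\mathcal{E}(f_{\mathcal{H}}^{*}) - \mathcal{E}(f^{*})}_{\text{approximation error}} \;+\; \underbrace{\mathcal{E}(\hat{f}) - \mathcal{E}(f_{\mathcal{H}}^{*})}_{\text{estimation error}}
\]
where \(f^{*}\) denotes the optimal predictor over all measurable functions, \(f_{\mathcal{H}}^{*}\) the best function within the hypothesis class \(\mathcal{H}\), and \(\hat{f}\) the function obtained by minimizing the empirical risk over the training data.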
3.1.2 Upper bound for generalization error
Generalization error is a measure of how well a model generalizes from the training data to unseen data. While it is typically estimated using a separate validation dataset or through techniques such as cross-validation, deriving a theoretical understanding is also important as it can help improve the architecture and training of ML models to achieve the desired generalization performance. The computer science community has made great efforts to derive an upper bound for the generalization error of various neural network models. Specifically, the following lemma characterizes the upper bound of the generalization error using the Rademacher complexity
Definition 3.
(Empirical Rademacher Complexity) The empirical Rademacher complexity of a given hypothesis class
where
Lemma 1 .
(c.f. Theorem 3.3 in Mohri et al. (2018)) Consider a class of loss functions
It can be seen from Eq. (17) that the generalization error bound depends on the empirical error (the first term), Rademacher complexity (the second term), and an error function associated with confidence
Intuitively, the Rademacher complexity measures the maximum correlation between functions in the hypothesis class
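As a point of reference, the standard statement (cf. Theorem 3.3 in Mohri et al. (2018)) can be written in generic notation as follows: given an i.i.d. sample \(S = \{s_1, \ldots, s_m\}\) and a class \(\mathcal{G}\) of loss functions bounded in \([0, 1]\), with probability at least \(1 - \delta\), for all \(g \in \mathcal{G}\),
\[
\mathbb{E}[g(s)] \;\le\; \frac{1}{m}\sum_{i=1}^{m} g(s_i) \;+\; 2\,\hat{\mathfrak{R}}_S(\mathcal{G}) \;+\; 3\sqrt{\frac{\log(2/\delta)}{2m}},
\qquad
\hat{\mathfrak{R}}_S(\mathcal{G}) \;=\; \mathbb{E}_{\boldsymbol{\epsilon}}\!\left[\sup_{g \in \mathcal{G}} \frac{1}{m}\sum_{i=1}^{m} \epsilon_i\, g(s_i)\right],
\]
where the \(\epsilon_i\) are i.i.d. Rademacher random variables taking the values \(\pm 1\) with equal probability; the exact constants in Eq. (17) may differ depending on whether the empirical or the expected Rademacher complexity is used.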
Theorem 1 .
(c.f. Theorem 1 in Wu et al. (2021b)) Given an i.i.d training samples of size
where
Remark 2.
The generalization error bound of Eq. (18) implies that the following attempts can be taken to reduce the generalization error: (1) minimize the empirical loss
Note that for neural networks with different architectures (e.g., types of NNs, number of layers and neurons, activation functions, etc.), we will have different values for Rademacher complexity. For instance, Golowich et al. (2018) derived the Rademacher complexity upper bounds for a multi-layer FNN (see Eq. (19)).
The FNN has
3.2 Closed-loop stability
Closed-loop stability in MPC is important for ensuring the safety and reliability of plant operations, and some recent efforts have been made to investigate the stability of ML-based MPC (Limon et al. 2017; Meng et al. 2022; Soloperto et al. 2022; Wu et al. 2019c, 2021b). This section will explore and analyze the closed-loop stability of ML-based MPC based on the LMPC formulation of Eq. (9). Specifically, in the LMPC of Eq. (9), the constraint of Eq. (9e) ensures that the time derivative of the Lyapunov function
Additionally, the feasibility of LMPC is inherently guaranteed by the nonlinear control law
A key step for closed-loop stability under MPC is to ensure that the discrepancy between NN predictions and the actual state evolution is bounded. If we consider a deterministic error bound, that is, the error between NN predictions and the true evolution of states is bounded for all times, the ML-MPC of Eq. (9) guarantees closed-loop stability by designing the nonlinear control law
4 Practical challenges in applications of ML-MPC
In addition to the theoretical understanding of the generalization performance of ML models and the resulting closed-loop stability properties, there exist many other practical challenges for the implementation of ML-based MPC in real-world systems.
4.1 Data scarcity
The quantity and quality of the data used for model development are paramount for the performance and accuracy of ML models. In Section 3.1, we see that an increase in the number of training samples is helpful in reducing the generalization error of the model, thereby improving its accuracy. However, in practice, it can be difficult to gather a substantial amount of data samples to meet the requirements of developing an accurate ML model. This is especially true for complex systems with a large number of feature variables, where data collection can be costly and limited. Hence, this section presents an overview of some popular techniques to address data scarcity in machine learning.
4.1.1 Physics-informed machine learning
Physics-informed machine learning (PIML) is an emerging ML technique that seeks to improve the accuracy, robustness, interpretability, and physical consistency of ML models by integrating physics laws and domain knowledge into the learning process (Karniadakis et al. 2021). According to Karniadakis et al. (2021), physics can be incorporated into ML models in three ways: (1) through observational data that reflect the fundamental physics laws, (2) by implicitly integrating domain knowledge into ML models by customizing the model architecture, and (3) by explicitly embedding physics into the model algorithm, typically through additional loss functions and constraints. In particular, physics-informed neural networks (PINNs) are a prominent subclass of PIML that have gained traction in recent times due to their ability to learn effectively in small data regimes (Raissi et al. 2019; Zheng et al. 2023). Since their advent, PINNs have been extensively applied in engineering, ranging from the modeling of chemical/biochemical processes (Rogers et al. 2023; Subraveti et al. 2022) and reactors (Bangi et al. 2022; Patel et al. 2023; Wu et al. 2024) to process control (Antonelo et al. 2024; Arnold and King 2021; Wang and Wu 2024b; Zheng et al. 2023).
PINNs operate by integrating physical laws directly into the learning process, where physics, in the form of ordinary or partial differential equations (ODEs/PDEs), is embedded into the loss function. This can be illustrated using the formulation of a physics-informed RNN (PIRNN) model provided in Zheng et al. (2023). Given a PIRNN that uses the current state of a system
where
The loss function consists of two terms, where the subscripts
The computation of the ODE residual

Figure 3: Structure of the PIRNN model, which incorporates data-driven and physics-informed regularization terms into the loss function.
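A minimal sketch of such a physics-informed loss is given below; the model and the first-principles right-hand side F are assumed callables, and the ODE residual is approximated here by finite differences over the predicted trajectory, which is only one of several possible collocation strategies.

```python
import numpy as np

def pirnn_loss(model, x0, u_seq, t_grid, x_data, F, w_data=1.0, w_phys=1.0):
    """Physics-informed loss for an RNN trajectory predictor (illustrative).

    model(x0, u_seq) -> predicted states at the times in t_grid (assumed API).
    F(x, u)          -> right-hand side dx/dt of the (approximate) physics model.
    """
    x_pred = model(x0, u_seq)                        # shape (T, n_x)

    # Data-driven term: mismatch with the measured states.
    loss_data = np.mean((x_pred - x_data) ** 2)

    # Physics-informed term: ODE residual dx/dt - F(x, u) at collocation points,
    # with dx/dt approximated here by finite differences on the prediction.
    dt = np.diff(t_grid)[:, None]
    dxdt = (x_pred[1:] - x_pred[:-1]) / dt
    rhs = np.array([F(x, u) for x, u in zip(x_pred[:-1], u_seq)])
    loss_phys = np.mean((dxdt - rhs) ** 2)

    return w_data * loss_data + w_phys * loss_phys
```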
A forward problem is a problem that involves solving for the outputs of a system when the inputs and governing equations are known. On the other hand, an inverse problem seeks to infer unknown inputs or parameters from the observed outputs by working backward. While the discussion above introduced and explored the applications of PINNs to forward problems, PINNs have also been extensively applied to inverse problems. For example, PINNs demonstrated remarkable accuracy when used to derive velocity and pressure fields from fluid flow images, such as temperature gradient maps that depict the flow in an espresso cup (Cai et al. 2021; Raissi et al. 2020).
Following the study of a generalization error bound for purely data-driven RNN models, recent efforts have been made to analyze the generalization performance of PINNs. For example, Mishra and Molinaro (2022, 2023) performed generalization error analyses for a general class of PINNs approximating the solutions of forward and inverse problems for PDEs, respectively. Furthermore, in Zheng et al. (2023), the results of generalization errors were developed specifically for PIRNNs. As PINNs are developed from domain knowledge, such as the first-principles model, the accuracy of these theoretical models may affect the generalization performance of PINNs. The limitations of the theoretical models can be addressed by applying the PINNs in an inverse manner, using observed data. For example, Zheng and Wu (2023) proposed an inverse problem of PIRNN to improve the first-principles model used in the loss function of PIRNN using real-time data.
In addition, different types of domain knowledge can be incorporated into the design of PINNs. For example, for systems with physical constraints on states and/or inputs, additional loss terms can be included in the loss function to enforce the constraints on the relevant states/inputs. The following loss term is an example that imposes non-negative constraints on the states (Wu et al. 2023a):
where
In addition to incorporating physics-induced loss terms, domain knowledge such as the structure of a process network can be used to improve the design of the NN architecture to reflect the underlying physics (de Giuli et al. 2024; Wu et al. 2020). In many industrial chemical processes, operations in the upstream phase of production have a direct impact on those in the downstream phase, whereas the reverse influence is often negligible, unless recycling is involved. While theoretical models (if available) are able to capture the relationship between the upstream and downstream processes in their equations, this relational information is rarely utilized in data-driven models. In Wu et al. (2020), a partially-connected RNN structure that mirrors the process network of a system of two continuous stirred tank reactors (CSTRs) was proposed. Figure 4 shows how a standard RNN model, with a fully connected structure, can be decoupled into a partially-connected structure. Unlike the fully connected structure where all inputs affect all outputs, in the partially connected structure, the output of the first RNN layer

Figure 4: Structure of a partially-connected RNN for a two-CSTR system.
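One simple way to realize such a partially-connected structure is to multiply the weight matrices of a standard RNN layer by fixed binary masks, so that upstream units influence downstream predictions but not vice versa. The block sizes and masking pattern below are hypothetical and serve only to illustrate the idea, not the exact architecture of Wu et al. (2020).

```python
import numpy as np

# Hypothetical sizes: each CSTR contributes 2 states and 1 input; the hidden
# layer is split into two blocks of 4 neurons, one per reactor.
nx1, nx2, nh1, nh2 = 2, 2, 4, 4

# Input-to-hidden mask: hidden block 1 sees only CSTR-1 variables, while
# hidden block 2 sees variables from both units (upstream affects downstream).
mask_in = np.block([
    [np.ones((nh1, nx1 + 1)), np.zeros((nh1, nx2 + 1))],
    [np.ones((nh2, nx1 + 1)), np.ones((nh2, nx2 + 1))],
])

# Hidden-to-hidden mask: block 2 may depend on block 1, but not vice versa.
mask_hid = np.block([
    [np.ones((nh1, nh1)), np.zeros((nh1, nh2))],
    [np.ones((nh2, nh1)), np.ones((nh2, nh2))],
])

def masked_rnn_step(z, h, W_x, W_h, b):
    """One step of a partially-connected RNN: the masks zero out forbidden
    connections, turning a fully-connected layer into a block structure."""
    return np.tanh((W_x * mask_in) @ z + (W_h * mask_hid) @ h + b)
```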
4.1.2 Transfer learning
Another perspective to address data scarcity in data-centric approaches would be to reuse models already developed for similar tasks. This is the key concept of transfer learning (TL), where knowledge learned from a task (source) can be transferred to a related task (target) to boost performance or reduce the data requirement for the new task (Alhajeri et al. 2024; Thebelt et al. 2022). Pan and Yang (2009) provided a comprehensive overview of the types of TL tasks. In summary, TL can be classified into transductive, inductive, and unsupervised TL, based on the availability of label information from the source and target domains. If only the source domain labels are available, i.e., there are no labels from the target domain, then this is a transductive TL problem. On the other hand, if the target domain labels are available, then this is an inductive TL task. If both the source and target domain labels are unavailable, then this constitutes an unsupervised TL task. In this review, we will focus our discussion on the inductive TL problem, where labeled data from both the source and the target domains are available.
Consider an inductive TL task of transferring knowledge from a source task to a target task. The process first involves developing a pre-trained model (e.g., RNN model) on a large dataset from the source domain. Thereafter, the pre-trained RNN model is adapted to the target domain. Formally, we define
In modeling nonlinear processes, it is assumed that both the source and target processes can be represented in the form of Eq. (1), but with different process dynamics. For instance, in the case of a chemical reactor, the source process with similar configurations might involve the same reactor type but under varying operating conditions and reactions, different types of reactors performing the same reactions, or even different reactors under different conditions (see Figure 5). Therefore, the distributions of

Figure 5: Transfer knowledge from source processes to a target process, where the source processes from bottom to top represent various scenarios: a different reaction in different types of reactors, the same reaction in the same type of reactor under different operating conditions, and different reactions in the same reactor, respectively.
In general, TL approaches can be categorized into instance-based, feature-based, parameter-based, and relational-based approaches (Pan and Yang 2009; Zhuang et al. 2020). In the instance-based approach, each training sample from the source domain is assigned a weight in the calculation of the loss function. The weights will be optimized to minimize discrepancies between samples from the source and target domains (Huang et al. 2006). The feature-based method seeks to find representative features that describe both the source and target domains well. This involves strategies such as feature mapping, feature extraction, and feature encoding. The parameter transfer approach aims to transfer knowledge by sharing the model parameters of the source with the target domain. Finally, the relational-based approach focuses on the transfer of learned relationships between entities in the source domain to the target domain.
TL methods such as feature-based (Bi et al. 2020; Guo et al. 2020; Zhu et al. 2022) and parameter-based (Briceno-Mena et al. 2022; Lee et al. 2022; Xiao et al. 2023) approaches have been widely adopted in the field of engineering. Specifically, parameter sharing is a common TL strategy for NN models. In detail, during the fine-tuning process, a portion of the model’s parameters will be adjusted to adapt the pre-trained model to the new task while the remaining parameters are kept unchanged, preserving the knowledge gained during the initial pre-training. For example, in Xiao et al. (2023), a single hidden layer RNN model was first trained with labeled source data from distribution
Although the implementation of transfer learning methods is straightforward, a critical challenge is that transferred knowledge can sometimes harm the performance of the target task. This phenomenon is also known as negative transfer (Wang et al. 2019). According to Wang et al. (2019), factors that lead to negative transfer include the choice of algorithm, differences in source and target distributions, and the size of the labeled target dataset. Therefore, how to quantify the effectiveness of knowledge transfer from the source to the target domain and how to select a relevant source domain remain important questions. To this end, generalization error bounds for transfer learning methods have been developed to characterize the performance of TL models in target learning tasks (Ben-David et al. 2006, 2010; Blitzer et al. 2007; David et al. 2010; Mansour et al. 2009; Xiao et al. 2023). Specifically, Xiao et al. (2023) established a quantitative analysis of the generalization error of TL models for regression tasks and showed its dependence on a number of factors, including the discrepancy distance between source and target distributions, and the complexity of the networks.
As an extension to the classic single-source single-target TL problem, multi-source TL, where two or more sources are used for knowledge transfer, has been proposed to enhance the robustness of TL models (Mansour et al. 2008; Tian et al. 2022; Xiao et al. 2024; Yao and Doretto 2010). Furthermore, TL can be coupled with PIML to accelerate and improve prediction accuracy in the presence of data scarcity (Wu et al. 2023b; Xiao and Wu 2023).
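To illustrate the parameter-transfer strategy discussed above, the sketch below freezes an assumed pre-trained recurrent layer from the source process and re-fits only the output layer on the small target-domain dataset via ridge-regularized least squares; in gradient-based settings one would instead warm-start from the source weights and update only the unfrozen parameters.

```python
import numpy as np

def fine_tune_output_layer(rnn_hidden, X_target, Y_target, ridge=1e-3):
    """Parameter-transfer sketch: the pre-trained hidden/recurrent layer from the
    source process is kept frozen, and only the output layer is re-fit on the
    small target-domain dataset.

    rnn_hidden(x_seq) -> final hidden state produced by the source-domain model.
    """
    H = np.array([rnn_hidden(x_seq) for x_seq in X_target])      # (m, n_h)
    A = H.T @ H + ridge * np.eye(H.shape[1])                     # ridge regularization
    W_y_target = np.linalg.solve(A, H.T @ np.asarray(Y_target))  # least-squares fit
    return W_y_target
```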
4.1.3 Synthetic data generation
Data augmentation is a technique used to artificially increase the size and diversity of a dataset by applying various transformations or perturbations to existing data samples. This helps improve the generalization and robustness of machine learning models, especially when the original dataset is small or imbalanced. Generative AI methods have been used to enrich data sets for the design of new molecules/materials (Abbasi et al. 2022; Batra et al. 2020; Han et al. 2019; Kim et al. 2020; Nouira et al. 2018; Schilter et al. 2023). However, we observe that applications of generative AI to modeling of nonlinear dynamic systems in the chemical industry are still at an early stage of development. Many efforts in synthetic data generation focus on soft sensor development (Lee and Chen 2023; Xie et al. 2019; Zhu et al. 2021), fault diagnosis model development, and risk analysis model development (He et al. 2020; Qin and Zhao 2022) for process industries. For example, the sampling rates of quality variables and process variables are usually inconsistent in process industries due to the high cost of the acquisition of quality data. Additionally, missing data due to sensor failures may occur in process industries, leading to insufficient training data. Therefore, synthetic data generation is applied as an augmentation of incomplete and unlabeled process signal data. Various generative models have been proposed for data augmentation such as generative adversarial networks (GAN), variational autoencoders (VAE), normalizing flow (NF) models, Gaussian mixture models, hidden Markov models, latent Dirichlet allocation, and Boltzmann machines (Bond-Taylor et al. 2021; Harshvardhan et al. 2020). Specifically, GAN, VAE, and NF have been applied in some recent works in chemical engineering for synthetic data generation, e.g., He et al. (2020), Lee and Chen (2023), Qin and Zhao (2022), Xie et al. (2019), Zhang et al. (2021b, 2024), and Zhu et al. (2021).
GANs are a class of generative models inspired by the concept of Nash equilibrium in game theory (Goodfellow et al. 2014). A typical GAN consists of two networks: a generator
VAE is another class of generative models that combines Bayesian inference with deep networks (Kingma and Welling 2013). Typically structured like an autoencoder, a VAE consists of an encoder and a decoder. Given real data
Remark 3.
In addition to GANs and VAEs, traditional methods, such as bootstrapping in statistics and interpolation techniques (e.g., linear or quadratic interpolation between nearby points), are also commonly used for data augmentation. Bootstrapping generates new datasets by resampling with replacement from the original data, while interpolation creates synthetic data points by estimating values between existing data. These approaches, though simpler than modern generative models, can be effective for certain applications where data patterns are relatively well understood and straightforward.
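A minimal sketch of the two traditional techniques mentioned in Remark 3 is given below; the data matrix and sample counts are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def bootstrap(X, n_samples):
    """Resample rows of X with replacement (classical bootstrap)."""
    idx = rng.integers(0, len(X), size=n_samples)
    return X[idx]

def interpolate_pairs(X, n_samples):
    """Create synthetic points by linear interpolation between random pairs."""
    i, j = rng.integers(0, len(X), size=(2, n_samples))
    lam = rng.uniform(0.0, 1.0, size=(n_samples, 1))
    return lam * X[i] + (1.0 - lam) * X[j]

# Usage: augment a small training matrix X (rows = samples, columns = features).
X = rng.normal(size=(20, 3))
X_aug = np.vstack([X, bootstrap(X, 50), interpolate_pairs(X, 50)])
```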
4.1.4 Active learning
Active learning is another approach that intelligently selects data points that are associated with high uncertainty or low-confidence predictions by the current model for labeling and inclusion in training. Unlike synthetic data generation, the goal of active learning is to choose as few labeled samples as possible to minimize the cost of obtaining real data. Active learning methods can generally be classified into three categories: pool-based sampling, stream-based selective sampling, and membership query synthesis (Settles 2009). In pool-based sampling, a large set of unlabeled data is available, from which samples are drawn iteratively at no cost. On the other hand, stream-based selective sampling involves drawing unlabeled samples one at a time. In membership query synthesis, the active learner generates synthetic samples and requests labels for them. In Zhao et al. (2022b), pool-based active learning is used to enrich the training set for modeling a nonlinear chemical process by iteratively identifying the training data that improve model performance most efficiently. Additionally, active learning has the potential to be integrated with machine-learning-aided optimal experiment design strategies aimed at minimizing the time and costs associated with experiments in chemical engineering problems, such as identifying the proper kinetic model structure (Sangoi et al. 2022, 2024).
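A schematic of pool-based sampling is sketched below, using the disagreement of an ensemble of models as a simple stand-in for the uncertainty measure; the model interface is an assumption made for illustration.

```python
import numpy as np

def select_queries(models, X_pool, n_queries=5):
    """Pool-based active learning sketch: rank unlabeled candidates by the
    disagreement (variance) of an ensemble of models and query the top ones."""
    preds = np.stack([m(X_pool) for m in models])     # (n_models, n_pool, n_y)
    uncertainty = preds.var(axis=0).mean(axis=-1)     # per-sample disagreement
    return np.argsort(uncertainty)[-n_queries:]       # indices to label next

# Typical loop (schematic): label the selected samples (e.g., by running the
# experiment or simulation), add them to the training set, retrain, and repeat.
```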
4.2 Data quality
Another common issue faced in the development and implementation of ML models for real-world applications is the quality of the data. The presence of noise in sensor data is almost inevitable due to factors such as sensor limitation, environmental conditions, measurement error, etc. Since noisy data can impede the learning process of ML models, this section will explore popular and innovative solutions to handle and mitigate the issue of noise-corrupted data.
4.2.1 Conventional approaches
The dropout method is a popular regularization technique used in the training of neural networks to prevent overfitting (Abdullah et al. 2022a; Hinton et al. 2012; Srivastava et al. 2014). Overfitting occurs when a model learns the training data too well, including its noise and outliers, which results in poor generalization to new, unseen data. Introduced by Hinton et al. (2012), dropout involves randomly “dropping out” a fraction of the neurons in the network during each training iteration. This means that for each forward and backward pass, certain neurons are temporarily removed from the network, along with all their incoming and outgoing connections. The neurons to be dropped out are chosen at random with a probability
Furthermore, Monte Carlo (MC) dropout, a technique used to estimate uncertainty in deep neural networks, can be used to develop stochastic neural networks that characterize the uncertainties of prediction (Gal and Ghahramani 2016a,b). In contrast to the standard dropout, which is only applied during the training phase, the MC dropout applies dropout during both the training and inference phases. Hence, predictions made by NN models using the MC dropout are not deterministic. While the standard dropout helps mitigate the impact of data overfitting by learning a deterministic model, the MC dropout learns a stochastic model that can quantify the system’s uncertainty. This ability to estimate uncertainty is particularly valuable for improving controller design under uncertainty. Therefore, MC dropout has been adopted in many chemical process modeling works when training datasets (e.g., sensor measurements) are corrupted with noise and there is a need to estimate prediction uncertainty (Alhajeri et al. 2022; Wu et al. 2021a).
Specifically, the MC dropout method aims to find the posterior distribution of the model weights
where
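At inference time, MC dropout amounts to repeated stochastic forward passes, as sketched below; the model is assumed to expose a flag that keeps dropout active (analogous to the training=True convention in Keras), and the number of passes is illustrative.

```python
import numpy as np

def mc_dropout_predict(model, x, n_passes=50):
    """Monte Carlo dropout: repeated stochastic forward passes with dropout kept
    active, summarized by the predictive mean and standard deviation."""
    samples = np.stack([model(x, training=True) for _ in range(n_passes)])
    return samples.mean(axis=0), samples.std(axis=0)
```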
4.2.2 Co-teaching method
Co-teaching is an innovative way to address noise in labeled data. The idea behind co-teaching stems from the observation that deep learning models tend to fit simple patterns at the early stage of the training process and progressively learn more complex nuances as training continues (Han et al. 2018). Based on the belief that the training loss is related to the level of noise in a data sample, i.e., noise-free or ‘clean’ data samples are more likely to have small training losses and vice versa, co-teaching is designed to train two NNs simultaneously. A schematic of the co-teaching method with two NNs, A and B, is shown in Figure 6. In every mini-batch training iteration, each model identifies and collects a small portion of the data samples with small training losses. Subsequently, the models exchange the identified ‘clean’ samples and update their weights based on the samples received from their peer. The process is repeated until all training epochs have been completed.

Figure 6: The symmetric co-teaching framework that trains two networks (A and B) simultaneously.
Although the co-teaching method was initially proposed for classification problems with noisy labels, it has been successfully adapted for regression problems, such as the modeling of nonlinear processes in the presence of noise (Abdullah et al. 2022b; Wu et al. 2021c). In addition to the standard co-teaching algorithm highlighted in this section, variants such as asymmetric co-teaching (Yang et al. 2020), stochastic co-teaching (de Vos et al. 2023; Robinet et al. 2022), and co-teaching+ (Yu et al. 2019) have been proposed to improve model accuracy. In essence, by leveraging the peer network’s perspective, co-teaching effectively reduces the influence of noisy data and improves the overall generalization capability of the NN model.
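The selection-and-exchange loop described above can be sketched as follows; the per-sample loss and update interfaces (losses, train_on) and the keep-ratio schedule are assumptions made for illustration.

```python
import numpy as np

def co_teaching_epoch(net_a, net_b, batches, keep_ratio):
    """One epoch of symmetric co-teaching (schematic).

    Each network ranks the mini-batch by its own per-sample loss, keeps the
    small-loss fraction as 'clean', and passes those samples to its peer.
    net.losses(X, Y) -> per-sample losses; net.train_on(X, Y) -> one update step.
    """
    for X, Y in batches:
        n_keep = max(1, int(keep_ratio * len(X)))
        clean_for_b = np.argsort(net_a.losses(X, Y))[:n_keep]   # A selects for B
        clean_for_a = np.argsort(net_b.losses(X, Y))[:n_keep]   # B selects for A
        net_a.train_on(X[clean_for_a], Y[clean_for_a])
        net_b.train_on(X[clean_for_b], Y[clean_for_b])
```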
4.2.3 Lipschitz-constrained NN
To reduce the effect of data noise on the generalization performance of the model, another approach is to design an inherently robust NN. In particular, Lipschitz-constrained NNs have demonstrated robustness and trustworthiness, especially in handling adversarial attacks. Since many real-world systems (e.g., chemical processes) are Lipschitz continuous, developing a Lipschitz-constrained NN provides a promising solution for addressing data noise in the training set. In mathematical terms, given a function
Here, we give an example of using the SpectralDense layer proposed by Serrurier et al. (2021). Specifically, SpectralDense layers are dense layers such that (1) the activation function
where Eq. (23a) applies when
Definition 4.
Let
where
Since every SpectralDense layer
The SpectralDense LCNN approach was adopted in Tan and Wu (2024) and Tan et al. (2024b) to handle noisy data when modeling chemical processes. The proposed LCNN demonstrated higher accuracy and generalization performance compared to the conventional Dense NN trained on the same set of noisy data. This highlights the effectiveness of enforcing Lipschitz continuity in NN designs in handling noisy data.
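A generic way to obtain a 1-Lipschitz dense layer is to normalize the weight matrix by its spectral norm, estimated by power iteration, and to pair it with a 1-Lipschitz activation; the sketch below illustrates this idea and is not the exact SpectralDense implementation of Serrurier et al. (2021).

```python
import numpy as np

def spectral_normalize(W, n_iters=20):
    """Scale W by its largest singular value (estimated via power iteration) so
    that the resulting linear map is 1-Lipschitz in the 2-norm."""
    v = np.random.default_rng(0).normal(size=W.shape[1])
    for _ in range(n_iters):
        u = W @ v
        u /= np.linalg.norm(u)
        v = W.T @ u
        v /= np.linalg.norm(v)
    sigma = u @ W @ v                 # dominant singular value
    return W / sigma

def lipschitz_dense(x, W, b):
    """A 1-Lipschitz dense layer: spectrally normalized weights followed by a
    1-Lipschitz activation (ReLU)."""
    return np.maximum(0.0, spectral_normalize(W) @ x + b)
```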
4.3 Curse of dimensionality: ML-MPC of large-scale systems
The curse of dimensionality is a practical challenge in the modeling of large-scale systems. It refers to the phenomena that arise when working with high-dimensional systems: as the number of dimensions (features or variables) increases, the amount of data required grows exponentially, leading to more complex network structures, longer training times, and poorer model performance. This issue can be addressed from both the modeling and control perspectives.
4.3.1 Model perspective: reduced-order modeling
Reduced-order modeling (ROM) is a powerful technique to address the curse of dimensionality in high-dimensional systems by reducing the complexity of the system while preserving its essential behavior. In the context of machine learning and data analysis, reduced-order modeling aims to capture the most significant features or dynamics of the data while reducing the dimensionality of the problem. Common dimensionality reduction techniques include principal component analysis (PCA), which identifies a linear transformation that maps data from a higher-dimensional space to a lower-dimensional space with minimal information loss by minimizing the sum of squared orthogonal distances between the measured data points and the fitted lower-dimensional subspace (e.g., Hassanpour et al. 2020 integrate PCA with neural networks), and, more recently, autoencoders. Dimensionality reduction is particularly advantageous in process systems engineering, where time-scale separation is a common phenomenon in unit models such as distillation columns and catalytic reactors (Chang and Aluko 1984), which can justify the use of reduced-order models. Specifically, if such a time-scale separation is not factored into the design of a standard nonlinear feedback controller, the controller may become ill-conditioned due to the stiff ordinary differential equations that arise, resulting in performance deterioration and possibly even unstable closed-loop dynamics (Kokotović et al. 1999).
Reduced-order modeling for two-time-scale systems using sparse identification of nonlinear dynamics (SINDy) was studied in Abdullah et al. (2021a,b). In Abdullah et al. (2021a), the mathematical framework of singular perturbations was utilized to decompose the original two-time-scale system into two lower-order subsystems, each separately modeling the slow and fast dynamics of the original multiscale system. Specifically, after a brief transient period, the fast states converge to a slow manifold and can be algebraically related to the slow states using nonlinear functional representations. To capture the nonlinear relationship between the slow and fast states, nonlinear principal component analysis (NLPCA), developed by Dong and McAvoy (1996), was applied in Abdullah et al. (2021a), following which SINDy was used to derive well-conditioned, reduced-order ODE models for the slow states. The reduced-order SINDy models, owing to their numerical stability, allowed for integration with much larger time steps. Once the slow states were predicted with the SINDy ODE model, NLPCA was used to algebraically predict the fast states without any integration. NLPCA is a nonlinear extension of the aforementioned linear dimensionality reduction technique, PCA, and is fundamentally an autoencoder with nonlinear activation functions. The use of a feedforward neural network renders NLPCA a static model, at the benefit of reduced complexity.
The aforementioned SINDy modeling approach for multiscale systems was later used in Abdullah et al. (2021b) to develop a controller based on the slow dynamics. The reduced-order model-based controller, due to its lower complexity and computational cost, was able to outperform a full-order first-principles model-based controller as the former could use a longer prediction horizon in the model predictive control scheme, which is impacted significantly by the prediction horizon length. While there is an inevitable loss of accuracy in a reduced-order model, for tasks involving optimization such as process intensification and optimal control, the computational tractability of solving the mathematical optimization problem is of greater priority, justifying the construction and deployment of such reduced-order models in process systems engineering. SINDy was further developed to handle noisy data and real-time changes in process dynamics using subsampling, co-teaching, error-triggered model update mechanisms, and partial model update algorithms (Abdullah and Christofides 2023b; Abdullah et al. 2022a,b), all of which are techniques that can be extended to the reduced-order modeling framework of Abdullah et al. (2021a,b) as well.
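For readers unfamiliar with SINDy, the identification step can be sketched with a polynomial candidate library and sequentially thresholded least squares, as below; the library, threshold, and iteration count are illustrative choices rather than those used in the cited studies.

```python
import numpy as np

def poly_library(X):
    """Candidate library [1, x_i, x_i * x_j] for a state matrix X of shape (m, n)."""
    m, n = X.shape
    cols = [np.ones((m, 1)), X]
    cols += [(X[:, i] * X[:, j])[:, None] for i in range(n) for j in range(i, n)]
    return np.hstack(cols)

def sindy(X, X_dot, threshold=0.1, n_iters=10):
    """Sequentially thresholded least squares: fit X_dot = Theta(X) @ Xi sparsely."""
    Theta = poly_library(X)
    Xi = np.linalg.lstsq(Theta, X_dot, rcond=None)[0]
    for _ in range(n_iters):
        Xi[np.abs(Xi) < threshold] = 0.0              # prune small coefficients
        for k in range(X_dot.shape[1]):               # refit each state equation
            big = np.abs(Xi[:, k]) >= threshold
            if big.any():
                Xi[big, k] = np.linalg.lstsq(Theta[:, big], X_dot[:, k], rcond=None)[0]
    return Xi    # sparse coefficient matrix of the identified right-hand side
```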
In addition to PCA and SINDy, the autoencoder (AE) is an unsupervised learning model that adopts an FNN architecture to perform tasks such as dimensionality reduction (Kramer 1991). A typical AE comprises two components, the encoder and the decoder. The encoder, parameterized by
where
where
where
The close resemblance between AE and the commonly used linear dimensionality reduction method, PCA, is noted (Xiu et al. 2020). In a case where only the linear activation function is used in the AE calculation, the encoder output will correspond directly to the principal components in PCA. However, due to its flexibility in adopting a broad range of activation functions and producing complex nonlinear representations of the input data, AE is generally preferred over PCA for more effective dimensionality reduction. The benefits of incorporating AEs into machine learning tasks include reduced training duration, enhanced robustness against overfitting, and improved convenience in data visualization. Due to its advantages, NN models coupled with AE have been widely applied in the field of engineering, in areas such as process monitoring (Cheng et al. 2019; Lee et al. 2019), fault diagnosis (Zhang and Qiu 2022; Zheng and Zhao 2020), and process modeling (Na et al. 2018; Saraswathi K et al. 2020).
Since recursive prediction of RNNs could be time-consuming for high-dimensional systems, we can integrate RNNs with AE by developing RNNs that predict future states in the latent space. Figure 7 illustrates the structure of the integrated AE and RNN (termed AERNN model) (Zhao et al. 2022a; Zheng et al. 2022a). Specifically, an AE was developed to encode and decode state variables

Figure 7: Structure of the autoencoder RNN model.
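The resulting prediction pipeline can be sketched as follows, with the encoder, latent-space RNN, and decoder assumed to be pre-trained callables.

```python
import numpy as np

def aernn_predict(encoder, latent_rnn, decoder, x0, u_seq):
    """AERNN rollout (schematic): prediction is carried out in the low-dimensional
    latent space, and the decoder maps latent states back to the full state."""
    z = encoder(x0)                                   # compress the high-dimensional state
    z_traj = []
    for u in u_seq:
        z = latent_rnn(z, u)                          # one-step prediction in latent space
        z_traj.append(z)
    return np.array([decoder(z) for z in z_traj])     # reconstruct the full states
```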
Although AE and other similar feature extraction techniques (e.g., PCA) are able to transform high-dimensional feature datasets into a lower-dimensional space whilst retaining most of the crucial information from the dataset, these reduced-order representations often do not carry any physical meaning. The lack of interpretability of the encoded features makes these approaches less appealing, particularly in safety-critical chemical processes, where a physical interpretation of the features is necessary, especially for those features identified to be “important”. Hence, another perspective on reduced-order modeling arises from the idea of selecting a subset of relevant features for model construction. In feature selection, the chosen features should show a high correlation with the system output and exhibit strong interpretability to support further analysis. In general, feature selection methods can be categorized into three classes based on their selection and learning processes, namely, wrapper, filter, and embedded methods (Chandrashekar and Sahin 2014). In summary, the wrapper methods scan for the best-performing subset of features among potential subsets of features, using a specific search algorithm and a pre-defined ML algorithm (Karagiannopoulos et al. 2007). However, as the input dimension increases, the search space grows exponentially, making the wrapper methods increasingly computationally intensive. In contrast, filter methods do not require the use of ML algorithms in the selection process. Filter methods select relevant features by ranking and ordering the features using a suitable ranking criterion that measures the correlation between the input features and the target output, e.g., the Pearson correlation coefficient (Rendall et al. 2019). Features with scores below a predefined threshold are removed from the feature set, and the remaining features are used to substitute the original high-dimensional feature set in the modeling process (Degeest et al. 2019). The embedded methods, as the name suggests, directly integrate the feature selection process into the model training process to save computational time. In particular, regularization models are the most commonly used embedded methods, where the loss function is modified such that the model learns the important features while minimizing the fitting error (Li et al. 2017). Interested readers may refer to Karagiannopoulos et al. (2007) and Li et al. (2017) for detailed reviews on the various feature selection methods and Zhao et al. (2023) on their applications in reduced-order RNN models.
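As a minimal illustration of the filter approach, the sketch below ranks features by their absolute Pearson correlation with the target and keeps those above a threshold; the threshold value is illustrative.

```python
import numpy as np

def filter_select(X, y, threshold=0.3):
    """Filter-method sketch: keep features whose absolute Pearson correlation
    with the target exceeds a threshold."""
    scores = np.array([abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(X.shape[1])])
    keep = np.flatnonzero(scores >= threshold)
    return keep, scores      # indices of retained features and their ranking scores
```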
4.3.2 Control perspective: distributed MPC
From a control point of view, another way to mitigate the curse of dimensionality in large-scale systems is by applying distributed control techniques. In distributed control systems, the system is divided into smaller subsystems, and individual controllers are designed for each subsystem. Although control calculations are performed on separate processors, controllers in distributed control systems are able to communicate and cooperate with each other to achieve the objectives of the closed-loop plant. By decomposing the system into smaller subunits, the complexity and computational demands of modeling and controlling the process network can be significantly reduced. In this regard, distributed MPCs (DMPC) and decentralized MPCs using ML models have been developed in Chen et al. (2020a,b). A schematic of two DMPC architectures is shown in Figure 8. The key difference between sequential and iterative DMPCs is that the communication between two MPCs in a sequential DMPC framework is one-way only, while the controllers in iterative DMPCs communicate with each other to cooperatively optimize the control actions. Since the designs of ML-based DMPCs closely follow those using first-principles models, the formulations of ML-DMPCs are omitted here. Various alternative configurations of DMPC systems have been proposed in the literature, each varying in terms of the coordination and communication schemes among the subsystems’ MPCs. Readers are directed to Christofides et al. (2013), Scattolini (2009), and Stewart et al. (2010) for comprehensive reviews of DMPC.

Schematic diagrams of (a) sequential distributed MPC and (b) iterative distributed MPC systems.
4.4 Model uncertainty and process disturbances
Machine learning models are generally developed using historical data, and cannot take model uncertainty and process disturbances into account. To build more robust and reliable models that adapt to the variations in system dynamics, online learning and robust control can be adopted to improve the performance of models and controllers, respectively.
4.4.1 Model perspective: online update of ML models
Online machine learning refers to a paradigm of machine learning in which models are continuously updated as new data become available, often in a streaming fashion. Unlike traditional batch learning, where models are trained on fixed datasets, online learning enables models to adapt and evolve over time as they receive new data points. This approach is particularly useful for machine learning modeling of systems with time-varying dynamics due to disturbances. In-depth discussions of online learning and its theoretical analysis can be found in Hoi et al. (2021), Rakhlin et al. (2010), and Shalev-Shwartz (2012). An early attempt to implement online learning in the predictive models of MPC was recorded in Murray-Smith et al. (2003), where the authors added new process information to the training set at every timestep and adjusted the model’s hyperparameters accordingly. More recent work on integrating online learning into MPC applications can be found in Bhadriraju et al. (2019), Bradford et al. (2020), Limon et al. (2017), and Ning and You (2021).
Updating machine learning models online can be handled using various strategies. An effective method is error-triggered updating, where the model is updated only when the prediction error exceeds a certain threshold. This makes efficient use of computational resources and ensures that the model remains accurate and up-to-date with minimal overhead. Error-triggered online learning typically involves the following steps, as described in Abdullah and Christofides (2023b), Wu et al. (2019b), and Zheng et al. (2022c): (1) start with an initial model pre-trained on historical data, (2) monitor incoming data and compute the prediction error using the new data, and (3) if the current prediction error or the accumulated prediction error in a sliding time window exceeds the predefined threshold, update the model using the new data. Furthermore, when incorporating online machine learning models into MPC, an event-triggered mechanism designed based on stability criteria can be adopted to trigger model updates (Wu et al. 2019b). While online learning helps maintain the accuracy of machine learning models in dynamic environments, potential drift in the underlying data distribution over time, due to changes in process dynamics under disturbances, can pose a significant challenge to the performance of the updated models. To this end, some recent works have investigated the generalization performance of online learning models that take independent and identically distributed (i.i.d.) real-time data and non-i.i.d. real-time data for online learning, respectively (Hu and Wu 2024; Hu et al. 2023a,b). These two cases represent the scenarios where the system dynamics remain unchanged and where they change over time, respectively. Specifically, it is shown in Hu and Wu (2024) that the generalization performance of online learning models depends on several factors, including the divergence between the historical and real-time data distributions, the network complexity, and the sample size.
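The sketch below illustrates the error-triggered update logic described above in its simplest form; the SGDRegressor model, the threshold, the sliding-window length, and the simulated process drift are all illustrative stand-ins for the RNN models and tuning parameters used in the cited works.

```python
import numpy as np
from collections import deque
from sklearn.linear_model import SGDRegressor

rng = np.random.default_rng(1)
true_coef = np.array([1.0, -2.0, 0.5])

# (1) Pre-train an initial model on historical data
model = SGDRegressor(learning_rate="constant", eta0=0.01)
X_hist = rng.normal(size=(200, 3))
model.partial_fit(X_hist, X_hist @ true_coef)

threshold = 0.5                      # predefined error threshold (illustrative)
window = deque(maxlen=20)            # sliding window of recent squared errors

# (2)-(3) Monitor streaming data and update only when the windowed error is too large
for t in range(500):
    x_t = rng.normal(size=(1, 3))
    drift = 0.0 if t < 250 else 1.5                      # process dynamics change at t = 250
    y_t = x_t @ (true_coef + np.array([0.0, 0.0, drift]))
    window.append(((model.predict(x_t) - y_t) ** 2).item())
    if np.mean(window) > threshold:                      # accumulated error exceeds threshold
        model.partial_fit(x_t, y_t)                      # update the model with the new data
```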
4.4.2 Control perspective: robust MPC and tube-based MPC
From the control perspective, we can design robust MPC and tube-based MPC to account for plant-model mismatch in uncertain systems. Specifically, tube-based MPC addresses the uncertainty in system dynamics by considering a range of possible future trajectories rather than a single trajectory. It creates a “tube” around the nominal trajectory, within which the actual trajectory is expected to lie. Tube-based MPC uses techniques like robust optimization or stochastic optimization to compute the tube around the nominal trajectory. Tube-based MPC often involves solving optimization problems with constraints that ensure that the system remains within the defined tube despite uncertainties. Machine learning techniques have been incorporated into tube-based MPC to further improve the characterization of uncertainties and robustness. Recent developments in tube-based MPC using machine learning include work by Gao et al. (2024), Zhang et al. (2022), and Zheng et al. (2022b).
Robust MPC directly incorporates uncertainty into the control law formulation. It aims to optimize control inputs such that the system remains stable and satisfies performance criteria under the worst-case realization of the uncertainty. Robust MPC typically involves solving optimization problems with robust constraints or using techniques like min-max optimization to find control inputs that perform well under uncertainty. In recent works by Berberich et al. (2020), Chen and You (2021), Hu and You (2023), Mahmood et al. (2023), and Manzano et al. (2020), robust data-driven MPCs and robust learning-based MPCs have been developed to enhance the robustness of controllers against uncertainties such as prediction errors from machine learning models and process disturbances. For example, Chen and You (2021) used ML to learn uncertainties and designed a robust MPC for greenhouse indoor climate control problems. Mahmood et al. (2023) developed a robust data-driven MPC based on the minimax approach for temperature control and optimization of energy consumption.
4.5 Computational efficiency
ML-based MPC is generally slow to solve due to the complexity of ML models (the ML model often needs to be evaluated many times during optimization, which can significantly increase the computational burden) and the nonconvexity of the resulting optimization problem.
4.5.1 Model perspective: optimization and convexification
One approach to improve the computational efficiency of ML-MPC is to simplify the model architecture, for example by reducing the number of neurons or layers. While doing this manually can be challenging, automated tools and techniques can assist in finding a suitable configuration, thereby reducing computational overhead. Reduced-order modeling, introduced in the previous section, is another option for large-scale nonlinear systems. Additionally, hyperparameter optimization can be used to find suitable hyperparameters for ML models. Common techniques for hyperparameter optimization of ML models include grid search (Bergstra et al. 2011) and Bayesian optimization (e.g., tools such as Optuna (Akiba et al. 2019) and Hyperopt (Bergstra et al. 2013)). An analysis and comparison of common hyperparameter optimization approaches for developing an LSTM forecast model for a cyber-physical production system can be found in Pravin et al. (2022).
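As a hedged illustration of Bayesian hyperparameter optimization with Optuna, the sketch below tunes the width, depth, and learning rate of a small scikit-learn MLP on synthetic data; the model class, the search ranges, and the number of trials are illustrative choices, and the same pattern applies to Keras/RNN models by swapping out the objective function.

```python
import numpy as np
import optuna
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split

# Synthetic data standing in for process input/output samples
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 4))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2
X_tr, X_val, y_tr, y_val = train_test_split(X, y, random_state=0)

def objective(trial):
    width = trial.suggest_int("width", 16, 128)
    depth = trial.suggest_int("depth", 1, 3)
    lr = trial.suggest_float("lr", 1e-4, 1e-2, log=True)
    model = MLPRegressor(hidden_layer_sizes=(width,) * depth,
                         learning_rate_init=lr, max_iter=500, random_state=0)
    model.fit(X_tr, y_tr)
    return np.mean((model.predict(X_val) - y_val) ** 2)   # validation MSE to minimize

study = optuna.create_study(direction="minimize")
study.optimize(objective, n_trials=30)
print(study.best_params)
```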
Another approach is to build input-convex ML models (Amos et al. 2017; Chen et al. 2018c; Wang et al. 2025; Yang and Bequette 2021). In the context of machine learning, an input-convex model is a model whose output is a convex function of its input. This property is highly beneficial when the model is embedded in an optimization problem: the resulting objective and constraints are convex in the decision variables, the problem has a single global minimum, and gradient-based methods converge reliably. Linear models, such as linear regression and logistic regression, are trivially input-convex, since their outputs are affine in the input (their training losses are also convex with respect to the model parameters). For the more general class of nonlinear ML models, it is possible to design neural network architectures that are input-convex under certain conditions on the weights and activation functions, which significantly improves the tractability of the subsequent model-based optimization. We provide an example of enforcing input convexity in FNNs. Following the same idea, input-convex RNNs and LSTMs have been designed in some recent works (Chen et al. 2018c; Wang et al. 2025). The output of each layer of an input-convex FNN follows:
$$z_{l+1} = \sigma_l\left(W_l^{(z)} z_l + W_l^{(x)} x + b_l\right), \qquad l = 0, 1, \ldots, k-1,$$

with $z_0 \equiv 0$ (equivalently, $W_0^{(z)} \equiv 0$), where $x$ is the network input, $z_l$ denotes the output of the $l$-th layer, and $z_k$ is the network output. The output $z_k$ is convex with respect to $x$ provided that all entries of the weights $W_l^{(z)}$ are non-negative and all activation functions $\sigma_l$ are convex and non-decreasing (e.g., ReLU).
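A minimal Keras sketch of such an input-convex FNN is given below: the hidden-to-hidden weights are constrained to be non-negative and ReLU (convex, non-decreasing) activations are used, while unconstrained passthrough connections from the input preserve representation power. The layer sizes are arbitrary, and this is one possible realization rather than the exact architecture used in the cited works.

```python
import tensorflow as tf

class InputConvexFNN(tf.keras.Model):
    """Input-convex FNN sketch in the spirit of Amos et al. (2017): non-negative
    weights on the z-path and convex, non-decreasing activations make the output
    convex with respect to the input x."""
    def __init__(self, hidden_units=(64, 64), output_dim=1):
        super().__init__()
        nonneg = tf.keras.constraints.NonNeg()
        # unconstrained passthrough connections from the input x to every layer
        self.Wx = [tf.keras.layers.Dense(u) for u in hidden_units]
        self.Wx_out = tf.keras.layers.Dense(output_dim)
        # non-negative hidden-to-hidden (z-path) connections
        self.Wz = [tf.keras.layers.Dense(u, use_bias=False, kernel_constraint=nonneg)
                   for u in hidden_units[1:]]
        self.Wz_out = tf.keras.layers.Dense(output_dim, use_bias=False, kernel_constraint=nonneg)

    def call(self, x):
        z = tf.nn.relu(self.Wx[0](x))                   # z_1 = relu(W_0^x x + b_0)
        for Wz, Wx in zip(self.Wz, self.Wx[1:]):
            z = tf.nn.relu(Wz(z) + Wx(x))               # z_{l+1} = relu(W_l^z z_l + W_l^x x + b_l)
        return self.Wz_out(z) + self.Wx_out(x)          # affine output layer keeps convexity in x
```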
Remark 4.
It is important to note that while ICNN models offer benefits such as global optimality and stability, they may lose accuracy when applied to highly non-convex functions due to their inherent convexity. However, for many practical systems that are not highly non-convex, ICNN models can provide a computationally efficient alternative for ML model-based optimization problems while maintaining the desired accuracy. In practical applications, comparing the testing losses of ICNN models with traditional FNN models can be an effective way to evaluate the performance of ICNN models. If they yield similar accuracy, ICNN models can be considered a good approximation for nonlinear systems. Additionally, the partially input-convex architecture (PICNN) can be utilized to restore some of the representation ability of ICNN models by making the output a convex function of only some elements of the input (Amos et al. 2017). Therefore, developing ICNN models requires a delicate balance between convexity and representation power to ensure optimal performance for various applications.
4.5.2 Control perspective: explicit ML-MPC
Explicit MPC provides another solution from the control perspective to improve computational efficiency. In explicit MPC, the control law is precomputed and stored as a piecewise function of the system state. This precomputation shifts the optimization offline, so that the online computation reduces to locating the active region and evaluating the corresponding piecewise function rather than solving an optimization problem at every sampling time. By eliminating online optimization during operation, explicit MPC can achieve faster control loop execution times, making it suitable for applications with stringent real-time requirements.
The explicit control law is derived using multi-parametric programming algorithms, which include multi-parametric linear/quadratic programming (mpLP/mpQP) and multi-parametric mixed-integer linear/quadratic programming (mpMILP/mpMIQP) (Pistikopoulos et al. 2020). Unlike a typical optimization problem, in which the parameters (e.g., the current system state) take fixed numerical values at solution time, multi-parametric programming solves the optimization problem offline as an explicit function of these parameters, partitioning the parameter space into critical regions, each associated with its own control law.
As it can be time-consuming to solve ML-based MPC (Wu et al. 2019d), there has been growing interest in converting ML-MPC into an explicit ML-MPC for faster computation. However, the black-box nature of ML models creates obstacles on the path towards explicit ML-MPC. As ML models can be difficult to express explicitly, i.e., they do not have simple closed-form expressions, it is a challenge to adopt existing explicit MPC algorithms for ML-MPC. One approach to bypass this problem is to utilize the piecewise-linear property of the ReLU activation function and represent the ML model as a mixed-integer linear programming (MILP) problem. The MILP problem is then incorporated into the formulation of an explicit ML-MPC and solved using mpMILP (Chen et al. 2018b; Grimstad and Andersson 2019; Katz et al. 2020). An alternative approach is to solve the explicit ML-MPC using multi-parametric nonlinear programming (mpNLP) methods. As deriving exact solutions to mpNLP remains an open problem, existing mpNLP algorithms generally use either piecewise linearization or quadratic constraints to approximate the strongly nonlinear terms (Kassa and Kassa 2016; Pappas et al. 2021). In Wang et al. (2024a,b), the authors developed explicit ML-MPC for ML models with a general class of nonlinear activation functions by first approximating the ML models with piecewise linear functions. The corresponding mpNLP problems are then approximated by mpLP/mpQP problems, which can be solved efficiently by existing algorithms.
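To make the ReLU-based reformulation concrete, a single ReLU neuron can be encoded exactly with one binary variable and precomputed bounds (a standard big-M encoding; the bounds $L$ and $U$ are assumed to be available, e.g., from interval bound propagation over the trained network):

$$
z = \max\{0,\, w^{\top}x + b\}
\quad\Longleftrightarrow\quad
\begin{cases}
z \ge w^{\top}x + b, \qquad z \ge 0,\\
z \le w^{\top}x + b - L\,(1-\delta),\\
z \le U\,\delta, \qquad \delta \in \{0,1\},
\end{cases}
$$

where $L \le w^{\top}x + b \le U$ holds for all admissible inputs $x$. Applying this encoding to every ReLU neuron converts the trained network into a set of mixed-integer linear constraints that can then be embedded in the (explicit) ML-MPC formulation.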
In addition, compared to works that train an ML model to learn the state-to-input mapping and use it to replace the controller in a closed-loop system, explicit MPC offers transparency and interpretability: the control law is represented as a piecewise function of the system state, so engineers can readily analyze and understand how control actions are determined based on the current state of the system. This interpretability is valuable for troubleshooting, tuning, and verifying the controller’s behavior in practical applications.
Remark 5.
The main challenge with explicit ML-MPC is that ML models are typically black-box models without an explicit functional form (or they are too complex to be directly incorporated into explicit MPC solvers). Nonlinearity adds another layer of difficulty, as a nonlinear ML model leads to mpNLP, which is generally hard to solve. Potential solutions include using the unique properties of ReLU activation functions for ML models to formulate a mixed-integer linear programming (MILP) problem, or approximating nonlinear ML models with piecewise linear functions, allowing for the formulation of mpLP or mpQP problems.
4.6 Safe and secure ML-MPC
While most existing research on ML-MPC in engineering disciplines has focused on improving prediction accuracy and control performance, safety and security are emerging research areas of significant importance. The misuse of ML-MPC technologies could lead to unsafe, and potentially catastrophic, consequences in safety-critical systems, causing environmental damage, capital loss, and human injuries.
4.6.1 Safety
Ensuring the safety of ML-MPC involves safe learning (data collection), safe modeling, and safe implementation. Safe data collection is often not the most critical issue in supervised learning because datasets are provided for offline learning. The data used in supervised learning is usually pre-collected, cleaned, and labeled, which reduces the risks associated with data collection. However, the importance of safe data collection becomes much more pronounced in other machine learning techniques, such as RL. Specifically, in RL, an agent interacts with an environment to learn optimal actions through trial and error. This interaction can involve significant risks, especially in real-world applications like autonomous driving, robotics, and chemical plants, where unsafe actions can lead to accidents, injuries, or other severe consequences. To ensure safe exploration, safe RL has recently been studied, where various techniques such as reward shaping and safety constraints through barrier functions have been developed to limit the action and state space (García and Fernández 2015; Kim and Kim 2022; Wang and Wu 2024a).
Safe modeling in supervised learning often refers to ensuring the robustness and reliability of predicted outputs. This involves several strategies and techniques aimed at ensuring that model predictions are consistent, reliable, and conform to the necessary constraints of real-world systems. To achieve reliable predictions, we can impose hard constraints on NN outputs through the design of activation functions, or incorporate the constraints as a regularization term (similar to the physics-informed ML introduced in Section 4.1.1). Additionally, robustness requires that the predictions of ML models be insensitive to small perturbations in the input data. Common techniques include adversarial training, which intentionally introduces adversarial examples into the training process to improve robustness, and novel NN architectures with inherent robustness, such as the Lipschitz-constrained NNs introduced in Section 4.2.3.
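As a simple illustration of imposing hard output constraints through the activation function, the sketch below squashes the final pre-activations with a scaled sigmoid so that the predictions always lie in a prescribed box; the bounds, dimensions, and layer sizes are illustrative assumptions.

```python
import tensorflow as tf

# Illustrative box constraints y_min <= y_hat <= y_max on the two predicted outputs
y_min = tf.constant([0.0, 300.0])
y_max = tf.constant([5.0, 500.0])

inputs = tf.keras.Input(shape=(4,))
h = tf.keras.layers.Dense(32, activation="relu")(inputs)
h = tf.keras.layers.Dense(2)(h)                            # unconstrained pre-activations
outputs = tf.keras.layers.Lambda(
    lambda z: y_min + (y_max - y_min) * tf.sigmoid(z)      # maps predictions into [y_min, y_max]
)(h)
model = tf.keras.Model(inputs, outputs)                    # predictions satisfy the bounds by construction
```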
Lastly, from the control perspective, safe implementation of ML models in MPC requires enhancing existing controllers so that safety is explicitly accounted for as the last line of defense. Due to the approximation errors of ML models, ML-based MPC may lead to suboptimal or even unreasonable control actions that can cause unsafe operation. To mitigate these risks, safety constraints have been incorporated into the design of MPCs, ensuring that control actions and the resulting state evolution remain within safe bounds. For example, barrier functions can be used in the design of MPCs to effectively prevent the system from violating safety constraints by heavily penalizing states near the constraint boundaries. While there are various types of barrier functions, we provide an example of the control barrier function (CBF) for the nonlinear affine control system $\dot{x} = f(x) + g(x)u$, where $x \in \mathbb{R}^n$ is the state and $u \in \mathbb{R}^m$ is the control input.
Definition 5.
Given a set of unsafe states $\mathcal{D} \subset \mathbb{R}^n$ in state-space, a $\mathcal{C}^1$ function $B(x): \mathbb{R}^n \to \mathbb{R}$ is a control barrier function if the following conditions hold: (1) $B(x) > 0$ for all $x \in \mathcal{D}$; (2) $L_f B(x) \le 0$ for all $x \in \{x \in \mathbb{R}^n \setminus \mathcal{D} \mid L_g B(x) = 0\}$; and (3) the set of safe states $\{x \in \mathbb{R}^n \mid B(x) \le 0\}$ is non-empty, where $L_f B$ and $L_g B$ denote the Lie derivatives of $B$ along $f$ and $g$, respectively.
To further reinforce closed-loop stability while simultaneously ensuring safety, CBFs can be integrated with control Lyapunov functions via weighted sums. The resulting control Lyapunov-barrier functions (CLBFs), proposed in Romdlony and Jayawardhana (2016), have been used to design safe MPC in Wu et al. (2018, 2019a). The definition of CLBFs is given as follows:
Definition 6.
Consider the nonlinear system
where
4.6.2 Data security
Data security is also an emerging challenge in the design and implementation of ML-MPC. Data risks can arise during both the offline modeling stage and the online implementation of ML-MPC. Specifically, since model training often involves collecting and processing data in a centralized manner (e.g., on a central server), communication channels can be vulnerable to attacks during the data collection process, making the data susceptible to breaches, tampering, and unauthorized access. Ensuring data security is essential to protect sensitive information and maintain the integrity of the training process. As discussed in Parker et al. (2023), the cybersecurity of industrial control systems can be improved through a variety of fundamental operation and control methods that address the following aspects: security by design, advanced recovery, advanced threat detection, secure remote access, and combined safety. Specifically, we can improve data security in both the learning and implementation stages of ML-MPC. For example, in conventional ML approaches for modeling a nonlinear process network with multiple subsystems, training is performed on a central server with data collected from all subsystems. In contrast, federated learning (FL), an emerging distributed ML framework for preserving data privacy, distributes the training across the local subsystems and subsequently aggregates the locally trained submodels into a global FL model (Zhang et al. 2021a; Zhao et al. 2018). Since FL only exchanges NN weight information and keeps local data within the local systems without sharing it (see Figure 9), data security is significantly improved under the FL framework. In Xu and Wu (2024), FL was applied to model distributed nonlinear systems with guaranteed data privacy, and the resulting models were then incorporated into the design of MPC.

A schematic of federated learning, where
In addition to data security in the learning stage, the smooth operation of ML-MPC in real-time heavily depends on the accuracy of recorded data and the reliability of networked communication channels. Any compromise in the integrity or confidentiality of this data due to unauthorized access or manipulation by malicious entities can lead to serious consequences, impacting operational safety and economic performance. As sophisticated cyber-attacks pose risks to system information, there is a need to develop ML-MPC that ensures the confidentiality of industrial data. A promising solution to tackle this challenge is the adoption of an encrypted control system (Farokhi et al. 2017; Kim et al. 2016), offering a versatile and effective means to improve data security and confidentiality. It can be seamlessly implemented across various systems without requiring system-specific modifications, thus addressing the core challenge of secure data transmission in networked systems.
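The toy sketch below illustrates the encrypt-transmit-decrypt path using the additively homomorphic Paillier cryptosystem via the third-party python-paillier (phe) package; this particular scheme and library are assumptions for illustration only and are not necessarily those used in the cited encrypted-MPC works.

```python
from phe import paillier   # python-paillier: additively homomorphic encryption

# Key generation (the public key is shared with the sensor side)
public_key, private_key = paillier.generate_paillier_keypair(n_length=2048)

x_measured = [1.95, 402.3]                                  # plaintext measurements at the sensor
x_encrypted = [public_key.encrypt(v) for v in x_measured]   # ciphertexts sent over the network

# Additive homomorphism: deviation variables can be formed on encrypted data,
# since Enc(x) + (-x_s) = Enc(x - x_s) without ever decrypting x.
x_setpoint = [1.22, 438.0]
x_dev_encrypted = [c + (-s) for c, s in zip(x_encrypted, x_setpoint)]

x_dev = [private_key.decrypt(c) for c in x_dev_encrypted]   # decrypted at the controller side
print(x_dev)
```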
The work of Suryavanshi et al. (2023) presents the closed-loop architecture of an encrypted MPC. As depicted in Figure 10, the sensor signals

Illustration of the data transfer process in an encrypted MPC system (Suryavanshi et al. 2023).
After obtaining the encrypted data, it undergoes decryption, resulting in quantized states
5 Applications of ML-MPC to a chemical process example
In this section, we use a nonlinear chemical process to demonstrate the performance of various ML modeling and ML-MPC control methods, addressing different practical challenges discussed in previous sections. We begin with a brief introduction to developing conventional RNN models for nonlinear dynamic systems. We then explore advanced RNN models incorporating physics-informed ML, transfer learning, dropout, co-teaching, Lipschitz-constrained architecture, input-convex structure, online learning, and federated learning. These advanced methods are designed to tackle practical issues such as data scarcity, noise, robustness, convexity, model uncertainties, and data security. Following this, we show several novel designs of ML-MPCs that enhance computational efficiency, process operational safety, and cybersecurity. Additionally, the Python codes for some of the aforementioned ML and ML-MPC methods are provided for reference.
5.1 Process description
Consider a well-mixed, non-isothermal CSTR where an irreversible second-order exothermic reaction is taking place. The reaction involves the conversion of the reactant
where
Parameter values of the CSTR.
The CSTR has an unstable steady-state
The control objective is to operate the CSTR at the unstable equilibrium point
5.2 Development of RNNs
The RNN models in this example are developed to predict the states for the next sampling period
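A minimal Keras sketch of such an RNN predictor is given below, assuming two states, two manipulated inputs, and ten prediction sub-steps per sampling period; these dimensions, the layer size, and the randomly generated placeholder data are illustrative assumptions rather than the exact configuration used in the cited works and the accompanying GitHub code.

```python
import numpy as np
import tensorflow as tf

n_states, n_inputs, n_steps = 2, 2, 10     # assumed dimensions for illustration

# The RNN maps the sequence [x(t_k), u(t_k)] (held over the sampling period) to the
# predicted state trajectory over that period.
inputs = tf.keras.Input(shape=(n_steps, n_states + n_inputs))
h = tf.keras.layers.SimpleRNN(64, activation="tanh", return_sequences=True)(inputs)
outputs = tf.keras.layers.Dense(n_states)(h)
model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="mse")

# Random placeholders standing in for open-loop simulation data of the CSTR
X_train = np.random.normal(size=(1000, n_steps, n_states + n_inputs)).astype("float32")
Y_train = np.random.normal(size=(1000, n_steps, n_states)).astype("float32")
model.fit(X_train, Y_train, epochs=5, batch_size=64, verbose=0)
```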
5.2.1 Physics-informed RNNs
In this section, we will introduce the development and construction of PIRNN using the results from Zheng et al. (2023). In Zheng et al. (2023), three RNN models were developed, namely, a standard RNN, PIRNN, and a purely physics driven RNN model (termed PIRNN without
As mentioned in Section 4.1.1, the introduction of initial conditions/collocation points into the loss function helps to incorporate physics into the NN model. In this example, the collocation points comprise the initial system states

Collocation points sampled uniformly across the stability region
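One common way to assemble a physics-informed training loss is to add, on top of the data-mismatch term, the residual of the first-principles ODE evaluated along the predicted trajectories at the collocation points. The sketch below illustrates this construction; the placeholder dynamics f, the finite-difference residual, the weighting factor, and the assumed input layout are illustrative and do not reproduce the exact loss of Zheng et al. (2023).

```python
import tensorflow as tf

def f(x, u):
    # Placeholder right-hand side of dx/dt = f(x, u); NOT the actual CSTR balances
    return -x + u

lambda_phys = 0.1     # relative weight of the physics residual (illustrative)
dt = 0.01             # time step between predicted points (assumed)

def physics_informed_loss(model, x_seq, y_true, x_colloc_seq):
    # Data loss on labeled trajectories
    data_loss = tf.reduce_mean(tf.square(model(x_seq) - y_true))
    # Physics loss at unlabeled collocation sequences: the finite-difference
    # derivative of the predicted trajectory should match f(x, u)
    y_col = model(x_colloc_seq)
    u_col = x_colloc_seq[:, :, -2:]                  # assumes the last two features are the inputs
    dxdt = (y_col[:, 1:, :] - y_col[:, :-1, :]) / dt
    residual = dxdt - f(y_col[:, :-1, :], u_col[:, :-1, :])
    phys_loss = tf.reduce_mean(tf.square(residual))
    return data_loss + lambda_phys * phys_loss
```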
The standard RNN model was trained solely with the process data. It is a purely data-driven model that serves as a baseline to evaluate the predictive capabilities of the PIRNN models in regions beyond the range provided by the training data. Additionally, a purely physics-based RNN model, i.e., PIRNN without
The open-loop state profiles predicted by the three models are presented in Figure 12. It can be seen from Figure 12 that the prediction performance of the standard RNN model starts to deviate from the ground truth from

Comparison of open-loop state profiles (i.e.,
5.2.2 Transfer learning RNNs
In the presence of data scarcity, transfer learning can be used to accelerate the training process and improve the generalization performance of RNNs for a target process with limited data using the pre-trained model for a similar source process with sufficient data. The model development framework and results of transfer learning based RNN models presented in this section are taken from Xiao et al. (2023). In Xiao et al. (2023), one source CSTR was selected for the construction of a TL-based model of a target CSTR process. Except for the ideal gas constant
Table 2 presents the training time and testing errors of the TL-RNN and the standard RNN trained with different sizes of the target data set. Given sufficient target data (i.e., 16,800 training samples and 7,200 testing samples), it can be observed from Table 2 that the prediction performance of the TL-RNN and RNN models is comparable (i.e., TL-RNN testing error =
Testing errors of standard and TL-RNNs.
Model | Data set size | Training time (s) | Testing error
---|---|---|---
TL-RNN | 24,000 | 153.24 |
Standard RNN | 24,000 | 187.80 |
TL-RNN | 3,200 | 43.91 |
Standard RNN | 3,200 | 48.70 |
5.2.3 Dropout and co-teaching RNNs with noisy data
Neural networks have been shown to mitigate the impact of Gaussian noise to some extent. In this section, we will use the findings in Wu et al. (2021c) to understand the capability of LSTM models to handle non-Gaussian noise, as well as how approaches such as dropout and co-teaching can help improve the learning performance of LSTM models developed with noisy data.
We will first explore the effect of implementing MC dropout, proposed in Gal and Ghahramani (2016a,b), in an LSTM model. By treating the LSTM weights as random variables and approximating the posterior distribution of the weights by sampling the network with randomly dropped-out weights during testing, the MC dropout method helps quantify the uncertainty in the prediction and uses this information to update the weights. The open-loop prediction results of the standard LSTM and the dropout LSTM are presented in Figure 13 for comparison. As the predictions made by the LSTM model using MC dropout are stochastic in nature, the LSTM predictions were repeated 300 times to generate the distribution of predicted state trajectories. The mean state trajectory is represented as a red line, and the

State profiles predicted by the dropout LSTM and the standard LSTM, where the red line is dropout LSTM, the black, dashed line is the ground truth, the yellow line is the standard LSTM, and the blue, dotted line is the noisy state measurement.
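The MC dropout procedure described above can be sketched as follows: dropout is kept active at prediction time (training=True) and the forward pass is repeated to build an empirical distribution of predicted trajectories, from which the mean and an uncertainty band are extracted. The small LSTM, the dimensions, the dropout rate, and the 95 % band below are illustrative assumptions.

```python
import numpy as np
import tensorflow as tf

n_steps, n_feat, n_states = 10, 4, 2
inputs = tf.keras.Input(shape=(n_steps, n_feat))
h = tf.keras.layers.LSTM(64, return_sequences=True)(inputs)
h = tf.keras.layers.Dropout(0.2)(h)                 # dropout layer reused at prediction time
outputs = tf.keras.layers.Dense(n_states)(h)
model = tf.keras.Model(inputs, outputs)

x_test = np.random.normal(size=(1, n_steps, n_feat)).astype("float32")   # placeholder input
samples = np.stack([model(x_test, training=True).numpy() for _ in range(300)])
mean_traj = samples.mean(axis=0)                                  # mean predicted trajectory
lower, upper = np.percentile(samples, [2.5, 97.5], axis=0)        # ~95 % uncertainty band
```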
The effect of incorporating co-teaching into LSTM was also studied in Wu et al. (2021c). As mentioned in Section 4.2.2, co-teaching involves training two NN models. In Wu et al. (2021c), the co-teaching process starts by training the two LSTM models with a noisy dataset. Subsequently, the models iteratively identify and exchange clean data sequences and update their weights accordingly. This allows the co-teaching models to capture a balanced pattern that accounts for both noisy and clean data. To understand the effectiveness of the co-teaching method, the testing performances of the standard LSTM, dropout LSTM and co-teaching LSTM models were compared and are listed in Table 3. For fair comparison, all LSTM models were trained and tested on the same noisy dataset. The LSTM models also shared the same network structure and hyperparameters, i.e., the same number of neurons, layers, epochs, and activation functions. The difference between the predicted state trajectories and the underlying (noise-free) state trajectories, i.e., MSE, was chosen as the criterion for performance evaluation, where a smaller MSE value signifies better model performance.
Statistical analysis of the open-loop predictions under non-Gaussian noise.
Methods | MSE (C_A) | MSE (T)
---|---|---
1a) LSTM: noise-free data only | 0.0011 | 8.2056
1b) LSTM: mixed data | 0.0258 | 22.1795
1c) LSTM: noisy data only | 0.0328 | 29.5571
2) Co-teaching LSTM | 0.0053 | 7.2123
3) Dropout LSTM | 0.0052 | 19.2123
As shown in Table 3, the standard LSTM models trained with noisy or mixed data had the highest MSEs of all the methods. Furthermore, when comparing the standard LSTM trained with noisy data only (i.e., 1c in Table 3) with the LSTM trained with mixed data (i.e., 1b in Table 3), only a slight improvement in model prediction was observed, highlighting the adverse impact noisy data has on model accuracy. These observations imply that the standard modeling approach cannot achieve the desired model accuracy without a high-quality dataset. However, the co-teaching and dropout LSTM models developed with the same noisy dataset outperformed the standard LSTM models, with the co-teaching LSTM achieving the best performance among all models, demonstrating the effectiveness of the co-teaching approach in mitigating the impact of noisy labels.
5.2.4 Lipschitz-constrained NNs
LCNNs offer an alternative approach to handling noise in datasets by improving the model’s robustness to noisy inputs. Specifically, LCNNs are designed to control and bound the Lipschitz constant of the neural network so that the model is less sensitive and more robust to changes in the input. As there are various ways to construct an LCNN, we use the SpectralDense layer method mentioned in Section 4.2.3 for demonstration and share the simulation results from Tan et al. (2024b). To assess the model’s performance, we compared the LCNN models to standard FNNs. Specifically, the LCNNs were developed with SpectralDense hidden layers, while the conventional dense FNNs were developed using the Dense layers from TensorFlow with ReLU activation functions. Both the SpectralDense LCNNs and the Dense FNNs were trained on the same datasets, corrupted with Gaussian noise with a standard deviation of 0.1 or 0.2. Moreover, both models shared the same network structure of two hidden layers of the same size, either 640 or 1,280 neurons. The models were trained using the optimizer
The testing errors of the models are provided in Table 4. As seen in Table 4, the testing errors of Dense FNNs, with orders of magnitude ranging between
Comparison of the testing errors (TEs) and Lipschitz constants (LCs) for various hidden layer architectures and standard deviation (SD) of noise introduced into the training dataset.
Hidden layers | Noise SD | LCNN TE | Dense TE | LCNN LC | Dense LC
---|---|---|---|---|---
(640, 640) | 0.1 | 3.617 | 0.4573 | 1.119 | 2.447
(1,280, 1,280) | 0.1 | 3.850 | 0.6928 | 1.116 | 2.682
(640, 640) | 0.2 | 2.088 | 1.692 | 1.114 | 2.511
(1,280, 1,280) | 0.2 | 1.393 | 3.095 | 1.107 | 9.285
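One simple way to realize a Lipschitz-constrained dense layer is to normalize the kernel by its largest singular value at every forward pass, so that the layer (with a 1-Lipschitz activation such as ReLU) has a Lipschitz constant of at most one. The sketch below follows this spectral-normalization idea; it is a possible stand-in for the SpectralDense-type layers discussed above, not necessarily the exact implementation used in Tan et al. (2024b).

```python
import tensorflow as tf

class SpectralNormDense(tf.keras.layers.Layer):
    """Dense layer whose kernel is rescaled by its spectral norm, bounding the
    layer's Lipschitz constant by one (for 1-Lipschitz activations)."""
    def __init__(self, units, activation=tf.nn.relu, **kwargs):
        super().__init__(**kwargs)
        self.units = units
        self.activation = activation

    def build(self, input_shape):
        self.kernel = self.add_weight(name="kernel",
                                      shape=(int(input_shape[-1]), self.units),
                                      initializer="glorot_uniform", trainable=True)
        self.bias = self.add_weight(name="bias", shape=(self.units,),
                                    initializer="zeros", trainable=True)

    def call(self, x):
        sigma_max = tf.reduce_max(tf.linalg.svd(self.kernel, compute_uv=False))
        kernel_sn = self.kernel / (sigma_max + 1e-12)   # spectral norm of kernel_sn is at most 1
        return self.activation(tf.matmul(x, kernel_sn) + self.bias)
```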
5.2.5 Error-triggered online learning
Neural networks are generally trained offline using historical data, and cannot capture the real-time dynamics subject to process disturbances. Hence, online learning and updating of ML models can be a viable solution for systems with time-varying dynamics. To illustrate how online learning can help address process disturbances, we consider the CSTR of Eq. (31) with model variations caused by the following disturbances: (1) As a result of an upstream disturbance, the feed flow rate

Closed-loop simulation results under LMPC using online update of RNN models. (a) The state-space profiles for the closed-loop CSTR under the LMPC of Eq. (9) with and without online update of the RNN model for the initial condition (−1.5, 70). (b) Value of the prediction error E_rnn(t) for the closed-loop system of Eq. (31) under the LMPC of Eq. (9) with error-triggered online update of RNN models.
Figure 14b shows the evolution of the moving-horizon error detector
Remark 6.
Due to space constraints, we are unable to present all the NN modeling approaches that address each practical issue discussed in this article. Readers who are interested in reduced-order modeling, ML-based distributed MPC, and federated learning methods, which often require more complex process networks for demonstration, can refer to the references provided in the corresponding sections.
5.3 NN-based MPC
After we obtain the NN models that learn the dynamics of the CSTR of Eq. (31), NN-based MPC can be developed to control the system by manipulating
5.3.1 Convex MPC using input-convex NNs
While neural networks offer advantages in process modeling, ensuring computational efficiency is crucial for real-time optimization and control tasks. In a chemical plant, numerous operations require real-time or near-real-time control to maintain product quality, safety, and operational efficiency. Swift decision-making is pivotal for safety in chemical processes, as delays in addressing reactant changes can result in undesired reactions or unsafe conditions. Motivated by the fact that convex optimization problems are easier to solve than non-convex ones, our goal in this subsection is to preserve convexity in the neural-network-based predictive control discussed below by developing input-convex NNs, whose outputs remain convex with respect to the inputs. In addition to the input-convex feedforward neural network introduced in Section 4.5.1, there are a variety of input-convex NNs in the RNN family, such as input-convex RNNs and input-convex LSTMs. Specifically, we develop an input-convex LSTM (ICLSTM), following the formulation in Wang et al. (2025), and compare its closed-loop control performance with that of an MPC using a plain LSTM model. Subsequently, we consider a simple MPC scheme using a neural network model as the prediction model, given by the following optimization problem:
where
Convergence runtime of MPCs using LSTM and ICLSTM.
 | Plain LSTM time (s) | ICLSTM time (s) | % Decrease
---|---|---|---
Average | 1,186.6 | 712.5 |
5.3.2 Safe ML-based MPC
Safe MPCs should be developed to ensure that process operations remain within the safe operating region, particularly when there are potentially unsafe operating conditions in chemical processes. CLBFs can be incorporated into the MPC scheme (termed CLBF-MPC) to drive the CSTR of Eq. (31) to its steady-state while simultaneously avoiding unsafe operation. The CLBF-MPC scheme is formulated as the following optimization problem (Wu et al. 2019a):
where
We consider a bounded unsafe region
and
In Figure 15, it is demonstrated that for all initial states

Closed-loop state trajectories for the system of Eq. (31) under the CLBF-MPC using an RNN model. The initial conditions are marked by circles, and the set of bounded unsafe states
Remark 7.
Although a CSTR example was used to illustrate the applications of various machine learning modeling and ML-based MPC methods, it is important to note that ML-based MPC can be applied to a variety of complex chemical engineering problems. Due to space constraints, we do not provide a detailed discussion in this review; however, we list some examples of the application of ML models, as well as ML-based MPC methods, to the modeling and control of complex systems for interested readers seeking more examples in this area. For example, neural network models have been applied to model an industrial ethylene splitter in Jalanko et al. (2021) and experimental electrochemical reactors in Çıtmacı et al. (2022) and Luo et al. (2022). Moreover, an LSTM-based MPC method was developed in Luo et al. (2023) for the same electrochemical reactor as in Çıtmacı et al. (2022). Other notable works on ML-based MPC include using an LSTM-based economic MPC to control the heating, ventilation, and air conditioning (HVAC) system of a building in Ellis and Chinde (2020), and using an ANN-based MPC to control the film properties in the thin-film chemical deposition of quantum dots in Sitapure and Kwon (2022).
6 Conclusion and outlook
The tutorial provided an overview of machine learning-based model predictive control methods, highlighting both theoretical insights and practical challenges associated with the development of NNs and the incorporation of NNs into MPC. Closed-loop stability of ML-based MPC was first established based on the generalization error analysis for NNs. Various ML methods such as physics-informed ML, transfer learning, and novel designs of NN architectures were discussed alongside advanced control methods to address the practical challenges including data scarcity, data quality, the curse of dimensionality, model uncertainty, computational efficiency, and safety in ML-MPC. Finally, a chemical process example was studied to demonstrate the effectiveness of various ML-MPC methods to address the aforementioned practical issues.
In addition to the topics covered in this paper, several emerging areas in ML-based MPC require significant attention in future research. For example, explainable AI (XAI) is critical to improving the transparency, trustworthiness, and usability of ML models in MPC. By understanding how a neural network arrives at its predictions, users can place more trust in the model and identify errors more effectively in real-world applications. Although neural networks are powerful tools for learning complex patterns and making predictions across various domains, they are typically developed as black-box models with inherent complexity, which makes it challenging to understand the reasoning behind their outputs. Physics-informed ML provides one way to incorporate domain knowledge into NN models, yet it does not completely address the challenge of model explainability. One common approach for XAI is SHapley Additive exPlanations (SHAP). SHAP is a method based on cooperative game theory that assigns each feature an importance value for a particular prediction. It provides a unified framework to explain the output of any machine learning model by attributing the prediction outcome to the different input features. However, developing suitable XAI methods to explain the predictions, limitations, and resulting behaviors of neural network models in MPC remains an ongoing challenge.
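A hedged illustration of the SHAP workflow is sketched below on a small surrogate regression model; the MLP, the synthetic data, and the background-sample size stand in for an NN process model used within ML-MPC and are not taken from a specific cited study.

```python
import numpy as np
import shap
from sklearn.neural_network import MLPRegressor

# Surrogate model and synthetic data standing in for an NN process model
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1]
model = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=1000, random_state=0).fit(X, y)

background = shap.sample(X, 50)                        # background set for the explainer
explainer = shap.KernelExplainer(model.predict, background)
shap_values = explainer.shap_values(X[:10])            # per-feature attributions for 10 samples
print(np.abs(shap_values).mean(axis=0))                # average importance of each input feature
```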
Regarding physics-informed machine learning, while this review paper discusses several approaches integrating physics knowledge (e.g., first-principles models and structural process knowledge) into NN development, there are numerous types of knowledge that can improve model performance. In many ML-MPC applications, NN models are initially trained offline until achieving sufficiently small errors before incorporation into MPC. However, this process involves extensive data collection and training, potentially consuming time and resources. Therefore, a future direction is to integrate stability requirements into NN model development, ensuring that NN models naturally meet the MPC stability criteria and can be easily implemented within MPC frameworks (Tan et al. 2024a). Additionally, for modeling distributed systems, knowledge of network structure (i.e., units [nodes] and their relationships [edges]) can be integrated into the development of graph neural networks (GNNs) to improve the modeling accuracy. Overall, there are various types of domain knowledge that can be integrated into neural networks tailored to specific ML-MPC applications in different ways (e.g., loss function, network architecture, weight constraints, learning algorithms, etc.).
To successfully implement ML-MPC in real-world large-scale systems, addressing adaptability and scalability is important to ensure computational efficiency and maintain performance across diverse applications. Transfer learning offers a promising approach by leveraging knowledge from one process to another in modeling and control tasks for process scale-up. However, finding a suitable source process that closely matches the target process can be challenging in practice. Inspired by the success of large language models in many recent studies and applications, a compelling future direction is to develop a single, universal neural network (referred to as a foundation model) capable of rapidly adapting to model any new chemical process (Wang and Wu 2024c). Foundation models have shown success in fields such as computer science, chemistry, and materials science. In the field of chemical engineering, large language models have been applied in Hirtreiter et al. (2024) to generate control structures for process flow diagrams (PFDs) from PFDs without control structures, as part of an effort to automate the generation of piping and instrumentation diagrams (P&IDs). However, the application of foundation models to chemical process modeling and control is still in its infancy (Decardi-Nelson et al. 2024). This is partly due to the complexity of chemical engineering, which involves large-scale industrial processes characterized by proprietary, complex data that is rarely shared publicly by industry. Additionally, adapting ML-based MPC from a small-scale to a large-scale system involves several key considerations, such as real-time computation requirements, the availability of sensor data and sensor-related issues (e.g., missing, delayed, and asynchronous measurements), and the optimization of MPC hyperparameters across different scales. Addressing these challenges not only enables more efficient utilization of data but also improves the applicability of ML-MPC in various chemical engineering applications.
Funding source: A*STAR MTC YIRG 2022, Singapore
Award Identifier / Grant number: M22K3c0093
Funding source: MOE AcRF Tier 1 FRC Grant, Singapore
Award Identifier / Grant number: 22-5367-A0001
Funding source: Department of Energy
Funding source: NRF-CRP
Award Identifier / Grant number: 27-2021-0001
Funding source: National Science Foundation
-
Research ethics: Not applicable.
-
Informed consent: Not applicable.
-
Author contributions: Z.W. and P.D.C. conceptualized the review and oversaw all aspects of the project. Z.W. and W.W. were responsible for the initial literature review. Z.W. took the lead in writing the original manuscript, with significant inputs from W.W. and F.A. Y.W., F.A., A.A., and Y.K. contributed significantly to the review and editing process. All authors have accepted responsibility for the entire content of this manuscript and approved its submission.
-
Use of Large Language Models, AI and Machine Learning Tools: None declared.
-
Conflict of interest: The authors state no conflict of interest.
-
Research funding: Financial support from the National Science Foundation, the Department of Energy, NRF-CRP (27-2021-0001), MOE AcRF Tier 1 FRC Grant (22-5367-A0001), Singapore, and A*STAR MTC YIRG 2022 (M22K3c0093), Singapore is gratefully acknowledged.
-
Data availability: The raw data can be obtained on request from the corresponding authors.
References
Abbasi, M., Santos, B.P., Pereira, T.C., Sofia, R., Monteiro, N.R., Simões, C.J., Brito, R.M., Ribeiro, B., Oliveira, J.L., and Arrais, J.P. (2022). Designing optimized drug candidates with generative adversarial network. J. Cheminf. 14: 40, https://doi.org/10.1186/s13321-022-00623-6.Search in Google Scholar PubMed PubMed Central
Abdullah, F. and Christofides, P.D. (2023a). Data-based modeling and control of nonlinear process systems using sparse identification: an overview of recent results. Comput. Chem. Eng. 174: 108247, https://doi.org/10.1016/j.compchemeng.2023.108247.Search in Google Scholar
Abdullah, F. and Christofides, P.D. (2023b). Real-time adaptive sparse-identification-based predictive control of nonlinear processes. Chem. Eng. Res. Des. 196: 750–769, https://doi.org/10.1016/j.cherd.2023.07.011.Search in Google Scholar
Abdullah, F., Wu, Z., and Christofides, P.D. (2021a). Data-based reduced-order modeling of nonlinear two-time-scale processes. Chem. Eng. Res. Des. 166: 1–9, https://doi.org/10.1016/j.cherd.2020.11.009.Search in Google Scholar
Abdullah, F., Wu, Z., and Christofides, P.D. (2021b). Sparse-identification-based model predictive control of nonlinear two-time-scale processes. Comput. Chem. Eng. 153: 107411, https://doi.org/10.1016/j.compchemeng.2021.107411.Search in Google Scholar
Abdullah, F., Alhajeri, M.S., and Christofides, P.D. (2022a). Modeling and control of nonlinear processes using sparse identification: using dropout to handle noisy data. Ind. Eng. Chem. Res. 61: 17976–17992, https://doi.org/10.1021/acs.iecr.2c02639.Search in Google Scholar
Abdullah, F., Wu, Z., and Christofides, P.D. (2022b). Handling noisy data in sparse model identification using subsampling and co-teaching. Comput. Chem. Eng. 157: 107628, https://doi.org/10.1016/j.compchemeng.2021.107628.Search in Google Scholar
Akiba, T., Sano, S., Yanase, T., Ohta, T., and Koyama, M. (2019). Proceedings of the 25th ACM SIGKDD international conference on knowledge discovery & data mining, August 4–8, 2019: optuna: a next-generation hyperparameter optimization framework. Association for Computing Machinery, Anchorage, AK, USA, pp. 2623–2631.10.1145/3292500.3330701Search in Google Scholar
Akpinar, N.-J., Kratzwald, B., and Feuerriegel, S. (2019). Sample complexity bounds for recurrent neural networks with application to combinatorial graph problems. arXiv preprint arXiv:1901.10289.Search in Google Scholar
Alhajeri, M.S., Abdullah, F., Wu, Z., and Christofides, P.D. (2022). Physics-informed machine learning modeling for predictive control using noisy data. Chem. Eng. Res. Des. 186: 34–49, https://doi.org/10.1016/j.cherd.2022.07.035.Search in Google Scholar
Alhajeri, M.S., Ren, Y.M., Ou, F., Abdullah, F., and Christofides, P.D. (2024). Model predictive control of nonlinear processes using transfer learning-based recurrent neural networks. Chem. Eng. Res. Des. 205: 1–12, https://doi.org/10.1016/j.cherd.2024.03.019.Search in Google Scholar
Ali, M., Cai, X., Khan, F.I., Pistikopoulos, E.N., and Tian, Y. (2023). Dynamic risk-based process design and operational optimization via multi-parametric programming. Digit. Chem. Eng. 7: 100096, https://doi.org/10.1016/j.dche.2023.100096.Search in Google Scholar
Amos, B., Xu, L., and Kolter, J.Z. (2017). Proceedings of the 34th international conference on machine learning, August 6–11, 2017: input convex neural networks. PMLR, Sydney, Australia, pp. 146–155.Search in Google Scholar
Anil, C., Lucas, J., and Grosse, R. (2019). Proceedings of the 36th international conference on machine learning, June 9–15, 2019: sorting out Lipschitz function approximation. PMLR, California, USA, pp. 291–301.Search in Google Scholar
Antonelo, E.A., Camponogara, E., Seman, L.O., Jordanou, J.P., de Souza, E.R., and Hübner, J.F. (2024). Physics-informed neural nets for control of dynamical systems. Neurocomputing 579: 127419, https://doi.org/10.1016/j.neucom.2024.127419.Search in Google Scholar
Arjovsky, M., Chintala, S., and Bottou, L. (2017). Proceedings of the 34th international conference on machine learning, August 6–11, 2017: Wasserstein generative adversarial networks. PMLR, Sydney, Australia, pp. 214–223.Search in Google Scholar
Arnold, F. and King, R. (2021). State-space modeling for control based on physics-informed neural networks. Eng. Appl. Artif. Intell. 101: 104195, https://doi.org/10.1016/j.engappai.2021.104195.Search in Google Scholar
Bangi, M.S.F., Kao, K., and Kwon, J.S.-I. (2022). Physics-informed neural networks for hybrid modeling of lab-scale batch fermentation for β-carotene production using Saccharomyces cerevisiae. Chem. Eng. Res. Des. 179: 415–423, https://doi.org/10.1016/j.cherd.2022.01.041.Search in Google Scholar
Bank, D., Koenigstein, N., and Giryes, R. (2023) Autoencoders. In: Machine learning for data science handbook: data mining and knowledge discovery handbook. Springer, Cham, pp. 353–374.10.1007/978-3-031-24628-9_16Search in Google Scholar
Bartlett, P.L., Foster, D.J., and Telgarsky, M.J. (2017). Spectrally-normalized margin bounds for neural networks. In: Advances in neural information processing systems, Vol. 30. Curran Associates, Inc, Red Hook, NY.Search in Google Scholar
Batra, R., Dai, H., Huan, T.D., Chen, L., Kim, C., Gutekunst, W.R., Song, L., and Ramprasad, R. (2020). Polymers for extreme conditions designed using syntax-directed variational autoencoders. Chem. Mater. 32: 10489–10500, https://doi.org/10.1021/acs.chemmater.0c03332.Search in Google Scholar
Ben-David, S., Blitzer, J., Crammer, K., and Pereira, F. (2006). Analysis of representations for domain adaptation. In: Advances in neural information processing systems, Vol. 19. MIT Press, Cambridge, MA.10.7551/mitpress/7503.003.0022Search in Google Scholar
Ben-David, S., Blitzer, J., Crammer, K., Kulesza, A., Pereira, F., and Vaughan, J.W. (2010). A theory of learning from different domains. Mach. Learn. 79: 151–175, https://doi.org/10.1007/s10994-009-5152-4.Search in Google Scholar
Berberich, J. and Allgöwer, F. (2024). An overview of systems-theoretic guarantees in data-driven model predictive control. arXiv preprint arXiv:2406.04130.10.1146/annurev-control-030323-024328Search in Google Scholar
Berberich, J., Köhler, J., Müller, M.A., and Allgöwer, F. (2020). Data-driven model predictive control with stability and robustness guarantees. IEEE Trans. Automat. Control 66: 1702–1717, https://doi.org/10.1109/tac.2020.3000182.Search in Google Scholar
Bergstra, J., Bardenet, R., Bengio, Y., and Kégl, B. (2011). Algorithms for hyper-parameter optimization. In: Advances in neural information processing systems, Vol. 24. Curran Associates, Inc, Red Hook, NY.Search in Google Scholar
Bergstra, J., Yamins, D., and Cox, D. (2013). Proceedings of the 30th international conference on machine learning, June 16–21, 2013: making a science of model search: hyperparameter optimization in hundreds of dimensions for vision architectures. PMLR, Atlanta, GA, USA, pp. 115–123.Search in Google Scholar
Bhadriraju, B., Narasingam, A., and Kwon, J.S.-I. (2019). Machine learning-based adaptive model identification of systems: application to a chemical process. Chem. Eng. Res. Des. 152: 372–383, https://doi.org/10.1016/j.cherd.2019.09.009.Search in Google Scholar
Bhowmick, A., D’Souza, M., and Raghavan, G.S. (2021) LipBaB: computing exact Lipschitz constant of ReLU networks. In: Artificial neural networks and machine learning – ICANN 2021. Springer, Cham, pp. 151–162.10.1007/978-3-030-86380-7_13Search in Google Scholar
Bi, K., Beykal, B., Avraamidou, S., Pappas, I., Pistikopoulos, E.N., and Qiu, T. (2020). Integrated modeling of transfer learning and intelligent heuristic optimization for a steam cracking process. Ind. Eng. Chem. Res. 59: 16357–16367, https://doi.org/10.1021/acs.iecr.0c02657.Search in Google Scholar PubMed PubMed Central
Bitmead, R.R., Gevers, M., and Wertz, V. (1990). Adaptive optimal control the thinking man’s GPC. Prentice Hall, Victoria, Australia.Search in Google Scholar
Blitzer, J., Crammer, K., Kulesza, A., Pereira, F., and Wortman, J. (2007). Learning bounds for domain adaptation. In: Advances in neural information processing systems, Vol. 20. Curran Associates, Inc, Red Hook, NY.Search in Google Scholar
Bo, S., Agyeman, B.T., Yin, X., and Liu, J. (2023). Control invariant set enhanced safe reinforcement learning: improved sampling efficiency, guaranteed stability and robustness. Comput. Chem. Eng. 179: 108413, https://doi.org/10.1016/j.compchemeng.2023.108413.Search in Google Scholar
Bonassi, F., Farina, M., Xie, J., and Scattolini, R. (2022). On recurrent neural networks for learning-based control: recent results and ideas for future developments. J. Process Control 114: 92–104, https://doi.org/10.1016/j.jprocont.2022.04.011.Search in Google Scholar
Bond-Taylor, S., Leach, A., Long, Y., and Willcocks, C.G. (2021). Deep generative modelling: a comparative review of VAEs, GANs, normalizing flows, energy-based and autoregressive models. IEEE Trans. Pattern Anal. Mach. Intell. 44: 7327–7347, https://doi.org/10.1109/tpami.2021.3116668.Search in Google Scholar PubMed
Bradford, E., Imsland, L., Zhang, D., and del Rio Chanona, E.A. (2020). Stochastic data-driven model predictive control using Gaussian processes. Comput. Chem. Eng. 139: 106844, https://doi.org/10.1016/j.compchemeng.2020.106844.Search in Google Scholar
Briceno-Mena, L.A., Romagnoli, J.A., and Arges, C.G. (2022). PemNet: a transfer learning-based modeling approach of high-temperature polymer electrolyte membrane electrochemical systems. Ind. Eng. Chem. Res. 61: 3350–3357, https://doi.org/10.1021/acs.iecr.1c04237.Search in Google Scholar
Brunke, L., Greeff, M., Hall, A.W., Yuan, Z., Zhou, S., Panerati, J., and Schoellig, A.P. (2022). Safe learning in robotics: from learning-based control to safe reinforcement learning. Annu. Rev. Control Robot. Auton. Syst. 5: 411–444, https://doi.org/10.1146/annurev-control-042920-020211.Search in Google Scholar
Bünning, F., Schalbetter, A., Aboudonia, A., de Badyn, M.H., Heer, P., and Lygeros, J. (2021). Proceedings of the 3rd conference on learning for dynamics and control, June 7–8, 2021: input convex neural networks for building MPC. PMLR, pp. 251–262.Search in Google Scholar
Cai, S., Wang, Z., Fuest, F., Jeon, Y.J., Gray, C., and Karniadakis, G.E. (2021). Flow over an espresso cup: inferring 3-D velocity and pressure fields from tomographic background oriented Schlieren via physics-informed neural networks. J. Fluid Mech. 915: A102, https://doi.org/10.1017/jfm.2021.135.Search in Google Scholar
Chandrasekar, A., Abdulhussain, H., Thompson, M.R., and Mhaskar, P. (2024). Utilizing neural networks for image-based model predictive controller of a batch rotational molding process. IFAC-PapersOnLine 58: 470–475, https://doi.org/10.1016/j.ifacol.2024.08.381.Search in Google Scholar
Chandrashekar, G. and Sahin, F. (2014). A survey on feature selection methods. Comput. Electr. Eng. 40: 16–28, https://doi.org/10.1016/j.compeleceng.2013.11.024.Search in Google Scholar
Chang, H.-C. and Aluko, M. (1984). Multi-scale analysis of exotic dynamics in surface catalyzed reactions–I: justification and preliminary model discriminations. Chem. Eng. Sci. 39: 37–50, https://doi.org/10.1016/0009-2509(84)80128-1.Search in Google Scholar
Chen, H. and Allgöwer, F. (1998). A quasi-infinite horizon nonlinear model predictive control scheme with guaranteed stability. Automatica 34: 1205–1217, https://doi.org/10.1016/s0005-1098(98)00073-9.Search in Google Scholar
Chen, W.-H. and You, F. (2021). Semiclosed greenhouse climate control under uncertainty via machine learning and data-driven robust model predictive control. IEEE Trans. Control Syst. Technol. 30: 1186–1197, https://doi.org/10.1109/tcst.2021.3094999.Search in Google Scholar
Chen, R.T., Rubanova, Y., Bettencourt, J., and Duvenaud, D.K. (2018a) Neural ordinary differential equations. In: Advances in neural information processing systems, Vol. 31. Curran Associates, Inc, Red Hook, NY.Search in Google Scholar
Chen, S., Saulnier, K., Atanasov, N., Lee, D.D., Kumar, V., Pappas, G.J., and Morari, M. (2018b). Proceedings of the 2018 annual American control conference, June 27–29, 2018: approximating explicit model predictive control using constrained neural networks. Milwaukee, Wisconsin, USA, pp. 1520–1527.10.23919/ACC.2018.8431275Search in Google Scholar
Chen, Y., Shi, Y., and Zhang, B. (2018c). Optimal control via neural networks: a convex approach. arXiv preprint arXiv:1805.11835.Search in Google Scholar
Chen, M., Li, X., and Zhao, T. (2019). On generalization bounds of a family of recurrent neural networks. arXiv preprint arXiv:1910.12947.Search in Google Scholar
Chen, S., Wu, Z., and Christofides, P.D. (2020a). Decentralized machine-learning-based predictive control of nonlinear processes. Chem. Eng. Res. Des. 162: 45–60, https://doi.org/10.1016/j.cherd.2020.07.019.Search in Google Scholar
Chen, S., Wu, Z., Rincon, D., and Christofides, P.D. (2020b). Machine learning-based distributed model predictive control of nonlinear processes. AIChE J. 66: e17013, https://doi.org/10.1002/aic.17013.Search in Google Scholar
Chen, S., Wu, Z., and Christofides, P.D. (2022a). Machine-learning-based construction of barrier functions and models for safe model predictive control. AIChE J. 68: e17456, https://doi.org/10.1002/aic.17456.Search in Google Scholar
Chen, S., Wu, Z., and Christofides, P.D. (2022b). Statistical machine-learning-based predictive control using barrier functions for process operational safety. Comput. Chem. Eng. 163: 107860, https://doi.org/10.1016/j.compchemeng.2022.107860.Search in Google Scholar
Cheng, F., He, Q.P., and Zhao, J. (2019). A novel process monitoring approach based on variational recurrent autoencoder. Comput. Chem. Eng. 129: 106515, https://doi.org/10.1016/j.compchemeng.2019.106515.Search in Google Scholar
Cho, K., Van Merriënboer, B., Gulcehre, C., Bahdanau, D., Bougares, F., Schwenk, H., and Bengio, Y. (2014). Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP), October 25–29, 2014: learning phrase representations using RNN encoder–decoder for statistical machine translation. Doha, Qatar, pp. 1724–1734.10.3115/v1/D14-1179Search in Google Scholar
Christofides, P.D., Scattolini, R., De La Pena, D.M., and Liu, J. (2013). Distributed model predictive control: a tutorial review and future research directions. Comput. Chem. Eng. 51: 21–41, https://doi.org/10.1016/j.compchemeng.2012.05.011.Search in Google Scholar
Cisse, M., Bojanowski, P., Grave, E., Dauphin, Y., and Usunier, N. (2017). Proceedings of the 34th international conference on machine learning, August 6–11, 2017: parseval networks: improving robustness to adversarial examples. PMLR, Sydney, Australia, pp. 854–863.Search in Google Scholar
Çıtmacı, B., Luo, J., Jang, J.B., Canuso, V., Richard, D., Ren, Y.M., Morales-Guio, C.G., and Christofides, P.D. (2022). Machine learning-based ethylene concentration estimation, real-time optimization and feedback control of an experimental electrochemical reactor. Chem. Eng. Res. Des. 185: 87–107, https://doi.org/10.1016/j.cherd.2022.06.044.
Daoutidis, P., Lee, J.H., Rangarajan, S., Chiang, L., Gopaluni, B., Schweidtmann, A.M., Harjunkoski, I., Mercangöz, M., Mesbah, A., Boukouvala, F., et al. (2023). Machine learning in process systems engineering: challenges and opportunities. Comput. Chem. Eng. 181: 108523, https://doi.org/10.1016/j.compchemeng.2023.108523.
David, S.B., Lu, T., Luu, T., and Pál, D. (2010). Proceedings of the 13th international conference on artificial intelligence and statistics, May 13–15, 2010: impossibility theorems for domain adaptation. JMLR Workshop and Conference Proceedings, Sardinia, Italy, pp. 129–136.
de Giuli, L.B., La Bella, A., and Scattolini, R. (2024). Physics-informed neural network modeling and predictive control of district heating systems. IEEE Trans. Control Syst. Technol. 32: 1182–1195, https://doi.org/10.1109/tcst.2024.3355476.
de Vos, B.D., Jansen, G.E., and Išgum, I. (2023). Stochastic co-teaching for training neural networks with unknown levels of label noise. Sci. Rep. 13: 16875, https://doi.org/10.1038/s41598-023-43864-7.
Decardi-Nelson, B., Alshehri, A.S., Ajagekar, A., and You, F. (2024). Generative AI and process systems engineering: the next frontier. Comput. Chem. Eng. 187: 108723, https://doi.org/10.1016/j.compchemeng.2024.108723.
Degeest, A., Verleysen, M., and Frénay, B. (2019). About filter criteria for feature selection in regression. In: Advances in computational intelligence. Springer, Cham, pp. 579–590, https://doi.org/10.1007/978-3-030-20518-8_48.
Dev, P., Jain, S., Arora, P.K., and Kumar, H. (2021). Machine learning and its impact on control systems: a review. Mater. Today: Proc. 47: 3744–3749, https://doi.org/10.1016/j.matpr.2021.02.281.
Dobbelaere, M.R., Plehiers, P.P., Van de Vijver, R., Stevens, C.V., and Van Geem, K.M. (2021). Machine learning in chemical engineering: strengths, weaknesses, opportunities, and threats. Engineering 7: 1201–1211, https://doi.org/10.1016/j.eng.2021.03.019.
Dong, D. and McAvoy, T. (1996). Nonlinear principal component analysis–based on principal curves and neural networks. Comput. Chem. Eng. 20: 65–78, https://doi.org/10.1016/0098-1354(95)00003-k.
Elgamal, T. (1985). A public key cryptosystem and a signature scheme based on discrete logarithms. IEEE Trans. Inf. Theor. 31: 469–472, https://doi.org/10.1109/tit.1985.1057074.
Ellis, M.J. and Chinde, V. (2020). An encoder–decoder LSTM-based EMPC framework applied to a building HVAC system. Chem. Eng. Res. Des. 160: 508–520, https://doi.org/10.1016/j.cherd.2020.06.008.
Everett, M. (2021). Proceedings of the 60th IEEE conference on decision and control (CDC), December 14–17, 2021: neural network verification in control. IEEE, Austin, TX, USA, pp. 6326–6340, https://doi.org/10.1109/CDC45484.2021.9683154.
Farokhi, F., Shames, I., and Batterham, N. (2017). Secure and private control using semi-homomorphic encryption. Control Eng. Pract. 67: 13–20, https://doi.org/10.1016/j.conengprac.2017.07.004.
Federer, H. (2014). Geometric measure theory. Springer Berlin Heidelberg, Heidelberg.
Ferramosca, A., Limon, D., González, A.H., Odloak, D., and Camacho, E.F. (2010). MPC for tracking zone regions. J. Process Control 20: 506–516, https://doi.org/10.1016/j.jprocont.2010.02.005.
Gal, Y. and Ghahramani, Z. (2016a). Proceedings of the 33rd international conference on machine learning, June 19–24, 2016: dropout as a Bayesian approximation: representing model uncertainty in deep learning. PMLR, New York, USA, pp. 1050–1059.
Gal, Y. and Ghahramani, Z. (2016b). A theoretically grounded application of dropout in recurrent neural networks. In: Advances in neural information processing systems, Vol. 29. Curran Associates, Inc, Red Hook, NY.
Gao, Y., Yan, S., Zhou, J., Cannon, M., Abate, A., and Johansson, K.H. (2024). Proceedings of the 6th annual learning for dynamics & control conference, July 15–17, 2024: learning-based rigid tube model predictive control. PMLR, Oxford, UK, pp. 492–503.
García, J. and Fernández, F. (2015). A comprehensive survey on safe reinforcement learning. J. Mach. Learn. Res. 16: 1437–1480.
Gentry, C., Halevi, S., and Smart, N.P. (2012). Proceedings of the annual cryptology conference (CRYPTO 2012), August 19–23, 2012: homomorphic evaluation of the AES circuit. Springer Berlin Heidelberg, Santa Barbara, CA, USA, pp. 850–867, https://doi.org/10.1007/978-3-642-32009-5_49.
Golowich, N., Rakhlin, A., and Shamir, O. (2018). Proceedings of the 31st conference on learning theory, July 6–9, 2018: size-independent sample complexity of neural networks. PMLR, Stockholm, Sweden, pp. 297–299.
González, A.H. and Odloak, D. (2009). A stable MPC with zone control. J. Process Control 19: 110–122, https://doi.org/10.1016/j.jprocont.2008.01.003.
Gonzalez, C., Asadi, H., Kooijman, L., and Lim, C.P. (2023). Neural networks for fast optimisation in model predictive control: a review. arXiv preprint arXiv:2309.02668.
Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., and Bengio, Y. (2014). Generative adversarial nets. In: Advances in neural information processing systems, Vol. 27. Curran Associates, Inc, Red Hook, NY.
Gouk, H., Frank, E., Pfahringer, B., and Cree, M.J. (2021). Regularisation of neural networks by enforcing Lipschitz continuity. Mach. Learn. 110: 393–416, https://doi.org/10.1007/s10994-020-05929-w.
Grimstad, B. and Andersson, H. (2019). ReLU networks as surrogate models in mixed-integer linear programs. Comput. Chem. Eng. 131: 106580, https://doi.org/10.1016/j.compchemeng.2019.106580.
Gulrajani, I., Ahmed, F., Arjovsky, M., Dumoulin, V., and Courville, A.C. (2017). Improved training of Wasserstein GANs. In: Advances in neural information processing systems, Vol. 30. Curran Associates, Inc, Red Hook, NY.
Guo, J., Du, W., and Nascu, I. (2020). Adaptive modeling of fixed-bed reactors with multicycle and multimode characteristics based on transfer learning and just-in-time learning. Ind. Eng. Chem. Res. 59: 6629–6637, https://doi.org/10.1021/acs.iecr.9b06668.
Han, B., Yao, Q., Yu, X., Niu, G., Xu, M., Hu, W., Tsang, I., and Sugiyama, M. (2018). Co-teaching: robust training of deep neural networks with extremely noisy labels. In: Advances in neural information processing systems, Vol. 31. Curran Associates, Inc, Red Hook, NY.
Han, X., Zhang, L., Zhou, K., and Wang, X. (2019). ProGAN: protein solubility generative adversarial nets for data augmentation in DNN framework. Comput. Chem. Eng. 131: 106533, https://doi.org/10.1016/j.compchemeng.2019.106533.
Harshvardhan, G., Gourisaria, M.K., Pandey, M., and Rautaray, S.S. (2020). A comprehensive survey and analysis of generative models in machine learning. Comput. Sci. Rev. 38: 100285, https://doi.org/10.1016/j.cosrev.2020.100285.
Hassanpour, H., Corbett, B., and Mhaskar, P. (2020). Integrating dynamic neural network models with principal component analysis for adaptive model predictive control. Chem. Eng. Res. Des. 161: 26–37, https://doi.org/10.1016/j.cherd.2020.03.031.
He, R., Li, X., Chen, G., Chen, G., and Liu, Y. (2020). Generative adversarial network-based semi-supervised learning for real-time risk warning of process industries. Expert Syst. Appl. 150: 113244, https://doi.org/10.1016/j.eswa.2020.113244.
Hein, M. and Andriushchenko, M. (2017). Formal guarantees on the robustness of a classifier against adversarial manipulation. In: Advances in neural information processing systems, Vol. 30. Curran Associates, Inc, Red Hook, NY.
Hewing, L., Wabersich, K.P., Menner, M., and Zeilinger, M.N. (2020). Learning-based model predictive control: toward safe learning in control. Annu. Rev. Control Robot. Auton. Syst. 3: 269–296, https://doi.org/10.1146/annurev-control-090419-075625.
Hinton, G.E., Srivastava, N., Krizhevsky, A., Sutskever, I., and Salakhutdinov, R.R. (2012). Improving neural networks by preventing co-adaptation of feature detectors. arXiv preprint arXiv:1207.0580.
Hirtreiter, E., Schulze Balhorn, L., and Schweidtmann, A.M. (2024). Toward automatic generation of control structures for process flow diagrams with large language models. AIChE J. 70: e18259, https://doi.org/10.1002/aic.18259.
Hochreiter, S. and Schmidhuber, J. (1997). Long short-term memory. Neural Comput. 9: 1735–1780, https://doi.org/10.1162/neco.1997.9.8.1735.
Hoi, S.C., Sahoo, D., Lu, J., and Zhao, P. (2021). Online learning: a comprehensive survey. Neurocomputing 459: 249–289, https://doi.org/10.1016/j.neucom.2021.04.112.
Hoskins, J.C. and Himmelblau, D.M. (1988). Artificial neural network models of knowledge representation in chemical engineering. Comput. Chem. Eng. 12: 881–890, https://doi.org/10.1016/0098-1354(88)87015-7.
Hu, C. and Wu, Z. (2024). Model predictive control of switched nonlinear systems using online machine learning. Chem. Eng. Res. Des. 209: 221–236, https://doi.org/10.1016/j.cherd.2024.08.001.
Hu, G. and You, F. (2023). Multi-zone building control with thermal comfort constraints under disjunctive uncertainty using data-driven robust model predictive control. Adv. Appl. Energy 9: 100124, https://doi.org/10.1016/j.adapen.2023.100124.
Hu, C., Cao, Y., and Wu, Z. (2023a). Online machine learning modeling and predictive control of nonlinear systems with scheduled mode transitions. AIChE J. 69: e17882, https://doi.org/10.1002/aic.17882.
Hu, C., Chen, S., and Wu, Z. (2023b). Economic model predictive control of nonlinear systems using online learning of neural networks. Processes 11: 342, https://doi.org/10.3390/pr11020342.
Huang, B. and Kadali, R. (2008). System identification: conventional approach. In: Dynamic modeling, predictive control and performance monitoring: a data-driven subspace approach. Springer London, London, pp. 9–29, https://doi.org/10.1007/978-1-84800-233-3_2.
Huang, J., Gretton, A., Borgwardt, K., Schölkopf, B., and Smola, A. (2006). Correcting sample selection bias by unlabeled data. In: Advances in neural information processing systems, Vol. 19. MIT Press, Cambridge, MA, https://doi.org/10.7551/mitpress/7503.003.0080.
Huang, Z., Liu, J., and Huang, B. (2023). Model predictive control of agro-hydrological systems based on a two-layer neural network modeling framework. Int. J. Adapt. Control Signal Process. 37: 1536–1558, https://doi.org/10.1002/acs.3586.
Jalanko, M., Sanchez, Y., Mahalec, V., and Mhaskar, P. (2021). Adaptive system identification of industrial ethylene splitter: a comparison of subspace identification and artificial neural networks. Comput. Chem. Eng. 147: 107240, https://doi.org/10.1016/j.compchemeng.2021.107240.
Kadakia, Y.A., Abdullah, F., Alnajdi, A., and Christofides, P.D. (2024a). Encrypted distributed model predictive control of nonlinear processes. Control Eng. Pract. 145: 105874, https://doi.org/10.1016/j.conengprac.2024.105874.
Kadakia, Y.A., Abdullah, F., Alnajdi, A., and Christofides, P.D. (2024b). Integrating dynamic economic optimization and encrypted control for cyber-resilient operation of nonlinear processes. AIChE J. 70: e18509, https://doi.org/10.1002/aic.18509.
Kadakia, Y.A., Suryavanshi, A., Alnajdi, A., Abdullah, F., and Christofides, P.D. (2024c). Proceedings of the 2024 American control conference, July 10–12, 2024: a two-tier encrypted control architecture for enhanced cybersecurity of nonlinear processes. Toronto, Canada, pp. 4452–4459, https://doi.org/10.23919/ACC60939.2024.10644813.
Kadakia, Y.A., Suryavanshi, A., Alnajdi, A., Abdullah, F., and Christofides, P.D. (2024d). Integrating machine learning detection and encrypted control for enhanced cybersecurity of nonlinear processes. Comput. Chem. Eng. 180: 108498, https://doi.org/10.1016/j.compchemeng.2023.108498.
Karagiannopoulos, M., Anyfantis, D., Kotsiantis, S.B., and Pintelas, P.E. (2007). Proceedings of the 8th Hellenic European research on computer mathematics & its applications, September 20–22, 2007: feature selection for regression problems. Athens, Greece.
Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., and Yang, L. (2021). Physics-informed machine learning. Nat. Rev. Phys. 3: 422–440, https://doi.org/10.1038/s42254-021-00314-5.
Kassa, A.M. and Kassa, S.M. (2016). A branch-and-bound multi-parametric programming approach for non-convex multilevel optimization with polyhedral constraints. J. Global Optim. 64: 745–764, https://doi.org/10.1007/s10898-015-0341-0.
Katz, J., Pappas, I., Avraamidou, S., and Pistikopoulos, E.N. (2020). Integrating deep learning models and multiparametric programming. Comput. Chem. Eng. 136: 106801, https://doi.org/10.1016/j.compchemeng.2020.106801.
Kenefake, D. and Pistikopoulos, E.N. (2022). Proceedings of the 32nd European symposium on computer-aided process engineering, June 12–15, 2022: PPOPT-multiparametric solver for explicit MPC. Toulouse, France, pp. 1273–1278, https://doi.org/10.1016/B978-0-323-95879-0.50213-7.
Khan, N. and Ammar Taqvi, S.A. (2023). Machine learning an intelligent approach in process industries: a perspective and overview. ChemBioEng Rev. 10: 195–221, https://doi.org/10.1002/cben.202200030.
Kim, Y. and Kim, J.W. (2022). Safe model-based reinforcement learning for nonlinear optimal control with state and input constraints. AIChE J. 68: e17601, https://doi.org/10.1002/aic.17601.
Kim, J., Lee, C., Shim, H., Cheon, J.H., Kim, A., Kim, M., and Song, Y. (2016). Encrypting controller using fully homomorphic encryption for security of cyber-physical systems. IFAC-PapersOnLine 49: 175–180, https://doi.org/10.1016/j.ifacol.2016.10.392.
Kim, S., Noh, J., Gu, G.H., Aspuru-Guzik, A., and Jung, Y. (2020). Generative adversarial networks for crystal structure prediction. ACS Cent. Sci. 6: 1412–1420, https://doi.org/10.1021/acscentsci.0c00426.
Kingma, D.P. and Welling, M. (2013). Auto-encoding variational Bayes. arXiv preprint arXiv:1312.6114.
Koiran, P. and Sontag, E.D. (1998). Vapnik-Chervonenkis dimension of recurrent neural networks. Discrete Appl. Math. 86: 63–79, https://doi.org/10.1016/s0166-218x(98)00014-6.
Kokotović, P., Khalil, H.K., and O’Reilly, J. (1999). Singular perturbation methods in control: analysis and design. Society for Industrial and Applied Mathematics, Chap. 3, pp. 93–156, https://doi.org/10.1137/1.9781611971118.
Kramer, M.A. (1991). Nonlinear principal component analysis using autoassociative neural networks. AIChE J. 37: 233–243, https://doi.org/10.1002/aic.690370209.
Kvasnica, M., Grieder, P., Baotić, M., and Morari, M. (2004). Proceedings of the 7th international workshop on hybrid systems: computation and control (HSCC 2004), March 25–27, 2004: multi-parametric toolbox (MPT). Philadelphia, PA, USA, pp. 448–462, https://doi.org/10.1007/978-3-540-24743-2_30.
Lanzetti, N., Lian, Y.Z., Cortinovis, A., Dominguez, L., Mercangöz, M., and Jones, C. (2019). Proceedings of the 18th European control conference (ECC), June 25–28, 2019: recurrent neural network based MPC for process industries. IEEE, Naples, Italy, pp. 1005–1010, https://doi.org/10.23919/ECC.2019.8795809.
Lee, Y.S. and Chen, J. (2023). Developing semi-supervised latent dynamic variational autoencoders to enhance prediction performance of product quality. Chem. Eng. Sci. 265: 118192, https://doi.org/10.1016/j.ces.2022.118192.
Lee, J.H., Shin, J., and Realff, M.J. (2018). Machine learning: overview of the recent progresses and implications for the process systems engineering field. Comput. Chem. Eng. 114: 111–121, https://doi.org/10.1016/j.compchemeng.2017.10.008.
Lee, S., Kwak, M., Tsui, K.-L., and Kim, S.B. (2019). Process monitoring using variational autoencoder for high-dimensional nonlinear processes. Eng. Appl. Artif. Intell. 83: 13–27, https://doi.org/10.1016/j.engappai.2019.04.013.
Lee, N., Kim, H., Jung, J., Park, K.-H., Linga, P., and Seo, Y. (2022). Time series prediction of hydrate dynamics on flow assurance using PCA and recurrent neural networks with iterative transfer learning. Chem. Eng. Sci. 263: 118111, https://doi.org/10.1016/j.ces.2022.118111.
Li, J., Cheng, K., Wang, S., Morstatter, F., Trevino, R.P., Tang, J., and Liu, H. (2017). Feature selection: a data perspective. ACM Comput. Surv. 50: 1–45, https://doi.org/10.1145/3136625.
Limon, D., Calliess, J., and Maciejowski, J.M. (2017). Learning-based nonlinear model predictive control. IFAC-PapersOnLine 50: 7769–7776, https://doi.org/10.1016/j.ifacol.2017.08.1050.
Lu, J., Cao, Z., Zhao, C., and Gao, F. (2019). 110th anniversary: an overview on learning-based model predictive control for batch processes. Ind. Eng. Chem. Res. 58: 17164–17173, https://doi.org/10.1021/acs.iecr.9b02370.
Luo, J., Canuso, V., Jang, J.B., Wu, Z., Morales-Guio, C.G., and Christofides, P.D. (2022). Machine learning-based operational modeling of an electrochemical reactor: handling data variability and improving empirical models. Ind. Eng. Chem. Res. 61: 8399–8410, https://doi.org/10.1021/acs.iecr.1c04176.
Luo, J., Çıtmacı, B., Jang, J.B., Abdullah, F., Morales-Guio, C.G., and Christofides, P.D. (2023). Machine learning-based predictive control using on-line model linearization: application to an experimental electrochemical reactor. Chem. Eng. Res. Des. 197: 721–737, https://doi.org/10.1016/j.cherd.2023.08.017.
Mahmood, M. and Mhaskar, P. (2008). Enhanced stability regions for model predictive control of nonlinear process systems. AIChE J. 54: 1487–1498, https://doi.org/10.1002/aic.11458.
Mahmood, F., Govindan, R., Bermak, A., Yang, D., and Al-Ansari, T. (2023). Data-driven robust model predictive control for greenhouse temperature control and energy utilisation assessment. Appl. Energy 343: 121190, https://doi.org/10.1016/j.apenergy.2023.121190.
Mansour, Y., Mohri, M., and Rostamizadeh, A. (2008). Domain adaptation with multiple sources. In: Advances in neural information processing systems, Vol. 21. Curran Associates, Inc, Red Hook, NY.
Mansour, Y., Mohri, M., and Rostamizadeh, A. (2009). Domain adaptation: learning bounds and algorithms. arXiv preprint arXiv:0902.3430.
Manzano, J.M., Limon, D., de la Peña, D.M., and Calliess, J.-P. (2020). Robust learning-based MPC for nonlinear constrained systems. Automatica 117: 108948, https://doi.org/10.1016/j.automatica.2020.108948.
Mayne, D.Q., Rawlings, J.B., Rao, C.V., and Scokaert, P.O. (2000). Constrained model predictive control: stability and optimality. Automatica 36: 789–814, https://doi.org/10.1016/s0005-1098(99)00214-9.
Meng, F., Shen, X., and Karimi, H.R. (2022). Emerging methodologies in stability and optimization problems of learning-based nonlinear model predictive control: a survey. Int. J. Circ. Theor. Appl. 50: 4146–4170, https://doi.org/10.1002/cta.3370.
Mesbah, A., Wabersich, K.P., Schoellig, A.P., Zeilinger, M.N., Lucia, S., Badgwell, T.A., and Paulson, J.A. (2022). Proceedings of the 2022 American control conference, June 8–10, 2022: fusion of machine learning and MPC under uncertainty: what advances are on the horizon? IEEE, Atlanta, GA, USA, pp. 342–357, https://doi.org/10.23919/ACC53348.2022.9867643.
Mhaskar, P., El-Farra, N.H., and Christofides, P.D. (2006). Stabilization of nonlinear systems with state and control constraints using Lyapunov-based predictive control. Syst. Control Lett. 55: 650–659, https://doi.org/10.1016/j.sysconle.2005.09.014.
Mishra, S. and Molinaro, R. (2022). Estimates on the generalization error of physics-informed neural networks for approximating a class of inverse problems for PDEs. IMA J. Numer. Anal. 42: 981–1022, https://doi.org/10.1093/imanum/drab032.
Mishra, S. and Molinaro, R. (2023). Estimates on the generalization error of physics-informed neural networks for approximating PDEs. IMA J. Numer. Anal. 43: 1–43, https://doi.org/10.1093/imanum/drab093.
Mohri, M., Rostamizadeh, A., and Talwalkar, A. (2018). Foundations of machine learning. MIT Press, Cambridge, MA.
Mowbray, M., Vallerio, M., Perez-Galvan, C., Zhang, D., Chanona, A.D.R., and Navarro-Brull, F.J. (2022). Industrial data science–a review of machine learning applications for chemical and process industries. React. Chem. Eng. 7: 1471–1509, https://doi.org/10.1039/d1re00541c.
Murray-Smith, R., Sbarbaro, D., Rasmussen, C.E., and Girard, A. (2003). Adaptive, cautious, predictive control with Gaussian process priors. IFAC Proc. Vol. 36: 1155–1160, https://doi.org/10.1016/s1474-6670(17)34915-7.
Na, J., Jeon, K., and Lee, W.B. (2018). Toxic gas release modeling for real-time analysis using variational autoencoder with convolutional neural networks. Chem. Eng. Sci. 181: 68–78, https://doi.org/10.1016/j.ces.2018.02.008.
Nagy, Z.K. (2007). Model based control of a yeast fermentation bioreactor using optimally designed artificial neural networks. Chem. Eng. J. 127: 95–109, https://doi.org/10.1016/j.cej.2006.10.015.
Nascimento, C.A.O., Giudici, R., and Guardani, R. (2000). Neural network based approach for optimization of industrial chemical processes. Comput. Chem. Eng. 24: 2303–2314, https://doi.org/10.1016/s0098-1354(00)00587-1.
Neyshabur, B., Bhojanapalli, S., and Srebro, N. (2017). A PAC-Bayesian approach to spectrally-normalized margin bounds for neural networks. arXiv preprint arXiv:1707.09564.
Nian, R., Liu, J., and Huang, B. (2020). A review on reinforcement learning: introduction and applications in industrial process control. Comput. Chem. Eng. 139: 106886, https://doi.org/10.1016/j.compchemeng.2020.106886.
Ning, C. and You, F. (2021). Online learning based risk-averse stochastic MPC of constrained linear uncertain systems. Automatica 125: 109402, https://doi.org/10.1016/j.automatica.2020.109402.
Norouzi, A., Heidarifar, H., Borhan, H., Shahbakhti, M., and Koch, C.R. (2023). Integrating machine learning and model predictive control for automotive applications: a review and future directions. Eng. Appl. Artif. Intell. 120: 105878, https://doi.org/10.1016/j.engappai.2023.105878.
Nouira, A., Sokolovska, N., and Crivello, J.-C. (2018). CrystalGAN: learning to discover crystallographic structures with generative adversarial networks. arXiv preprint arXiv:1810.11203.
Oberdieck, R., Diangelakis, N.A., Papathanasiou, M.M., Nascu, I., and Pistikopoulos, E.N. (2016). POP–parametric optimization toolbox. Ind. Eng. Chem. Res. 55: 8979–8991, https://doi.org/10.1021/acs.iecr.6b01913.
Paillier, P. (1999). Proceedings of the international conference on the theory and applications of cryptographic techniques, May 2–6, 1999: public-key cryptosystems based on composite degree residuosity classes. Springer, Prague, Czech Republic, pp. 223–238, https://doi.org/10.1007/3-540-48910-X_16.
Pan, S.J. and Yang, Q. (2009). A survey on transfer learning. IEEE Trans. Knowl. Data Eng. 22: 1345–1359, https://doi.org/10.1109/tkde.2009.191.
Pan, I., Mason, L.R., and Matar, O.K. (2022). Data-centric engineering: integrating simulation, machine learning and statistics. Challenges and opportunities. Chem. Eng. Sci. 249: 117271, https://doi.org/10.1016/j.ces.2021.117271.
Pappas, I., Diangelakis, N.A., and Pistikopoulos, E.N. (2021). The exact solution of multiparametric quadratically constrained quadratic programming problems. J. Global Optim. 79: 59–85, https://doi.org/10.1007/s10898-020-00933-9.
Parker, S., Wu, Z., and Christofides, P.D. (2023). Cybersecurity in process control, operations, and supply chain. Comput. Chem. Eng. 171: 108169, https://doi.org/10.1016/j.compchemeng.2023.108169.
Patel, R., Bhartiya, S., and Gudi, R. (2023). Optimal temperature trajectory for tubular reactor using physics informed neural networks. J. Process Control 128: 103003, https://doi.org/10.1016/j.jprocont.2023.103003.
Pistikopoulos, E.N., Diangelakis, N.A., and Oberdieck, R. (2020). Multi-parametric optimization and control. John Wiley & Sons, London, https://doi.org/10.1002/9781119265245.
Pravin, P., Tan, J.Z.M., Yap, K.S., and Wu, Z. (2022). Hyperparameter optimization strategies for machine learning-based stochastic energy efficient scheduling in cyber-physical production systems. Digit. Chem. Eng. 4: 100047, https://doi.org/10.1016/j.dche.2022.100047.
Qin, R. and Zhao, J. (2022). High-efficiency generative adversarial network model for chemical process fault diagnosis. IFAC-PapersOnLine 55: 732–737, https://doi.org/10.1016/j.ifacol.2022.07.531.
Raissi, M., Perdikaris, P., and Karniadakis, G.E. (2019). Physics-informed neural networks: a deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. J. Comput. Phys. 378: 686–707, https://doi.org/10.1016/j.jcp.2018.10.045.
Raissi, M., Yazdani, A., and Karniadakis, G.E. (2020). Hidden fluid mechanics: learning velocity and pressure fields from flow visualizations. Science 367: 1026–1030, https://doi.org/10.1126/science.aaw4741.
Rakhlin, A., Sridharan, K., and Tewari, A. (2010). Online learning: random averages, combinatorial parameters, and learnability. In: Advances in neural information processing systems, Vol. 23. Curran Associates, Inc, Red Hook, NY.
Ren, Y., Alhajeri, M.S., Luo, J., Chen, S., Abdullah, F., Wu, Z., and Christofides, P.D. (2022). A tutorial review of neural network modeling approaches for model predictive control. Comput. Chem. Eng.: 107956, https://doi.org/10.1016/j.compchemeng.2022.107956.
Rendall, R., Castillo, I., Schmidt, A., Chin, S.-T., Chiang, L.H., and Reis, M. (2019). Wide spectrum feature selection (WiSe) for regression model building. Comput. Chem. Eng. 121: 99–110, https://doi.org/10.1016/j.compchemeng.2018.10.005.
Rijmen, V. and Daemen, J. (2001). Advanced encryption standard. In: Proceedings of federal information processing standards publications, Vol. 19. National Institute of Standards and Technology, p. 22.
Robinet, F., Parera, C., Hundt, C., and Frank, R. (2022). Proceedings of the 2022 IEEE/CVF winter conference on applications of computer vision, January 4–8, 2022: weakly-supervised free space estimation through stochastic co-teaching. Waikoloa, HI, USA, pp. 618–627, https://doi.org/10.1109/WACVW54805.2022.00068.
Rogers, A.W., Cardenas, I.O.S., Del Rio-Chanona, E.A., and Zhang, D. (2023). Investigating physics-informed neural networks for bioprocess hybrid model construction. In: Computer aided chemical engineering, Vol. 52. Elsevier, Amsterdam, pp. 83–88, https://doi.org/10.1016/B978-0-443-15274-0.50014-7.
Romdlony, M.Z. and Jayawardhana, B. (2016). Stabilization with guaranteed safety using control Lyapunov–barrier function. Automatica 66: 39–47, https://doi.org/10.1016/j.automatica.2015.12.011.
Sangoi, E., Quaglio, M., Bezzo, F., and Galvanin, F. (2022). Optimal design of experiments based on artificial neural network classifiers for fast kinetic model recognition. In: Computer aided chemical engineering, Vol. 49. Elsevier, Amsterdam, pp. 817–822, https://doi.org/10.1016/B978-0-323-85159-6.50136-6.
Sangoi, E., Quaglio, M., Bezzo, F., and Galvanin, F. (2024). An optimal experimental design framework for fast kinetic model identification based on artificial neural networks. Comput. Chem. Eng. 187: 108752, https://doi.org/10.1016/j.compchemeng.2024.108752.
Saraswathi K, S., Bhosale, H., Ovhal, P., Parlikkad Rajan, N., and Valadi, J.K. (2020). Random forest and autoencoder data-driven models for prediction of dispersed-phase holdup and drop size in rotating disc contactors. Ind. Eng. Chem. Res. 60: 425–435, https://doi.org/10.1021/acs.iecr.0c04149.
Scattolini, R. (2009). Architectures for distributed and hierarchical model predictive control–a review. J. Process Control 19: 723–731, https://doi.org/10.1016/j.jprocont.2009.02.003.
Schilter, O., Vaucher, A., Schwaller, P., and Laino, T. (2023). Designing catalysts with deep generative models and computational data. A case study for Suzuki cross coupling reactions. Digit. Discov. 2: 728–735, https://doi.org/10.1039/d2dd00125j.
Schlüter, N., Binfet, P., and Darup, M.S. (2023). A brief survey on encrypted control: from the first to the second generation and beyond. Annu. Rev. Control 56: 100913, https://doi.org/10.1016/j.arcontrol.2023.100913.
Schweidtmann, A.M., Esche, E., Fischer, A., Kloft, M., Repke, J.-U., Sager, S., and Mitsos, A. (2021). Machine learning in chemical engineering: a perspective. Chem. Ing. Tech. 93: 2029–2039, https://doi.org/10.1002/cite.202100083.
Serrurier, M., Mamalet, F., González-Sanz, A., Boissin, T., Loubes, J.-M., and Del Barrio, E. (2021). Proceedings of the 2021 IEEE/CVF conference on computer vision and pattern recognition, June 20–25, 2021: achieving robustness in classification using optimal transport with hinge regularization. Nashville, TN, USA, pp. 505–514, https://doi.org/10.1109/CVPR46437.2021.00057.
Settles, B. (2009). Active learning literature survey. Computer Sciences Technical Report 1648. University of Wisconsin–Madison.
Shalev-Shwartz, S. (2012). Online learning and online convex optimization. Found. Trends Mach. Learn. 4: 107–194, https://doi.org/10.1561/2200000018.
Shang, C. and You, F. (2019). Data analytics and machine learning for smart process manufacturing: recent advances and perspectives in the big data era. Engineering 5: 1010–1016, https://doi.org/10.1016/j.eng.2019.01.019.
Sitapure, N. and Kwon, J.S.-I. (2022). Neural network-based model predictive control for thin-film chemical deposition of quantum dots using data from a multiscale simulation. Chem. Eng. Res. Des. 183: 595–607, https://doi.org/10.1016/j.cherd.2022.05.041.
Soloperto, R., Müller, M.A., and Allgöwer, F. (2022). Guaranteed closed-loop learning in model predictive control. IEEE Trans. Automat. Control 68: 991–1006, https://doi.org/10.1109/tac.2022.3172453.
Sontag, E.D. (1998a). A learning result for continuous-time recurrent neural networks. Syst. Control Lett. 34: 151–158, https://doi.org/10.1016/s0167-6911(98)00006-1.
Sontag, E.D. (1998b). VC dimension of neural networks. NATO ASI Ser. F Comput. Syst. Sci. 168: 69–96.
Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., and Salakhutdinov, R. (2014). Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15: 1929–1958.
Stewart, B.T., Venkat, A.N., Rawlings, J.B., Wright, S.J., and Pannocchia, G. (2010). Cooperative distributed model predictive control. Syst. Control Lett. 59: 460–469, https://doi.org/10.1016/j.sysconle.2010.06.005.
Su, H.T., McAvoy, T.J., and Werbos, P. (1992). Long-term predictions of chemical processes using recurrent neural networks: a parallel training approach. Ind. Eng. Chem. Res. 31: 1338–1352, https://doi.org/10.1021/ie00005a014.
Subraveti, S.G., Li, Z., Prasad, V., and Rajendran, A. (2022). Physics-based neural networks for simulation and synthesis of cyclic adsorption processes. Ind. Eng. Chem. Res. 61: 4095–4113, https://doi.org/10.1021/acs.iecr.1c04731.
Suryavanshi, A., Alnajdi, A., Alhajeri, M., Abdullah, F., and Christofides, P.D. (2023). Encrypted model predictive control design for security to cyberattacks. AIChE J. 69: e18104, https://doi.org/10.1002/aic.18104.
Tan, W.G.Y. and Wu, Z. (2024). Robust machine learning modeling for predictive control using Lipschitz-constrained neural networks. Comput. Chem. Eng. 180: 108466, https://doi.org/10.1016/j.compchemeng.2023.108466.
Tan, G.Y., Xiao, M., Wu, G., and Wu, Z. (2024a). Proceedings of the 2024 American control conference, July 10–12, 2024: machine learning modeling of nonlinear processes with Lyapunov stability guarantees. Toronto, Canada, pp. 528–535, https://doi.org/10.23919/ACC60939.2024.10644912.
Tan, W.G.Y., Xiao, M., and Wu, Z. (2024b). Robust reduced-order machine learning modeling of high-dimensional nonlinear processes using noisy data. Digit. Chem. Eng. 11: 100145, https://doi.org/10.1016/j.dche.2024.100145.
Tang, W. (2023). Synthesis of data-driven nonlinear state observers using Lipschitz-bounded neural networks. arXiv preprint arXiv:2310.03187, https://doi.org/10.23919/ACC60939.2024.10644627.
Tang, W. and Daoutidis, P. (2022). Proceedings of the 2022 American control conference, June 8–10, 2022: data-driven control: overview and perspectives. IEEE, Atlanta, Georgia, USA, pp. 1048–1064, https://doi.org/10.23919/ACC53348.2022.9867266.
Terzi, E., Bonassi, F., Farina, M., and Scattolini, R. (2021). Learning model predictive control with long short-term memory networks. Int. J. Robust Nonlinear Control 31: 8877–8896, https://doi.org/10.1002/rnc.5519.
Thebelt, A., Wiebe, J., Kronqvist, J., Tsay, C., and Misener, R. (2022). Maximizing information from chemical engineering data sets: applications to machine learning. Chem. Eng. Sci. 252: 117469, https://doi.org/10.1016/j.ces.2022.117469.
Tian, Y., Pappas, I., Burnak, B., Katz, J., and Pistikopoulos, E.N. (2021). Simultaneous design & control of a reactive distillation system–a parametric optimization & control approach. Chem. Eng. Sci. 230: 116232, https://doi.org/10.1016/j.ces.2020.116232.
Tian, J., Han, D., Li, M., and Shi, P. (2022). A multi-source information transfer learning method with subdomain adaptation for cross-domain fault diagnosis. Knowl. Base Syst. 243: 108466, https://doi.org/10.1016/j.knosys.2022.108466.
Vapnik, V., Levin, E., and Le Cun, Y. (1994). Measuring the VC-dimension of a learning machine. Neural Comput. 6: 851–876, https://doi.org/10.1162/neco.1994.6.5.851.
Wächter, A. and Biegler, L.T. (2006). On the implementation of an interior-point filter line-search algorithm for large-scale nonlinear programming. Math. Program. 106: 25–57, https://doi.org/10.1007/s10107-004-0559-y.
Wang, R. and Manchester, I. (2023). Proceedings of the 40th international conference on machine learning, July 23–29, 2023: direct parameterization of Lipschitz-bounded deep networks. PMLR, Hawaii, USA, pp. 36093–36110.
Wang, Y. and Wu, Z. (2024a). Control Lyapunov-barrier function-based safe reinforcement learning for nonlinear optimal control. AIChE J. 70: e18306, https://doi.org/10.1002/aic.18306.
Wang, Y. and Wu, Z. (2024b). Physics-informed reinforcement learning for optimal control of nonlinear systems. AIChE J. 70: e18542, https://doi.org/10.1002/aic.18542.
Wang, Z. and Wu, Z. (2024c). Foundation model for chemical process modeling: meta-learning with physics-informed adaptation. arXiv preprint arXiv:2405.11752.
Wang, X., Ayachi, S., Corbett, B., and Mhaskar, P. (in press). Integrating autoencoder with Koopman operator to design a linear data-driven model predictive controller. Can. J. Chem. Eng.
Wang, Z., Dai, Z., Póczos, B., and Carbonell, J. (2019). Proceedings of the 2019 IEEE/CVF conference on computer vision and pattern recognition, June 15–20, 2019: characterizing and avoiding negative transfer. Long Beach, CA, USA, pp. 11293–11302, https://doi.org/10.1109/CVPR.2019.01155.
Wang, W., Wang, Y., Tian, Y., and Wu, Z. (2024a). Explicit machine learning-based model predictive control of nonlinear processes via multi-parametric programming. Comput. Chem. Eng. 186: 108689, https://doi.org/10.1016/j.compchemeng.2024.108689.
Wang, Z., Yu, D., and Wu, Z. (2025). Real-time machine-learning-based optimization using input convex long short-term memory network. Appl. Energy 377: 124472, https://doi.org/10.1016/j.apenergy.2024.124472.
Wang, W., Zhang, H., Wang, Y., Tian, Y., and Wu, Z. (2024b). Fast explicit machine learning-based model predictive control using input convex neural networks. Ind. Eng. Chem. Res. 63: 17279–17293, https://doi.org/10.1021/acs.iecr.4c02257.
Wei, C. and Ma, T. (2019). Data-dependent sample complexity of deep neural networks via Lipschitz augmentation. In: Advances in neural information processing systems, Vol. 32. Curran Associates, Inc, Red Hook, NY.
Wieland, P. and Allgöwer, F. (2007). Constructive safety using control barrier functions. IFAC Proc. Vol. 40: 462–467, https://doi.org/10.3182/20070822-3-za-2920.00076.
Wong, W., Chee, E., Li, J., and Wang, X. (2018). Recurrent neural network-based model predictive control for continuous pharmaceutical manufacturing. Mathematics 6: 242, https://doi.org/10.3390/math6110242.
Wu, Z. and Christofides, P.D. (2020). Control Lyapunov-barrier function-based predictive control of nonlinear processes using machine learning modeling. Comput. Chem. Eng. 134: 106706, https://doi.org/10.1016/j.compchemeng.2019.106706.
Wu, Z., Durand, H., and Christofides, P.D. (2018). Safe economic model predictive control of nonlinear systems. Syst. Control Lett. 118: 69–76, https://doi.org/10.1016/j.sysconle.2018.05.013.
Wu, Z., Albalawi, F., Zhang, Z., Zhang, J., Durand, H., and Christofides, P.D. (2019a). Control Lyapunov-barrier function-based model predictive control of nonlinear systems. Automatica 109: 108508, https://doi.org/10.1016/j.automatica.2019.108508.
Wu, Z., Rincon, D., and Christofides, P.D. (2019b). Real-time adaptive machine-learning-based predictive control of nonlinear processes. Ind. Eng. Chem. Res. 59: 2275–2290, https://doi.org/10.1021/acs.iecr.9b03055.
Wu, Z., Tran, A., Rincon, D., and Christofides, P.D. (2019c). Machine learning-based predictive control of nonlinear processes. Part I: theory. AIChE J. 65: e16729, https://doi.org/10.1002/aic.16729.
Wu, Z., Tran, A., Rincon, D., and Christofides, P.D. (2019d). Machine-learning-based predictive control of nonlinear processes. Part II: computational implementation. AIChE J. 65: e16734, https://doi.org/10.1002/aic.16734.
Wu, Z., Rincon, D., and Christofides, P.D. (2020). Process structure-based recurrent neural network modeling for model predictive control of nonlinear processes. J. Process Control 89: 74–84, https://doi.org/10.1016/j.jprocont.2020.03.013.
Wu, Z., Luo, J., Rincon, D., and Christofides, P.D. (2021a). Machine learning-based predictive control using noisy data: evaluating performance and robustness via a large-scale process simulator. Chem. Eng. Res. Des. 168: 275–287, https://doi.org/10.1016/j.cherd.2021.02.011.
Wu, Z., Rincon, D., Gu, Q., and Christofides, P.D. (2021b). Statistical machine learning in model predictive control of nonlinear processes. Mathematics 9: 1912, https://doi.org/10.3390/math9161912.
Wu, Z., Rincon, D., Luo, J., and Christofides, P.D. (2021c). Machine learning modeling and predictive control of nonlinear processes using noisy data. AIChE J. 67: e17164, https://doi.org/10.1002/aic.17164.
Wu, G., Yion, W.T.G., Dang, K.L.N.Q., and Wu, Z. (2023a). Physics-informed machine learning for MPC: application to a batch crystallization process. Chem. Eng. Res. Des. 192: 556–569, https://doi.org/10.1016/j.cherd.2023.02.048.
Wu, Z., Zhang, B., Yu, H., Ren, J., Pan, M., He, C., and Chen, Q. (2023b). Accelerating heat exchanger design by combining physics-informed deep learning and transfer learning. Chem. Eng. Sci. 282: 119285, https://doi.org/10.1016/j.ces.2023.119285.
Wu, Z., Li, M., He, C., Zhang, B., Ren, J., Yu, H., and Chen, Q. (2024). Physics-informed learning of chemical reactor systems using decoupling–coupling training framework. AIChE J.: e18436, https://doi.org/10.1002/aic.18436.
Xiao, M. and Wu, Z. (2023). Modeling and control of a chemical process network using physics-informed transfer learning. Ind. Eng. Chem. Res. 62: 17216–17227, https://doi.org/10.1021/acs.iecr.3c01435.
Xiao, M., Hu, C., and Wu, Z. (2023). Modeling and predictive control of nonlinear processes using transfer learning method. AIChE J. 69: e18076, https://doi.org/10.1002/aic.18076.
Xiao, M., Vellayappan, K., Pravin, P., Gudena, K., and Wu, Z. (2024). Optimization-based multi-source transfer learning for modeling of nonlinear processes. Chem. Eng. Sci. 295: 120117, https://doi.org/10.1016/j.ces.2024.120117.
Xie, R., Jan, N.M., Hao, K., Chen, L., and Huang, B. (2019). Supervised variational autoencoders for soft sensor modeling with missing data. IEEE Trans. Ind. Inf. 16: 2820–2828, https://doi.org/10.1109/tii.2019.2951622.
Xiu, X., Yang, Y., Kong, L., and Liu, W. (2020). Laplacian regularized robust principal component analysis for process monitoring. J. Process Control 92: 212–219, https://doi.org/10.1016/j.jprocont.2020.06.011.
Xu, Z. and Wu, Z. (2024). Privacy-preserving federated machine learning modeling and predictive control of heterogeneous nonlinear systems. Comput. Chem. Eng. 187: 108749, https://doi.org/10.1016/j.compchemeng.2024.108749.
Yang, S. and Bequette, B.W. (2021). Optimization-based control using input convex neural networks. Comput. Chem. Eng. 144: 107143, https://doi.org/10.1016/j.compchemeng.2020.107143.
Yang, F., Li, K., Zhong, Z., Luo, Z., Sun, X., Cheng, H., Guo, X., Huang, F., Ji, R., and Li, S. (2020). Proceedings of the thirty-fourth AAAI conference on artificial intelligence, February 7–12, 2020: asymmetric co-teaching for unsupervised cross-domain person re-identification, Vol. 34. New York, USA, pp. 12597–12604, https://doi.org/10.1609/aaai.v34i07.6950.
Yao, Y. and Doretto, G. (2010). Proceedings of the 2010 IEEE computer society conference on computer vision and pattern recognition, June 13–18, 2010: boosting for transfer learning with multiple sources. San Francisco, CA, USA, pp. 1855–1862, https://doi.org/10.1109/CVPR.2010.5539857.
You, Y. and Nikolaou, M. (1993). Dynamic process modeling with recurrent neural networks. AIChE J. 39: 1654–1667, https://doi.org/10.1002/aic.690391009.
Yu, X., Han, B., Yao, J., Niu, G., Tsang, I., and Sugiyama, M. (2019). Proceedings of the 36th international conference on machine learning, June 9–15, 2019: how does disagreement help generalization against label corruption? PMLR, California, USA, pp. 7164–7173.
Zhang, S. and Qiu, T. (2022). Semi-supervised LSTM ladder autoencoder for chemical process fault diagnosis and localization. Chem. Eng. Sci. 251: 117467, https://doi.org/10.1016/j.ces.2022.117467.
Zhang, J., Lei, Q., and Dhillon, I. (2018). Proceedings of the 35th international conference on machine learning, July 10–15, 2018: stabilizing gradients for deep neural networks via efficient SVD parameterization. PMLR, Stockholm, Sweden, pp. 5806–5814.
Zhang, C., Xie, Y., Bai, H., Yu, B., Li, W., and Gao, Y. (2021a). A survey on federated learning. Knowl. Base Syst. 216: 106775, https://doi.org/10.1016/j.knosys.2021.106775.
Zhang, X., Zou, Y., and Li, S. (2021b). Semi-supervised generative adversarial network with guaranteed safeness for industrial quality prediction. Comput. Chem. Eng. 153: 107418, https://doi.org/10.1016/j.compchemeng.2021.107418.
Zhang, X., Pan, W., Scattolini, R., Yu, S., and Xu, X. (2022). Robust tube-based model predictive control with Koopman operators. Automatica 137: 110114, https://doi.org/10.1016/j.automatica.2021.110114.
Zhang, Z., Wang, X., Wang, G., Jiang, Q., Yan, X., and Zhuang, Y. (2024). A data enhancement method based on generative adversarial network for small sample-size with soft sensor application. Comput. Chem. Eng. 186: 108707, https://doi.org/10.1016/j.compchemeng.2024.108707.
Zhao, Y., Li, M., Lai, L., Suda, N., Civin, D., and Chandra, V. (2018). Federated learning with non-IID data. arXiv preprint arXiv:1806.00582.
Zhao, T., Zheng, Y., Gong, J., and Wu, Z. (2022a). Machine learning-based reduced-order modeling and predictive control of nonlinear processes. Chem. Eng. Res. Des. 179: 435–451, https://doi.org/10.1016/j.cherd.2022.02.005.
Zhao, T., Zheng, Y., and Wu, Z. (2022b). Improving computational efficiency of machine learning modeling of nonlinear processes using sensitivity analysis and active learning. Digit. Chem. Eng. 3: 100027, https://doi.org/10.1016/j.dche.2022.100027.
Zhao, T., Zheng, Y., and Wu, Z. (2023). Feature selection-based machine learning modeling for distributed model predictive control of nonlinear processes. Comput. Chem. Eng. 169: 108074, https://doi.org/10.1016/j.compchemeng.2022.108074.
Zheng, Y. and Wu, Z. (2023). Physics-informed online machine learning and predictive control of nonlinear processes with parameter uncertainty. Ind. Eng. Chem. Res. 62: 2804–2818, https://doi.org/10.1021/acs.iecr.2c03691.
Zheng, S. and Zhao, J. (2020). A new unsupervised data mining method based on the stacked autoencoder for chemical process fault diagnosis. Comput. Chem. Eng. 135: 106755, https://doi.org/10.1016/j.compchemeng.2020.106755.
Zheng, Y., Wang, X., and Wu, Z. (2022a). Machine learning modeling and predictive control of the batch crystallization process. Ind. Eng. Chem. Res. 61: 5578–5592, https://doi.org/10.1021/acs.iecr.2c00026.
Zheng, Y., Zhang, T., Li, S., Zhang, G., and Wang, Y. (2022b). GP-based MPC with updating tube for safety control of unknown system. Digit. Chem. Eng. 4: 100041, https://doi.org/10.1016/j.dche.2022.100041.
Zheng, Y., Zhao, T., Wang, X., and Wu, Z. (2022c). Online learning-based predictive control of crystallization processes under batch-to-batch parametric drift. AIChE J. 68: e17815, https://doi.org/10.1002/aic.17815.
Zheng, Y., Hu, C., Wang, X., and Wu, Z. (2023). Physics-informed recurrent neural network modeling for predictive control of nonlinear processes. J. Process Control 128: 103005, https://doi.org/10.1016/j.jprocont.2023.103005.
Zhu, Q.X., Xu, T.X., Xu, Y., and He, Y.L. (2021). Improved virtual sample generation method using enhanced conditional generative adversarial networks with cycle structures for soft sensors with limited data. Ind. Eng. Chem. Res. 61: 530–540, https://doi.org/10.1021/acs.iecr.1c03197.
Zhu, W., Zhang, J., and Romagnoli, J. (2022). General feature extraction for process monitoring using transfer learning approaches. Ind. Eng. Chem. Res. 61: 5202–5214, https://doi.org/10.1021/acs.iecr.1c04565.
Zhuang, F., Qi, Z., Duan, K., Xi, D., Zhu, Y., Zhu, H., Xiong, H., and He, Q. (2020). A comprehensive survey on transfer learning. Proc. IEEE 109: 43–76, https://doi.org/10.1109/jproc.2020.3004555.
© 2024 the author(s), published by De Gruyter, Berlin/Boston
This work is licensed under the Creative Commons Attribution 4.0 International License.