Tracking Control of a Continuous Stirred Tank Reactor Using Direct and Tuned Reinforcement Learning Based Controllers

  • B. Jaganatha Pandian and Mathew M. Noel
Published/Copyright: November 14, 2017

Abstract

The need for a linear model of the nonlinear system when tuning classical controllers limits their use, and the tuning procedure itself involves complex computations. This is further complicated when the nonlinear system must be operated under different operating constraints. The Continuous Stirred Tank Reactor (CSTR) is one such nonlinear system, studied extensively in control and chemical engineering because of its highly nonlinear characteristics and its wide operating range. This paper proposes two control schemes based on reinforcement learning to achieve both servo and regulatory control. One approach applies Reinforcement Learning (RL) directly with an ANN approximation; the other tunes the parameters of a PID controller using reinforcement learning. The main objective of this paper is to handle multiple set-point control of the CSTR using RL: the reactor temperature is controlled across multiple set-point changes. A comparative study of the two proposed algorithms shows that the direct RL approach with approximation performs better than RL-based PID tuning, with smaller oscillations and overshoot. The learning time of the direct RL based controller is also shorter than that of the latter.
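As a concrete illustration of the first scheme (direct RL), the following minimal Python sketch trains a reinforcement learning controller to track a temperature set point on a simulated CSTR by manipulating the coolant flow. It is not the paper's implementation: the plant equations and every numerical value follow a commonly used two-state CSTR benchmark model, the value function is stored in a lookup table instead of the ANN approximation used in the paper, and the error discretization, action set, and reward are assumptions introduced here.

# Minimal sketch, not the paper's implementation: tabular Q-learning for CSTR
# temperature tracking by manipulating coolant flow. The plant equations and
# all numerical values below are assumptions based on a common CSTR benchmark.
import numpy as np

# Assumed benchmark CSTR parameters (not taken from the paper)
Q_IN, V_C = 100.0, 100.0            # feed flow rate (l/min), reactor volume (l)
C_I, T_I = 1.0, 350.0               # feed concentration (mol/l), feed temperature (K)
T_C = 350.0                         # coolant temperature (K)
K0, E_R = 7.2e10, 1.0e4             # reaction rate constant (1/min), E/R (K)
K1, K2, K3 = 1.44e13, 0.01, 700.0   # lumped plant constants
DT = 0.05                           # Euler integration step (min)

def cstr_step(c_a, t_r, q_c):
    """One Euler step of the assumed two-state CSTR model."""
    arrhenius = np.exp(-E_R / t_r)
    dconc = Q_IN / V_C * (C_I - c_a) - K0 * c_a * arrhenius
    dtemp = (Q_IN / V_C * (T_I - t_r) + K1 * c_a * arrhenius
             + K2 * q_c * (1.0 - np.exp(-K3 / q_c)) * (T_C - t_r))
    return max(c_a + DT * dconc, 0.0), t_r + DT * dtemp

# Tabular Q-learning over a discretized tracking error (an illustrative choice)
ERR_BINS = np.linspace(-20.0, 20.0, 21)            # temperature error bins (K)
ACTIONS = np.array([90.0, 97.0, 103.0, 110.0])     # candidate coolant flows (l/min)
Q = np.zeros((len(ERR_BINS) + 1, len(ACTIONS)))    # action-value table
ALPHA, GAMMA, EPS = 0.1, 0.95, 0.1                 # learning rate, discount, exploration

def run_episode(t_set, steps=500, rng=np.random.default_rng(0)):
    c_a, t_r = 0.1, 440.0                          # assumed initial operating point
    s = int(np.digitize(t_set - t_r, ERR_BINS))
    for _ in range(steps):
        a = rng.integers(len(ACTIONS)) if rng.random() < EPS else int(Q[s].argmax())
        c_a, t_r = cstr_step(c_a, t_r, ACTIONS[a])
        reward = -abs(t_set - t_r)                 # penalize tracking error
        s_next = int(np.digitize(t_set - t_r, ERR_BINS))
        Q[s, a] += ALPHA * (reward + GAMMA * Q[s_next].max() - Q[s, a])
        s = s_next
    return t_r

for _ in range(200):                               # learn to hold one set point
    run_episode(t_set=438.0)
print("temperature after training:", run_episode(t_set=438.0))

The second scheme described in the abstract would instead keep a conventional PID loop and let the RL agent adjust the controller gains from the same kind of tracking reward; only the action set changes.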

A Nomenclature

a, A: Action variable and its constraint set
s, S: State vector and its constraint set
R(s): Reward function
Psa: State-transition probabilities upon executing action a in state s
Vπ(s): Cumulative discounted reward (value) of state s under policy π
π*: Optimal policy
V*(s): Optimal value function
Qc: Coolant flow rate (lpm)
CA: Concentration of A in the reactor (mol/l)
TR: Temperature of the reactor fluid (K)
QIN: Product flow rate (lpm)
CI: Input product concentration (mol/l)
TI: Input temperature (K)
TC: Coolant temperature (K)
VC: Container volume (l)
E/R: Activation energy term (K)
k0: Reaction rate constant (min⁻¹)
k1, k2, k3: CSTR plant constants
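The reinforcement learning entries above are the standard Markov decision process quantities, and the CSTR entries match the symbols of a widely used two-state benchmark reactor model. The relations below are a sketch under that assumption (with discount factor γ and next state s'); they are not reproduced from the paper.

V^{\pi}(s) = \mathbb{E}\!\left[\sum_{t=0}^{\infty}\gamma^{t}R(s_{t})\,\middle|\,s_{0}=s,\ \pi\right],
\qquad
V^{*}(s) = \max_{\pi} V^{\pi}(s),
\qquad
\pi^{*}(s) = \arg\max_{a\in A}\sum_{s'}P_{sa}(s')\,V^{*}(s')

\dot{C}_{A} = \frac{Q_{IN}}{V_{C}}\left(C_{I}-C_{A}\right) - k_{0}\,C_{A}\,e^{-(E/R)/T_{R}},
\qquad
\dot{T}_{R} = \frac{Q_{IN}}{V_{C}}\left(T_{I}-T_{R}\right) + k_{1}\,C_{A}\,e^{-(E/R)/T_{R}} + k_{2}\,Q_{c}\left(1-e^{-k_{3}/Q_{c}}\right)\left(T_{C}-T_{R}\right)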

Received: 2017-06-10
Revised: 2017-09-25
Accepted: 2017-10-14
Published Online: 2017-11-14

© 2018 Walter de Gruyter GmbH, Berlin/Boston
