
5 Reinforcement learning

  • Nazeer Shaik, Chandra Sekaran, Amit Mahajan and Balkeshwar Singh
This chapter is in the book Toward Artificial General Intelligence

Abstract

This chapter introduces reinforcement learning (RL) and surveys several of its core techniques. The goal of RL is to design algorithms that let an agent learn a policy that maximizes cumulative reward through interaction with its environment. The chapter first discusses Markov decision processes (MDPs), which provide the mathematical foundation for modeling RL problems, and examines value iteration and policy iteration as iterative methods for solving them. It then presents Q-learning, an off-policy, model-free RL algorithm for learning the optimal action-value function. Deep Q-networks (DQNs), which combine Q-learning with deep neural networks, are covered as a way to handle high-dimensional state spaces. Policy gradient methods are presented as an alternative approach that directly optimizes policy parameters by gradient ascent. Proximal policy optimization (PPO), a widely used policy gradient algorithm, is discussed for its ability to balance training stability and policy performance. The chapter concludes by emphasizing the importance of RL methods for training agents to make sequential decisions in complex environments across many domains.
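
As a rough illustration of the Q-learning idea named in the abstract, the sketch below applies the standard tabular update Q(s, a) ← Q(s, a) + α [r + γ max_a' Q(s', a') − Q(s, a)] to a toy problem. It is not taken from the chapter: the five-state chain environment, the function names (`q_learning_update`, `step`, `train`), and the values of α, γ, and ε are all assumptions chosen only to keep the example small.

```python
import random


def q_learning_update(q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """Apply one tabular Q-learning update in place and return the new value."""
    # Off-policy target: uses the greedy value of the next state,
    # regardless of which action the behavior policy takes next.
    best_next = max(q[next_state])
    td_target = reward + gamma * best_next
    q[state][action] += alpha * (td_target - q[state][action])
    return q[state][action]


def step(state, action, n_states):
    """Hypothetical chain environment: action 1 moves right, action 0 moves left.

    Reaching the right-most state yields reward 1 and ends the episode.
    """
    if action == 1:
        next_state = min(state + 1, n_states - 1)
    else:
        next_state = max(state - 1, 0)
    done = next_state == n_states - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def train(n_states=5, episodes=500, epsilon=0.2, seed=0):
    """Run epsilon-greedy Q-learning on the chain and return the Q-table."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                action = rng.randrange(2)  # explore
            else:
                action = max((0, 1), key=lambda a: q[state][a])  # exploit
            next_state, reward, done = step(state, action, n_states)
            q_learning_update(q, state, action, reward, next_state)
            state = next_state
    return q


q_table = train()
# Greedy action for every non-terminal state.
greedy_policy = [max((0, 1), key=lambda a: q_table[s][a]) for s in range(len(q_table) - 1)]
```

In this deterministic toy setting the learned greedy policy should move right in every non-terminal state, with action values approaching powers of γ as the distance to the goal grows. Real problems typically need function approximation (as in DQNs) once the state space is too large for a table.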

Chapters in this book

  1. Frontmatter
  2. Preface
  3. Contents
  4. List of authors
  5. About the editors
  6. 1 Introduction to artificial intelligence
  7. 2 AI technologies, tools, and industrial use cases
  8. 3 Classification and regression algorithms
  9. 4 Clustering and association algorithm
  10. 5 Reinforcement learning
  11. 6 Evaluation of AI model performance
  12. 7 Methods of cross-validation and bootstrapping
  13. 8 Meta-learning through ensemble approach: bagging, boosting, and random forest strategies
  14. 9 AI: issues, concerns, and ethical considerations
  15. 10 The future with AI and AI in action
  16. 11 A survey of AI in industry: from basic concepts to industrial and business applications
  17. 12 The intelligent implications of artificial intelligence-driven decision-making in business management
  18. 13 An innovative analysis of AI-powered automation techniques for business management
  19. 14 The smart and secured AI-powered strategies for optimizing processes in multi-vendor business applications
  20. 15 Utilizing AI technologies to enhance e-commerce business operations
  21. 16 Exploring the potential of artificial intelligence in wireless sensor networks
  22. 17 Exploring artificial intelligence techniques for enhanced sentiment analysis through data mining
  23. 18 Exploring the potential of artificial intelligence for automated sentiment
  24. 19 A novel blockchain-based artificial intelligence application for healthcare automation
  25. 20 Enhancing industrial efficiency with AI-enabled blockchain-based solutions
  26. Index
Downloaded on 3.12.2025 from https://www.degruyterbrill.com/document/doi/10.1515/9783111323749-005/html