Author
Dr. Frank Lewis is a Professor of Electrical Engineering at The University of Texas at Arlington, where he was awarded the Moncrief-O'Donnell Endowed Chair in 1990 at the Automation & Robotics Research Institute. He has served as a Visiting Professor at Democritus University in Greece, the Hong Kong University of Science and Technology, the Chinese University of Hong Kong, the City University of Hong Kong, the National University of Singapore, and Nanyang Technological University, Singapore. He has been elected Guest Consulting Professor at Shanghai Jiao Tong University and South China University of Technology.
Derong Liu received the B.S. degree in mechanical engineering from the East China Institute of Technology (now Nanjing University of Science and Technology), Nanjing, China, in 1982, the M.S. degree in automatic control theory and applications from the Institute of Automation, Chinese Academy of Sciences, Beijing, China, in 1987, and the Ph.D. degree in electrical engineering from the University of Notre Dame, Notre Dame, IN, in 1994.
Back Cover Text
Reinforcement learning (RL) and adaptive dynamic programming (ADP) have been among the most critical research fields in science and engineering for modern complex systems. This book describes the latest RL and ADP techniques for decision and control in human-engineered systems, covering both single-player decision and control and multi-player games. Edited by pioneers of RL and ADP research, the book brings together ideas and methods from many fields and provides important and timely guidance on controlling a wide variety of systems, from robots and industrial processes to economic decision-making.
Contents
PREFACE xix
CONTRIBUTORS xxiii
PART I FEEDBACK CONTROL USING RL AND ADP
1. Reinforcement Learning and Approximate Dynamic Programming (RLADP)-Foundations, Common Misconceptions, and the Challenges Ahead 3
Paul J. Werbos
1.1 Introduction 3
1.2 What is RLADP? 4
1.3 Some Basic Challenges in Implementing ADP 14
2. Stable Adaptive Neural Control of Partially Observable Dynamic Systems 31
J. Nate Knight and Charles W. Anderson
2.1 Introduction 31
2.2 Background 32
2.3 Stability Bias 35
2.4 Example Application 38
3. Optimal Control of Unknown Nonlinear Discrete-Time Systems Using the Iterative Globalized Dual Heuristic Programming Algorithm 52
Derong Liu and Ding Wang
3.1 Background Material 53
3.2 Neuro-Optimal Control Scheme Based on the Iterative ADP Algorithm 55
3.3 Generalization 67
3.4 Simulation Studies 68
3.5 Summary 74
4. Learning and Optimization in Hierarchical Adaptive Critic Design 78
Haibo He, Zhen Ni, and Dongbin Zhao
4.1 Introduction 78
4.2 Hierarchical ADP Architecture with Multiple-Goal Representation 80
4.3 Case Study: The Ball-and-Beam System 87
4.4 Conclusions and Future Work 94
5. Single Network Adaptive Critics Networks-Development, Analysis, and Applications 98
Jie Ding, Ali Heydari, and S.N. Balakrishnan
5.1 Introduction 98
5.2 Approximate Dynamic Programming 100
5.3 SNAC 102
5.4 J-SNAC 104
5.5 Finite-SNAC 108
5.6 Conclusions 116
6. Linearly Solvable Optimal Control 119
K. Dvijotham and E. Todorov
6.1 Introduction 119
6.2 Linearly Solvable Optimal Control Problems 123
6.3 Extension to Risk-Sensitive Control and Game Theory 130
6.4 Properties and Algorithms 134
6.5 Conclusions and Future Work 139
7. Approximating Optimal Control with Value Gradient Learning 142
Michael Fairbank, Danil Prokhorov, and Eduardo Alonso
7.1 Introduction 142
7.2 Value Gradient Learning and BPTT Algorithms 144
7.3 A Convergence Proof for VGL(1) for Control with Function Approximation 148
7.4 Vertical Lander Experiment 154
7.5 Conclusions 159
8. A Constrained Backpropagation Approach to Function Approximation and Approximate Dynamic Programming 162
Silvia Ferrari, Keith Rudd, and Gianluca Di Muro
8.1 Background 163
8.2 Constrained Backpropagation (CPROP) Approach 163
8.3 Solution of Partial Differential Equations in Nonstationary Environments 170
8.4 Preserving Prior Knowledge in Exploratory Adaptive Critic Designs 174
8.5 Summary 179
9. Toward Design of Nonlinear ADP Learning Controllers with Performance Assurance 182
Jennie Si, Lei Yang, Chao…