Residential College | false |
Status | Published |
Model-Free λ-Policy Iteration for Discrete-Time Linear Quadratic Regulation | |
Yang, Yongliang (1); Kiumarsi, Bahare (2); Modares, Hamidreza (3); Xu, Chengzhong (4) | |
2023-02-01 | |
Source Publication | IEEE Transactions on Neural Networks and Learning Systems
ISSN | 2162-237X |
Volume | 34, Issue 2, Pages 635-649 |
Abstract | This article presents a model-free λ-policy iteration (λ-PI) for the discrete-time linear quadratic regulation (LQR) problem. To solve the algebraic Riccati equation arising from solving the LQR in an iterative manner, we define two novel matrix operators, named the weighted Bellman operator and the composite Bellman operator. Then, the λ-PI algorithm is first designed as a recursion with the weighted Bellman operator, and its equivalent formulation as a fixed-point iteration with the composite Bellman operator is shown. The contraction and monotonic properties of the composite Bellman operator guarantee the convergence of the λ-PI algorithm. In contrast to the PI algorithm, the λ-PI does not require an admissible initial policy, and the convergence rate outperforms the value iteration (VI) algorithm. Model-free extension of the λ-PI algorithm is developed using the off-policy reinforcement learning technique. It is also shown that the off-policy variants of the λ-PI algorithm are robust against the probing noise. Finally, simulation examples are conducted to validate the efficacy of the λ-PI algorithm. |
Keyword | Algebraic Riccati Equation (ARE); Fixed-Point Theory; Off-Policy Reinforcement Learning (RL); Optimal Control |
DOI | 10.1109/TNNLS.2021.3098985 |
Indexed By | SCIE |
Language | English |
WOS Research Area | Computer Science ; Engineering |
WOS Subject | Computer Science, Artificial Intelligence ; Computer Science, Hardware & Architecture ; Computer Science, Theory & Methods ; Engineering, Electrical & Electronic |
WOS ID | WOS:000732146500001 |
Scopus ID | 2-s2.0-85148473067 |
Document Type | Journal article |
Collection | Faculty of Science and Technology |
Affiliation | 1. University of Macau, State Key Laboratory of IoTSC, Taipa, 999078, Macao; 2. Michigan State University, Department of Electrical and Computer Engineering, East Lansing, 48824, United States; 3. Michigan State University, Department of Mechanical Engineering, East Lansing, 48824, United States; 4. University of Macau, State Key Laboratory of IoTSC, Taipa, Macao |
First Author Affiliation | University of Macau |
Recommended Citation GB/T 7714 | Yang, Yongliang, Kiumarsi, Bahare, Modares, Hamidreza, et al. Model-Free λ-Policy Iteration for Discrete-Time Linear Quadratic Regulation[J]. IEEE Transactions on Neural Networks and Learning Systems, 2023, 34(2), 635-649. |
APA | Yang, Yongliang., Kiumarsi, Bahare., Modares, Hamidreza., & Xu, Chengzhong (2023). Model-Free λ-Policy Iteration for Discrete-Time Linear Quadratic Regulation. IEEE Transactions on Neural Networks and Learning Systems, 34(2), 635-649. |
MLA | Yang, Yongliang, et al. "Model-Free λ-Policy Iteration for Discrete-Time Linear Quadratic Regulation." IEEE Transactions on Neural Networks and Learning Systems 34.2 (2023): 635-649. |
Files in This Item: | There are no files associated with this item. |
Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.
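
The abstract contrasts PI, which needs an admissible (stabilizing) initial policy, with VI, which does not but converges more slowly. As a rough illustration of that contrast, the following is a minimal sketch of the two classical model-based baselines for the discrete-time algebraic Riccati equation. It is not the paper's λ-PI algorithm and does not implement its weighted or composite Bellman operators. The system matrices `A`, `B`, the weights `Q`, `R`, the initial gain `K0`, and all function names are hypothetical choices made for this example.

```python
import numpy as np

def greedy_gain(P, A, B, R):
    """Gain K of the policy u = -K x that is greedy with respect to x' P x."""
    return np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)

def riccati_step(P, A, B, Q, R):
    """One application of the Riccati (Bellman optimality) recursion."""
    K = greedy_gain(P, A, B, R)
    return Q + A.T @ P @ A - A.T @ P @ B @ K

def evaluate_policy(K, A, B, Q, R):
    """Solve the Lyapunov equation P = Acl' P Acl + Q + K' R K for a stabilizing K."""
    Acl = A - B @ K
    n = A.shape[0]
    M = Q + K.T @ R @ K
    # vec(Acl' P Acl) = (Acl' kron Acl') vec(P), using column-major vec
    lhs = np.eye(n * n) - np.kron(Acl.T, Acl.T)
    P = np.linalg.solve(lhs, M.flatten(order="F")).reshape((n, n), order="F")
    return (P + P.T) / 2  # symmetrize against round-off

def value_iteration(A, B, Q, R, tol=1e-9, max_iter=10000):
    """VI: start from P = 0, so no stabilizing initial policy is needed."""
    P = np.zeros_like(Q)
    for k in range(1, max_iter + 1):
        P_next = riccati_step(P, A, B, Q, R)
        if np.linalg.norm(P_next - P) < tol:
            return P_next, k
        P = P_next
    return P, max_iter

def policy_iteration(A, B, Q, R, K0, tol=1e-9, max_iter=100):
    """PI: K0 must be stabilizing (admissible)."""
    P = evaluate_policy(K0, A, B, Q, R)
    for k in range(1, max_iter + 1):
        K = greedy_gain(P, A, B, R)
        P_next = evaluate_policy(K, A, B, Q, R)
        if np.linalg.norm(P_next - P) < tol:
            return P_next, k
        P = P_next
    return P, max_iter

# Hypothetical example: a discretized double integrator (open-loop not stable).
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.005], [0.1]])
Q = np.eye(2)
R = np.array([[1.0]])
K0 = np.array([[1.0, 2.0]])  # assumed stabilizing gain, required by PI only

P_vi, vi_iters = value_iteration(A, B, Q, R)
P_pi, pi_iters = policy_iteration(A, B, Q, R, K0)
print("VI iterations:", vi_iters, "PI iterations:", pi_iters)
print("max |P_vi - P_pi|:", np.max(np.abs(P_vi - P_pi)))
```

Both routines should arrive at the same Riccati solution; the printed iteration counts give a sense of the trade-off between the cheap initialization of VI and the faster convergence of PI that the abstract says λ-PI is designed to combine.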