Residential College: false
Status: Published
Model-Free λ-Policy Iteration for Discrete-Time Linear Quadratic Regulation
Yang, Yongliang (1); Kiumarsi, Bahare (2); Modares, Hamidreza (3); Xu, Chengzhong (4)
2023-02-01
Source Publication: IEEE Transactions on Neural Networks and Learning Systems
ISSN: 2162-237X
Volume: 34, Issue: 2, Pages: 635-649
Abstract

This article presents a model-free λ-policy iteration (λ-PI) for the discrete-time linear quadratic regulation (LQR) problem. To iteratively solve the algebraic Riccati equation (ARE) arising from the LQR problem, we define two novel matrix operators, named the weighted Bellman operator and the composite Bellman operator. The λ-PI algorithm is first designed as a recursion with the weighted Bellman operator, and it is then shown to be equivalent to a fixed-point iteration with the composite Bellman operator. The contraction and monotonicity properties of the composite Bellman operator guarantee the convergence of the λ-PI algorithm. In contrast to the PI algorithm, λ-PI does not require an admissible initial policy, and it converges faster than the value iteration (VI) algorithm. A model-free extension of the λ-PI algorithm is developed using the off-policy reinforcement learning technique. It is also shown that the off-policy variants of the λ-PI algorithm are robust against probing noise. Finally, simulation examples are conducted to validate the efficacy of the λ-PI algorithm.
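
The contrast the abstract draws between PI and VI can be illustrated with the two classical model-based baselines for the discrete-time ARE. The sketch below is not the λ-PI algorithm of the article: this record does not define the weighted or composite Bellman operators, so they are not reproduced here. The system matrices `A` and `B`, the weights `Q` and `R`, the initial gain `K0`, and all function names are illustrative assumptions.

```python
import numpy as np


def greedy_gain(A, B, R, P):
    """Gain K of the policy u = -K x that is greedy for the value matrix P."""
    return np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)


def value_iteration(A, B, Q, R, iters=500):
    """VI: Riccati recursion started from P = 0 (no stabilizing gain needed)."""
    P = np.zeros_like(Q)
    for _ in range(iters):
        K = greedy_gain(A, B, R, P)
        P = Q + A.T @ P @ A - A.T @ P @ B @ K
    return P


def policy_evaluation(A, B, Q, R, K):
    """Solve P = (A-BK)^T P (A-BK) + Q + K^T R K by vectorizing the equation."""
    Ac = A - B @ K
    n = A.shape[0]
    rhs = (Q + K.T @ R @ K).reshape(-1)
    M = np.eye(n * n) - np.kron(Ac.T, Ac.T)
    return np.linalg.solve(M, rhs).reshape(n, n)


def policy_iteration(A, B, Q, R, K0, iters=20):
    """PI: alternate evaluation and improvement; K0 must stabilize A - B K0."""
    K = K0
    for _ in range(iters):
        P = policy_evaluation(A, B, Q, R, K)
        K = greedy_gain(A, B, R, P)
    return P


def are_residual(A, B, Q, R, P):
    """Largest absolute entry of the discrete-time ARE residual at P."""
    K = greedy_gain(A, B, R, P)
    return np.max(np.abs(Q + A.T @ P @ A - A.T @ P @ B @ K - P))


# Assumed example: an open-loop unstable, controllable second-order system.
A = np.array([[1.1, 0.3], [0.0, 0.9]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])
K0 = np.array([[4.0, 2.0]])  # assumed admissible gain: eig(A - B K0) = +/- 0.1

P_vi = value_iteration(A, B, Q, R)
P_pi = policy_iteration(A, B, Q, R, K0)
print("ARE residual (VI):", are_residual(A, B, Q, R, P_vi))
print("ARE residual (PI):", are_residual(A, B, Q, R, P_pi))
```

Under these assumptions both routines should reach the same ARE solution; PI does so in far fewer iterations but depends on the stabilizing `K0`, while VI starts from zero and needs many more sweeps. That trade-off is the one the abstract says λ-PI is designed to address.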

Keywords: Algebraic Riccati Equation (ARE); Fixed-point Theory; Off-policy Reinforcement Learning (RL); Optimal Control
DOI: 10.1109/TNNLS.2021.3098985
Indexed By: SCIE
Language: English
WOS Research Area: Computer Science; Engineering
WOS Subject: Computer Science, Artificial Intelligence; Computer Science, Hardware & Architecture; Computer Science, Theory & Methods; Engineering, Electrical & Electronic
WOS ID: WOS:000732146500001
Scopus ID: 2-s2.0-85148473067
Document Type: Journal article
Collection: Faculty of Science and Technology
Affiliation:
1. University of Macau, State Key Laboratory of IoTSC, Taipa, 999078, Macao
2. Michigan State University, Department of Electrical and Computer Engineering, East Lansing, 48824, United States
3. Michigan State University, Department of Mechanical Engineering, East Lansing, 48824, United States
4. University of Macau, State Key Laboratory of IoTSC, Taipa, Macao
First Author Affiliation: University of Macau
Recommended Citation
GB/T 7714
Yang, Yongliang, Kiumarsi, Bahare, Modares, Hamidreza, et al. Model-Free λ-Policy Iteration for Discrete-Time Linear Quadratic Regulation[J]. IEEE Transactions on Neural Networks and Learning Systems, 2023, 34(2), 635-649.
APA Yang, Yongliang., Kiumarsi, Bahare., Modares, Hamidreza., & Xu, Chengzhong (2023). Model-Free λ-Policy Iteration for Discrete-Time Linear Quadratic Regulation. IEEE Transactions on Neural Networks and Learning Systems, 34(2), 635-649.
MLA Yang, Yongliang, et al. "Model-Free λ-Policy Iteration for Discrete-Time Linear Quadratic Regulation". IEEE Transactions on Neural Networks and Learning Systems 34.2 (2023): 635-649.
Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.