Residential College: false
Status: Published
Self-Supervised Imitation for Offline Reinforcement Learning With Hindsight Relabeling
Yu, Xudong (1); Bai, Chenjia (2); Wang, Changhong (1); Yu, Dengxiu (3); Chen, C. L. Philip (4,5); Wang, Zhen (6)
2023-12-01
Source Publication: IEEE Transactions on Systems, Man, and Cybernetics: Systems
ABS Journal Level: 3
ISSN: 2168-2216
Volume: 53, Issue: 12, Pages: 7732-7743
Abstract

Reinforcement learning (RL) requires a lot of interactions with the environment, which is usually expensive or dangerous in real-world tasks. To address this problem, offline RL considers learning policies from fixed datasets, which is promising in utilizing large-scale datasets, but still suffers from the unstable estimation for out-of-distribution data. Recent developments in RL via supervised learning methods offer an alternative to learning effective policies from suboptimal datasets while relying on oracle information from the environment. In this article, we present an offline RL algorithm that combines hindsight relabeling and supervised regression to predict actions without oracle information. We use hindsight relabeling on the original dataset and learn a command generator and command-conditional policies in a supervised manner, where the command represents the desired return or goal location according to the corresponding task. Theoretically, we illustrate that our method optimizes the lower bound of the goal-conditional RL objective. Empirically, our method achieves competitive performance in comparison with existing approaches in the sparse reward setting and favorable performance in continuous control tasks.
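
The hindsight relabeling step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the command is the undiscounted return actually collected from each step to the end of the trajectory, and the names `Transition` and `relabel_with_return_to_go` are hypothetical. The resulting (state, command, action) tuples are the kind of data on which a command-conditional policy could be fit by supervised regression.

```python
from typing import List, Tuple

# Hypothetical record layout for this sketch: (state, action, reward).
Transition = Tuple[List[float], int, float]


def relabel_with_return_to_go(
    trajectory: List[Transition],
) -> List[Tuple[List[float], float, int]]:
    """Turn one logged trajectory into (state, command, action) tuples.

    Assumption: the command at step t is the undiscounted sum of the
    rewards observed from step t to the end of the trajectory.
    """
    relabeled = []
    return_to_go = 0.0
    # Walk backwards so every return-to-go is accumulated in one pass.
    for state, action, reward in reversed(trajectory):
        return_to_go += reward
        relabeled.append((state, return_to_go, action))
    relabeled.reverse()  # restore chronological order
    return relabeled


# A sparse-reward toy trajectory: reward arrives only at the last step.
trajectory = [([0.0], 1, 0.0), ([1.0], 0, 0.0), ([2.0], 1, 1.0)]
dataset = relabel_with_return_to_go(trajectory)
```

Because every command is an outcome the logged behavior actually achieved, the supervised targets need no oracle information from the environment. For goal-reaching tasks the same loop could store a later visited state as the command instead of a return.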

Keywords: Hindsight Relabeling; Offline Reinforcement Learning (RL); Supervised Learning
DOI: 10.1109/TSMC.2023.3297711
Indexed By: SCIE
Language: English
WOS Research Area: Automation & Control Systems; Computer Science
WOS Subject: Automation & Control Systems; Computer Science, Cybernetics
WOS ID: WOS:001069562300001
Publisher: IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC, 445 HOES LANE, PISCATAWAY, NJ 08855-4141
Scopus ID: 2-s2.0-85168736234
Document Type: Journal article
Collection: Faculty of Science and Technology
DEPARTMENT OF COMPUTER AND INFORMATION SCIENCE
Corresponding Author: Wang, Zhen
Affiliation:
1. Harbin Institute of Technology, Space Control and Inertial Technology Research Center, Harbin, 150001, China
2. Shanghai Artificial Intelligence Laboratory, Fundamental Theory, Shanghai, 200232, China
3. Unmanned System Research Institute, Northwestern Polytechnical University, Xi'an, 710072, China
4. South China University of Technology, School of Computer Science and Engineering, Guangzhou, 510006, China
5. University of Macau, Faculty of Science and Technology, Macao
6. Northwestern Polytechnical University, School of Artificial Intelligence, Optics and Electronics and the School of Cyberspace, Xi'an, 710072, China
Recommended Citation
GB/T 7714: Yu, Xudong, Bai, Chenjia, Wang, Changhong, et al. Self-Supervised Imitation for Offline Reinforcement Learning With Hindsight Relabeling[J]. IEEE Transactions on Systems, Man, and Cybernetics: Systems, 2023, 53(12), 7732-7743.
APA: Yu, Xudong, Bai, Chenjia, Wang, Changhong, Yu, Dengxiu, Chen, C. L. Philip, & Wang, Zhen (2023). Self-Supervised Imitation for Offline Reinforcement Learning With Hindsight Relabeling. IEEE Transactions on Systems, Man, and Cybernetics: Systems, 53(12), 7732-7743.
MLA: Yu, Xudong, et al. "Self-Supervised Imitation for Offline Reinforcement Learning With Hindsight Relabeling." IEEE Transactions on Systems, Man, and Cybernetics: Systems 53.12 (2023): 7732-7743.
Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.