Status: Published
RPT: Toward Transferable Model on Heterogeneous Researcher Data via Pre-Training
Ziyue Qiao1,2; Yanjie Fu3; Pengyang Wang4; Meng Xiao5; Zhiyuan Ning5; Denghui Zhang6; Yi Du5; Yuanchun Zhou5
2022-02-18
Source Publication: IEEE Transactions on Big Data
ISSN: 2332-7790
Volume: 9  Issue: 1  Pages: 186-199
Abstract

With the growth of academic search engines, the mining and analysis of massive researcher data, such as collaborator recommendation and researcher retrieval, have become indispensable: they improve the quality of service and the intelligence of academic engines. Most existing studies of researcher data mining focus on a single task in a particular application scenario and learn a task-specific model, which usually cannot transfer to out-of-scope tasks. Pre-training instead provides a generalized, shared model that captures valuable information from enormous amounts of unlabeled data and can then accomplish multiple downstream tasks with a few fine-tuning steps. In this paper, we propose RPT, a multi-task self-supervised learning-based pre-training model for researcher data. Specifically, we divide each researcher's data into a semantic document set and a community graph, and we design a hierarchical Transformer and a local community encoder to capture information from these two categories of data, respectively. We then propose three self-supervised learning objectives to train the whole model, and we further propose two transfer modes of RPT for fine-tuning in different scenarios. Extensive experiments on three downstream tasks verify the effectiveness of pre-training for researcher data mining.
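The contrastive pre-training idea summarized in the abstract can be illustrated with a toy sketch. This is not the paper's implementation: the dimensions, the InfoNCE-style loss, and the two "views" of a researcher (a document-side embedding and a community-side embedding) are assumptions made purely for illustration of how a self-supervised alignment objective works.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-8):
    """Scale rows to unit length so dot products become cosine similarities."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def info_nce(sem, com, temperature=0.1):
    """Toy InfoNCE-style contrastive loss: row i of `sem` (document view)
    should match row i of `com` (community view); other rows are negatives."""
    sem, com = l2_normalize(sem), l2_normalize(com)
    logits = sem @ com.T / temperature              # (N, N) similarity matrix
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))             # positives on the diagonal

rng = np.random.default_rng(0)
base = rng.normal(size=(8, 16))                     # 8 researchers, 16-dim
sem = base + 0.01 * rng.normal(size=(8, 16))        # document-side view
com = base + 0.01 * rng.normal(size=(8, 16))        # community-side view

aligned_loss = info_nce(sem, com)
random_loss = info_nce(rng.normal(size=(8, 16)), rng.normal(size=(8, 16)))
print(aligned_loss < random_loss)                   # aligned views score lower
```

In a multi-task setup such as the one the abstract describes, a loss of this shape would be summed with the other self-supervised objectives before backpropagation.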

Keywords: Pre-training; Contrastive Learning; Transformer; Graph Representation Learning
DOI: 10.1109/TBDATA.2022.3152386
Indexed By: SCIE
Language: English
WOS Research Area: Computer Science
WOS Subject: Computer Science, Information Systems; Computer Science, Theory & Methods
WOS ID: WOS:000920335300014
Publisher: IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC, 445 HOES LANE, PISCATAWAY, NJ 08855-4141
Scopus ID: 2-s2.0-85125335148
Document Type: Journal article
Collection: THE STATE KEY LABORATORY OF INTERNET OF THINGS FOR SMART CITY (UNIVERSITY OF MACAU)
Corresponding Author: Yi Du
Affiliations:
1. Computer Network Information Center, Chinese Academy of Sciences, Beijing, 100045, China
2. University of Chinese Academy of Sciences, Beijing, 101408, China
3. Department of Computer Science, University of Central Florida, Orlando, FL 32816, United States
4. State Key Laboratory of Internet of Things for Smart City, University of Macau, Taipa, 999078, Macao
5. Computer Network Information Center, Chinese Academy of Sciences, Beijing, 101408, China
6. Information System Department, Rutgers University, Piscataway, NJ 08854, United States
Recommended Citation
GB/T 7714: Ziyue Qiao, Yanjie Fu, Pengyang Wang, et al. RPT: Toward Transferable Model on Heterogeneous Researcher Data via Pre-Training[J]. IEEE Transactions on Big Data, 2022, 9(1): 186-199.
APA: Ziyue Qiao, Yanjie Fu, Pengyang Wang, Meng Xiao, Zhiyuan Ning, Denghui Zhang, Yi Du, & Yuanchun Zhou. (2022). RPT: Toward Transferable Model on Heterogeneous Researcher Data via Pre-Training. IEEE Transactions on Big Data, 9(1), 186-199.
MLA: Ziyue Qiao, et al. "RPT: Toward Transferable Model on Heterogeneous Researcher Data via Pre-Training." IEEE Transactions on Big Data 9.1 (2022): 186-199.
Files in This Item:
There are no files associated with this item.

Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.