Residential College | false
Status | Published
Title | Redundancy-Free High-Performance Dynamic GNN Training with Hierarchical Pipeline Parallelism
Authors | Xia, Yaqi1; Zhang, Zheng1; Wang, Hulin1; Yang, Donglin2; Zhou, Xiaobo3; Cheng, Dazhao1
Date Issued | 2023-08
Conference Name | HPDC '23: Proceedings of the 32nd International Symposium on High-Performance Parallel and Distributed Computing |
Pages | 17-13 |
Conference Date | June 16-23, 2023 |
Conference Place | Orlando, FL |
Country | USA |
Abstract | Temporal Graph Neural Networks (TGNNs) extend the success of Graph Neural Networks to dynamic graphs. Distributed TGNN training must handle temporal dependency efficiently, which often leads to excessive cross-device communication carrying significant redundant data. Existing systems cannot remove this redundancy in data reuse and transfer, and they suffer severe communication overhead in a distributed setting. This paper presents Sven, an algorithm-system co-designed TGNN training library for end-to-end performance optimization on multi-node, multi-GPU systems. Exploiting the dependency patterns of TGNN models and the characteristics of dynamic graph datasets, we design redundancy-free data organization and load-balancing partitioning strategies that mitigate redundant data communication and evenly partition dynamic graphs at the vertex level. Furthermore, we develop a hierarchical pipeline mechanism integrating data prefetching, micro-batch pipelining, and asynchronous pipelining to mitigate the communication overhead. As the first scaling study of memory-based TGNN training, experiments conducted on an HPC cluster with 64 GPUs show that Sven achieves a 1.7x-3.3x speedup over state-of-the-art approaches and up to a 5.26x improvement in communication efficiency.
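The hierarchical pipeline named in the abstract hinges on overlapping communication with computation. The following is a minimal, hypothetical Python sketch of that idea, not Sven's actual API; all function names are illustrative. It shows micro-batch pipelining with data prefetching: while one micro-batch is being trained on, the fetch of the next one proceeds in a background thread, so communication latency hides behind computation.

```python
# Illustrative sketch only (not Sven's API): overlap micro-batch prefetching
# with training to hide cross-device communication latency.
from concurrent.futures import ThreadPoolExecutor
import time

def fetch_micro_batch(i):
    # Stand-in for remote feature/node-memory gathering (communication).
    time.sleep(0.05)
    return f"micro-batch {i}"

def train_step(batch):
    # Stand-in for the forward/backward pass on one micro-batch (computation).
    time.sleep(0.05)
    print(f"trained on {batch}")

def pipelined_epoch(num_micro_batches):
    with ThreadPoolExecutor(max_workers=1) as prefetcher:
        # Warm up the pipeline with the first fetch.
        future = prefetcher.submit(fetch_micro_batch, 0)
        for i in range(num_micro_batches):
            batch = future.result()                # wait for the current batch
            if i + 1 < num_micro_batches:
                # Launch the next fetch before computing, so it overlaps
                # with this iteration's train_step.
                future = prefetcher.submit(fetch_micro_batch, i + 1)
            train_step(batch)

if __name__ == "__main__":
    pipelined_epoch(4)
```

With fetch and compute both taking ~0.05 s per micro-batch, the overlapped loop runs in roughly half the time of a serial fetch-then-train loop; Sven layers asynchronous pipelining across nodes on top of this same principle.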
DOI | 10.1145/3588195.3592990 |
URL | https://doi.org/10.1145/3588195.3592990
Language | English
Scopus ID | 2-s2.0-85169535776 |
Document Type | Conference paper |
Collection | THE STATE KEY LABORATORY OF INTERNET OF THINGS FOR SMART CITY (UNIVERSITY OF MACAU) |
Affiliation | 1. Wuhan University; 2. NVIDIA Corporation; 3. University of Macau
Recommended Citation GB/T 7714 | Xia Yaqi, Zhang Zheng, Wang Hulin, et al. Redundancy-Free High-Performance Dynamic GNN Training with Hierarchical Pipeline Parallelism[C]//HPDC '23: Proceedings of the 32nd International Symposium on High-Performance Parallel and Distributed Computing, 2023: 17-13.
APA | Xia, Y., Zhang, Z., Wang, H., Yang, D., Zhou, X., & Cheng, D. (2023). Redundancy-Free High-Performance Dynamic GNN Training with Hierarchical Pipeline Parallelism. HPDC '23: Proceedings of the 32nd International Symposium on High-Performance Parallel and Distributed Computing, 17-13.
Files in This Item: | There are no files associated with this item. |